September / October / November 2024 Reading
Papers
- Sleeper Agents, Hubinger et al. (Anthropic), 2024
- Batch Norm, Ioffe and Szegedy, 2015
- OLMoE Technical Report, Muennighoff et al. (Ai2), 2024
- Gaussian Error Linear Units, Hendrycks and Gimpel, 2016
- Longformer, Beltagy et al., 2020
- Sequence to Sequence Learning with Neural Networks, Sutskever et al., 2014
- Neural Turing Machines, Graves et al., 2014
- Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al., 2015
- Effective Approaches to Attention-based Neural Machine Translation, Luong et al., 2015
- Jina Embeddings V3, Sturua et al., 2024
- Mechanistically Eliciting Latent Behaviors in LMs, Mack and Turner, 2024
- Effective Data Augmentation with Diffusion Models, Trabucco et al., 2023
- Rethinking Conventional Wisdom in ML: From Generalization to Scaling, Xiao, 2024
- Image Hijacks: Adversarial Images can Control Generative Models at Runtime, Bailey et al., 2024
- Step-by-Step Diffusion: An Elementary Tutorial, Nakkiran et al., 2024
Blogs / Essays
- What Succeeding at AI Safety Will Involve, Sam Bowman
- Notes on Distributed Systems for Young Bloods, Jeff Hodges
- OLMoE and the Hidden Simplicity in Training Better Foundation Models, Nathan Lambert
- The Decade of Deep Learning, Leo Gao
- Approximating KL Divergence, John Schulman
- An Opinionated Guide to ML Research, John Schulman
- The Pentium as a Navajo Weaving, Ken Shirriff
- Understanding LSTMs, Chris Olah
- Augmented RNNs, Olah and Carter
- The Unreasonable Effectiveness of RNNs, Andrej Karpathy
- Machines of Loving Grace, Dario Amodei
- The Optimistic Thought Experiment, Peter Thiel
- The Straussian Moment, Peter Thiel
- Swift Blind Horseman?, Peter Thiel
- Artfintel: VLMs, Finbarr Timbers
- Why are Amplitudes Complex?, Scott Aaronson
- What Happened to BERT & T5?, Yi Tay