June / July / August 2024 Reading
Books
- The Immense Journey, Loren Eiseley (1957)
- The Egyptian, Mika Waltari (1945)
- Burn Book, Kara Swisher (2023)
Papers
- Refusal in Language Models is Mediated by a Single Direction (Paper), Arditi et al., 2024
- Evaluating Chunking Strategies for Retrieval, Smith and Troynikov, 2024
- Dropout, Srivastava et al., 2014
- The Llama 3 Herd of Models, Llama Team, 2024
- Segment Anything Model 2, AI at Meta FAIR, 2024
- Unified Training of Universal Time Series Forecasting Transformers, Woo et al. (Salesforce AI), 2024
- Transformers for Image Recognition at Scale (ViT), Dosovitskiy et al., 2021
- MiniCPM-V, Yao et al., 2024
- xGen-MM, Xue et al. (Salesforce AI), 2024
- Hermes 3 Technical Report, Nous Research, 2024
Blogs
- AGI Safety and Alignment at Google DeepMind -- Aug 24 Update
- On the Speed of ViTs and CNNs, Lucas Beyer
- Linear Relationships in the Transformer's Positional Encoding, Timo Denk
- Interpreting Preference Models with SAEs, Riggs and Brinkmann
- Situational Awareness Ch. I-V, Leopold Aschenbrenner
- NNs, Manifolds, and Topology, Chris Olah
- General Intelligence, James Betker
- Compute Multipliers, James Betker
- Einsum is Easy and Useful, Erik Jenner
- Training LLMs at a Startup, Yi Tay
- Getting 50% on ARC-AGI with GPT-4o, Ryan Greenblatt
- Extrinsic Hallucinations in LLMs, Lilian Weng
- Epistemic Calibration, Linus Lee
- Internet Playgrounds, XH
- Memory Handling in 2D Convs, Mani
- Open Source AI is the Path Forward, Mark Zuckerberg
- Segment Anything Model 2 (Blog), Meta AI
- 5 Year Update on Skipping Grad School, Alex Irpan
- Prism: Mapping Interpretable Concepts and Features in a Latent Space of Language, Linus Lee
- Ordinary Life Improvements, Gwern
- It Looks Like You're Trying To Take Over The World, Gwern
- Commoditize Your Complement, Gwern
- The Scaling Hypothesis, Gwern