We'll again be meeting at 1 PM PST this Friday. We'll discuss
In-context Learning and Induction Heads. This work attempts to understand how GPT-style transformers adapt so effectively to the current linguistic context. The authors propose a specific mechanism for in-context learning, induction heads, and argue from several complementary lines of evidence that induction heads account for most in-context learning. In particular, they find no evidence of mesa-optimization contributing to in-context learning. If you're at all interested in the internal organization or behavior of transformers, please feel free to attend!