
Hello everyone,

Attendance at recent reading groups has dropped off, and I've been wondering whether there's an issue with the timeslot or the content. Please let me know if you have any conflicts with the current time (Friday at 2 PM PST). The reading group currently focuses strongly on interpretability; if you'd be more interested in some other aspect of alignment research, please let me know that as well.

This week we'll meet at 2 PM PST on Friday to discuss "A Mathematical Framework for Transformer Circuits" <https://transformer-circuits.pub/2021/framework/index.html>. From the paper:

"In this paper, we attempt to take initial, very preliminary steps towards reverse-engineering transformers. Given the incredible complexity and size of modern language models, we have found it most fruitful to start with the simplest possible models and work our way up from there. Our aim is to discover simple algorithmic patterns, motifs, or frameworks that can subsequently be applied to larger and more complex models. Specifically, in this paper we will study transformers with two layers or less which have only attention blocks. This is in contrast to a large, modern transformer like GPT-3, which has 96 layers and alternates attention blocks with MLP blocks."
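If you'd like a concrete picture of these models before Friday, here is a minimal PyTorch sketch of the kind of architecture the paper analyzes: token embeddings feed a residual stream that a couple of attention blocks write into additively, with no MLP blocks. All names and dimensions are my own illustrative choices, not from the paper, and I've omitted positional information (which the paper's toy models do handle) for brevity.

import torch
import torch.nn as nn

class AttentionOnlyTransformer(nn.Module):
    # Toy attention-only transformer: embedding, a few attention blocks
    # with residual connections, and an unembedding. No MLP blocks.
    def __init__(self, vocab_size=1000, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.attn_layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.unembed = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)  # residual stream, shape (batch, seq, d_model)
        seq_len = tokens.shape[1]
        # Causal mask: each position may attend only to earlier positions.
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
        )
        for attn in self.attn_layers:
            attn_out, _ = attn(x, x, x, attn_mask=mask)
            x = x + attn_out  # each block writes additively into the stream
        return self.unembed(x)  # next-token logits

For example, model = AttentionOnlyTransformer() followed by model(torch.randint(0, 1000, (1, 10))) gives logits of shape (1, 10, 1000). The one- and two-layer cases of exactly this shape are what the paper reverse-engineers.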
Join Zoom Meeting
https://oregonstate.zoom.us/j/95843260079?pwd=TzZTN0xPaFZrazRGTElud0J1cnJLUT...
Password: 961594

Phone Dial-In Information
+1 971 247 1195 US (Portland)
+1 253 215 8782 US (Tacoma)
+1 301 715 8592 US (Washington DC)
Meeting ID: 958 4326 0079

All the best,
Quintin