[Ai] AI Alignment Reading Group

21 Apr 2022

Hello everyone,

We’ll be holding the next AI alignment reading group meeting at 2 PM PST
this Friday. Our paper will be “Active Learning Helps Pretrained Models
Learn the Intended Task” <https://arxiv.org/abs/2204.08491>.

Models can fail in unpredictable ways during deployment due to task
ambiguity, when multiple behaviors are consistent with the provided
training data. An example is an object classifier trained on red squares
and blue circles: when encountering blue squares, the intended behavior is
undefined. We investigate whether pretrained models are better active
learners, capable of disambiguating between the possible tasks a user may
be trying to specify. Intriguingly, we find that better active learning is
an emergent property of the pretraining process: pretrained models require
up to 5 times fewer labels when using uncertainty-based active learning,
while non-pretrained models see no or even negative benefit. We find these
gains come from an ability to select examples with attributes that
disambiguate the intended behavior, such as rare product categories or
atypical backgrounds. These attributes are far more linearly separable in
the representation spaces of pretrained models than of non-pretrained models,
suggesting a possible mechanism for this behavior.

If you're at all interested in building AI systems that can disambiguate
concepts or handle uncertainty, please feel free to join.
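
For anyone new to the term, here is a rough sketch of what "uncertainty-based
active learning" can look like in practice. It ranks unlabeled examples by the
entropy of the model's predicted class probabilities and asks for labels on the
most uncertain ones. Entropy is only one common uncertainty score and is an
assumption here, not necessarily what the paper uses; the function names and
the probabilities are made up for illustration.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool examples with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical model outputs for four unlabeled examples (two classes).
pool_probs = [
    [0.99, 0.01],  # confident prediction
    [0.55, 0.45],  # ambiguous, e.g. a "blue square"
    [0.90, 0.10],  # fairly confident
    [0.50, 0.50],  # maximally uncertain
]

# Indices of the two examples we would send to a human for labeling.
to_label = select_most_uncertain(pool_probs, k=2)
print(to_label)
```

In a full loop, the newly labeled examples would be added to the training set,
the model retrained or fine-tuned, and the selection repeated.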

All the best,
Quintin Pope

Join Zoom Meeting
https://oregonstate.zoom.us/j/95843260079?pwd=TzZTN0xPaFZrazRGTElud0J1cnJLUT...

Password: 961594

Phone Dial-In Information
+1 971 247 1195 US (Portland)
+1 253 215 8782 US (Tacoma)
+1 301 715 8592 US (Washington DC)

Meeting ID: 958 4326 0079
