AI alignment reading group: Adversarial Examples Are Not Bugs, They Are Features

This Friday, we'll be discussing a fun paper about adversarial examples.

"Over the past few years, adversarial examples – or inputs that have been slightly perturbed by an adversary to cause unintended behavior in machine learning systems – have received significant attention in the machine learning community. There has been much work on training models that are not vulnerable to adversarial examples... but all this research does not really confront the fundamental question: why do these adversarial examples arise in the first place?"

Blog post: https://gradientscience.org/adv/
Paper: https://arxiv.org/abs/1905.02175
Summaries of paper and counterpoints: https://www.alignmentforum.org/posts/NTwA3J99RPkgmp6jh/an-62-are-adversarial...

We'll meet Friday at 1. https://oregonstate.zoom.us/j/2739792686?pwd=VkRUeHJkYnhvTzlvZzR6YnZWNERKQT0...

Alex Turner

This week, we'll zoom away from AI alignment to discuss research itself: what is good research? How can we, as graduate students, develop a 'research taste' which allows us to intuit which directions are promising and important?

Two short blog posts:
https://michaelnielsen.org/blog/archive/000114.html
http://colah.github.io/notes/taste/

We'll meet Friday at 1. https://oregonstate.zoom.us/j/2739792686?pwd=VkRUeHJkYnhvTzlvZzR6YnZWNERKQT0...

Alex Turner