AI alignment reading group: Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges

This week, we'll be reading Rudin et al.'s "Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges":

"Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of these problems are classically important, and some are recent problems that have arisen in the last few years. These problems are: (1) ... (9) Characterization of the "Rashomon set" of good models; and (10) Interpretable reinforcement learning."

This is a long paper, but I'm particularly interested in discussing (9): Rashomon sets of models. In what situations should we expect black-box models (e.g. deep neural networks) to have interpretable counterparts (e.g. sparse logical models / decision trees) with similar performance? The answer to this question will help determine the competitive pressures for and against using interpretable models, and will also inform how difficult it may be to supervise the computation performed by potential future human-level ML systems.

The paper: https://arxiv.org/abs/2103.11251

We'll meet Friday at 1.

https://oregonstate.zoom.us/j/2739792686?pwd=VkRUeHJkYnhvTzlvZzR6YnZWNERKQT0...

Best,
Alex Turner