This week, we'll be reading Rudin et al.'s "Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges":
"Interpretability in machine learning (ML) is crucial for high stakes
decisions and troubleshooting. In this work, we provide fundamental principles
for interpretable ML, and dispel common misunderstandings that dilute the
importance of this crucial topic. We also identify 10 technical challenge areas
in interpretable machine learning and provide history and background on each
problem. Some of these problems are classically important, and some are recent
problems that have arisen in the last few years. These problems are: (1) ... (9)
Characterization of the "Rashomon set" of good models; and (10) Interpretable
reinforcement learning."
This is a long paper, but I'm particularly interested in discussing (9): Rashomon sets of models. In what situations should we expect black-box models (e.g. deep neural networks) to have interpretable counterparts (e.g. sparse logical models / decision trees) with similar performance? The answer to this question will shape the competitive pressures for and against using interpretable models, and it also bears on how hard it will be to supervise the computation performed by potential future human-level ML systems.
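For concreteness, here is a rough formalization of a Rashomon set (my notation, paraphrasing the standard definition rather than quoting the paper): fix a model class F, an empirical loss L̂, a reference model f* minimizing L̂ over F, and a tolerance ε ≥ 0. Then

\[
  R(\epsilon, f^*, \mathcal{F}) \;=\; \bigl\{\, f \in \mathcal{F} \;:\; \hat{L}(f) \le \hat{L}(f^*) + \epsilon \,\bigr\}.
\]

Question (9) then asks, roughly: under what conditions on the data and on F does this set, whenever it contains an accurate black-box model, also contain a sparse logical model or small decision tree?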
Best,
Alex Turner