Understanding the Interplay between Interpretability, Fairness, and Privacy in Machine Learning
Abstract: Machine learning techniques are increasingly used for high-stakes decision-making, such as college admissions, loan attribution or recidivism prediction. Thus, it is crucial to ensure that the models learnt can be audited or understood by human users, do not create or reproduce discrimination or bias, and do not leak sensitive information regarding their training data. Indeed, interpretability, fairness and privacy are key requirements for the development of responsible machine learning, and all three have been studied extensively during the last decade. However, they have mainly been considered in isolation, while in practice they interplay with each other, either positively or negatively. In this Systematization of Knowledge (SoK) paper, we survey the literature on the interactions between these three desiderata. More precisely, for each pairwise interaction, we summarize the identified synergies and tensions. These findings highlight several fundamental theoretical and empirical conflicts, while also demonstrating that jointly considering these different requirements is challenging when one aims to preserve a high level of utility. To address this issue, we also discuss possible conciliation mechanisms, showing that careful design makes it possible to successfully handle these different concerns in practice.