MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Explaining Black-Box Classifiers by Implicitly Learning Decision Trees

Author(s)
Lange, Jane
Thumbnail
DownloadThesis PDF (668.6Kb)
Advisor
Rubinfeld, Ronitt
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
We present algorithms for finding two types of objects that explain the classification of a black-box model f : {±1}^d → {±1} on an instance x ∈ {±1}^d . The first is a certificate: a small set of x’s features that in conjunction essentially determines f(x). The second is a counterfactual: a nearest instance x′ for which f(x′) ≠ f(x). We obtain both algorithms via a connection to the problem of implicitly learning decision trees. The implicit nature of this learning task allows for efficient algorithms even when the complexity of f necessitates an intractably large surrogate decision tree. We solve the implicit learning task by bringing together techniques from learning theory, local computation algorithms, and complexity theory. Our approach of “explaining by implicit learning” shares elements of two previously disparate methods for post-hoc explanations, global and local explanations, and we make the case that it enjoys advantages of both. Our certification algorithm runs in time poly(d, C(f)) and outputs a certificate of size poly(C(f)), where C(f) is the “average certificate complexity" of f. Our counterfactual algorithm runs in time S(f)^[O(∆f (x))] ·log d, where S(f) is the sensitivity of f (a discrete analogue of the Lipschitz constant) and ∆f (x) is the distance from x to its nearest counterfactual. We further prove a lower bound of S(f)^[Ω(∆f (x))] + Ω(log d) for finding counterfactuals, thereby showing that the guarantees of our algorithm are essentially optimal.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/163682
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.