dc.contributor.advisor: Rubinfeld, Ronitt
dc.contributor.author: Lange, Jane
dc.date.accessioned: 2025-11-17T19:07:02Z
dc.date.available: 2025-11-17T19:07:02Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-08-14T19:32:12.772Z
dc.identifier.uri: https://hdl.handle.net/1721.1/163682
dc.description.abstract: We present algorithms for finding two types of objects that explain the classification of a black-box model f : {±1}^d → {±1} on an instance x ∈ {±1}^d. The first is a certificate: a small set of x’s features that in conjunction essentially determines f(x). The second is a counterfactual: a nearest instance x′ for which f(x′) ≠ f(x). We obtain both algorithms via a connection to the problem of implicitly learning decision trees. The implicit nature of this learning task allows for efficient algorithms even when the complexity of f necessitates an intractably large surrogate decision tree. We solve the implicit learning task by bringing together techniques from learning theory, local computation algorithms, and complexity theory. Our approach of “explaining by implicit learning” shares elements of two previously disparate methods for post-hoc explanations, global and local explanations, and we make the case that it enjoys advantages of both. Our certification algorithm runs in time poly(d, C(f)) and outputs a certificate of size poly(C(f)), where C(f) is the “average certificate complexity” of f. Our counterfactual algorithm runs in time S(f)^O(∆_f(x)) · log d, where S(f) is the sensitivity of f (a discrete analogue of the Lipschitz constant) and ∆_f(x) is the distance from x to its nearest counterfactual. We further prove a lower bound of S(f)^Ω(∆_f(x)) + Ω(log d) for finding counterfactuals, thereby showing that the guarantees of our algorithm are essentially optimal.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Explaining Black-Box Classifiers by Implicitly Learning Decision Trees
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science
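
To make the abstract's two notions concrete, the following is a minimal brute-force sketch in Python, not the thesis's algorithms: it enumerates all inputs and so runs in time exponential in d, whereas the thesis gives poly(d, C(f)) and S(f)^O(∆_f(x)) · log d algorithms. The function names (is_certificate, smallest_certificate, nearest_counterfactual) and the toy AND function are illustrative assumptions, not identifiers from the thesis.

    # Illustrative brute-force baselines for "certificate" and "counterfactual".
    # NOT the thesis's algorithms: these enumerate all 2^d inputs and are only
    # meant to make the definitions concrete for tiny d.
    from itertools import combinations, product

    def is_certificate(f, x, S):
        """Check whether the feature set S certifies f(x): every input that
        agrees with x on the coordinates in S gets the same label as x."""
        d = len(x)
        free = [i for i in range(d) if i not in S]
        for bits in product([-1, 1], repeat=len(free)):
            y = list(x)
            for i, b in zip(free, bits):
                y[i] = b
            if f(tuple(y)) != f(tuple(x)):
                return False
        return True

    def smallest_certificate(f, x):
        """Return a smallest certifying feature set for f on x (exponential time)."""
        d = len(x)
        for k in range(d + 1):
            for S in combinations(range(d), k):
                if is_certificate(f, x, set(S)):
                    return set(S)
        return set(range(d))

    def nearest_counterfactual(f, x):
        """Return a nearest x' (in Hamming distance) with f(x') != f(x), if any."""
        d = len(x)
        for k in range(1, d + 1):
            for S in combinations(range(d), k):
                y = list(x)
                for i in S:
                    y[i] = -y[i]
                if f(tuple(y)) != f(tuple(x)):
                    return tuple(y)
        return None

    # Toy example: f is the AND of the first two features (in +/-1 encoding).
    f = lambda z: 1 if (z[0] == 1 and z[1] == 1) else -1
    x = (1, 1, -1, 1)
    print(smallest_certificate(f, x))    # {0, 1}: fixing x_1 = x_2 = +1 forces f = +1
    print(nearest_counterfactual(f, x))  # (-1, 1, -1, 1), at Hamming distance 1

On this toy function the certificate {x_1, x_2} determines f(x) regardless of the remaining features, and flipping a single certified feature already yields a counterfactual, so ∆_f(x) = 1.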

