DSpace@MIT


Machine Learning through the Lens of Data

Author(s)
Park, Sung Min
Thesis PDF (56.70 MB)
Advisor
Mądry, Aleksander
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Many critical challenges in machine learning—e.g., debugging model behavior or selecting good training data—require us to relate the outputs of models back to the training data. The goal of predictive data attribution, the focus of this thesis, is to precisely characterize the resulting model behavior as a function of the training data in order to tackle these challenges.

In the first part of this thesis, we introduce a framework, datamodeling, for formalizing and constructing effective methods for predictive data attribution. Despite the complexity of modern machine learning systems (e.g., end-to-end training of deep neural networks using stochastic gradient algorithms), we show that we can accurately predict model outputs from simple linear functions of the training data. We then demonstrate that these predictors—which we call datamodels—provide a versatile primitive for a variety of tasks, ranging from predicting the effect of dataset counterfactuals to identifying brittle predictions. Next, to further improve the scalability of data attribution in this framework, we design a new method, TRAK (Tracing with the Randomly-projected After Kernel), that is both effective and computationally tractable for large-scale, differentiable models. By leveraging a kernel approximation and other classic ideas from statistics and algorithm design, we reduce the challenging problem of attributing the original DNN to that of attributing a simpler surrogate. We demonstrate the effectiveness of TRAK across various modalities and scales: image classifiers trained on ImageNet, vision-language models (CLIP), language models (BERT and mT5), and diffusion models.

In the second part of this thesis, we explore applications of the framework developed in the first part. First, we leverage datamodels for the problem of learning algorithm comparison, where the goal is to detect differences between models trained with two different learning algorithms. Our algorithm, ModelDiff, enables us to automatically surface biases that distinguish learning algorithms by differentiating how they use the same training data. Second, we tackle the challenging problem of machine unlearning, wherein the goal is to "unlearn" a small fraction of training data from a trained model. By leveraging the fact that datamodels can accurately approximate the "oracle" predictions, we design a simple finetuning algorithm that allows us to unlearn at a significantly smaller cost than prior methods.
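The core datamodeling idea—predicting a model output as a linear function of which training examples were included—can be illustrated with a toy simulation. The sketch below is not the thesis's actual pipeline: instead of retraining a real model on each random subset, it assumes a hypothetical ground truth where each training example contributes a fixed additive weight to the output, then fits a linear datamodel from binary inclusion masks to the observed outputs via least squares. All names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train = 50      # size of the training pool
n_subsets = 500   # number of random subsets used to fit the datamodel
alpha = 0.5       # fraction of the pool included in each subset

# Hypothetical ground truth: each training example contributes a fixed
# weight to the model output of interest. (In practice one would retrain
# the model on each subset and record the actual output.)
true_weights = rng.normal(size=n_train)

# Sample binary inclusion masks and the corresponding noisy "outputs".
masks = (rng.random((n_subsets, n_train)) < alpha).astype(float)
outputs = masks @ true_weights + rng.normal(scale=0.1, size=n_subsets)

# The datamodel: a linear predictor from inclusion mask to model output.
theta, *_ = np.linalg.lstsq(masks, outputs, rcond=None)

# Each fitted coefficient estimates the influence of one training example,
# so the largest-magnitude entries flag the most influential examples.
top_influencers = np.argsort(-np.abs(theta))[:5]
```

With enough subsets, the fitted coefficients closely recover the per-example contributions, which is what makes the datamodel useful for counterfactual questions like "what happens if these examples are removed?". In real settings the thesis replaces this brute-force retraining with more efficient estimators such as TRAK.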
Date issued
2024-09
URI
https://hdl.handle.net/1721.1/159360
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.