
dc.contributor.advisor: Mądry, Aleksander
dc.contributor.author: Park, Sung Min
dc.date.accessioned: 2025-06-09T16:23:57Z
dc.date.available: 2025-06-09T16:23:57Z
dc.date.issued: 2024-09
dc.date.submitted: 2025-03-04T18:33:12.054Z
dc.identifier.uri: https://hdl.handle.net/1721.1/159360
dc.description.abstract: Many critical challenges in machine learning (e.g., debugging model behavior or selecting good training data) require us to relate the outputs of models back to the training data. The goal of predictive data attribution, the focus of this thesis, is to precisely characterize the resulting model behavior as a function of the training data in order to tackle these challenges.

In the first part of this thesis, we introduce a framework, datamodeling, for formalizing and constructing effective methods for predictive data attribution. Despite the complexity of modern machine learning systems (e.g., end-to-end training of deep neural networks using stochastic gradient algorithms), we show that we can accurately predict model outputs from simple linear functions of the training data. We then demonstrate that these predictors, which we call datamodels, provide a versatile primitive for various tasks, ranging from predicting the effect of dataset counterfactuals to identifying brittle predictions. Next, to further improve the scalability of data attribution in this framework, we design a new method, TRAK (Tracing with the Randomly-projected After Kernel), that is both effective and computationally tractable for large-scale, differentiable models. By leveraging a kernel approximation and other classic ideas from statistics and algorithm design, we reduce the challenging problem of attributing the original DNN to that of attributing a simpler surrogate. We demonstrate the effectiveness of TRAK across various modalities and scales: image classifiers trained on ImageNet, vision-language models (CLIP), language models (BERT and mT5), and diffusion models.

In the second part of this thesis, we explore applications of the framework developed in the first part. First, we leverage datamodels for the problem of learning algorithm comparison, where the goal is to detect differences between models trained with two different learning algorithms. Our algorithm, ModelDiff, enables us to automatically surface biases that distinguish different learning algorithms by differentiating how they use the same training data. Lastly, we tackle the challenging problem of machine unlearning, wherein the goal is to “unlearn” a small fraction of the training data from a trained model. By leveraging the fact that datamodels can accurately approximate the “oracle” predictions, we design a simple fine-tuning algorithm that allows us to unlearn at a significantly smaller cost than prior methods.
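As a concrete illustration of the datamodeling idea described in the abstract, below is a minimal sketch of fitting a linear datamodel for a single target example. Everything in it (the subset masks, model outputs, synthetic ground truth, and regularization strength) is a hypothetical placeholder; the thesis estimates such predictors from many models retrained on random subsets of the training data.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical setup: n_models models, each trained on a random subset of n_examples.
# masks[i, j] = 1.0 if training example j was included in the i-th training subset.
rng = np.random.default_rng(0)
n_examples, n_models = 1000, 5000
masks = (rng.random((n_models, n_examples)) < 0.5).astype(float)

# outputs[i] = the i-th model's output (e.g., margin or loss) on one fixed target example.
# Here we synthesize outputs from a sparse ground-truth weight vector for illustration.
true_w = np.zeros(n_examples)
true_w[:20] = rng.normal(size=20)
outputs = masks @ true_w + 0.1 * rng.normal(size=n_models)

# A datamodel for the target example: a sparse linear map from subset masks to outputs.
datamodel = Lasso(alpha=0.01).fit(masks, outputs)
weights = datamodel.coef_  # weights[j] ~ estimated influence of training example j
print("top influential training examples:", np.argsort(-np.abs(weights))[:5])
```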
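Likewise, a schematic sketch of the random-projection step behind TRAK: per-example gradients are compressed with a Johnson-Lindenstrauss-style random projection, and attribution scores are computed from the resulting kernel features. The gradients here are random placeholders, the dimensions are arbitrary, and the full method (ensembling over checkpoints, additional weighting terms) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, d_params, k_proj = 2000, 10_000, 512

# grads[i] = per-example gradient of the model output w.r.t. parameters (placeholder values).
grads = rng.normal(size=(n_train, d_params)).astype(np.float32)
target_grad = rng.normal(size=d_params).astype(np.float32)

# Random projection: compress d_params-dim gradients down to k_proj dimensions.
P = (rng.normal(size=(d_params, k_proj)) / np.sqrt(k_proj)).astype(np.float32)
Phi = grads @ P                  # projected training-gradient features
phi_t = target_grad @ P          # projected target-example feature

# Kernel-style attribution scores via the (regularized) inverse feature covariance:
# score[i] ~ influence of training example i on the target example.
XtX = Phi.T @ Phi + 1e-3 * np.eye(k_proj, dtype=np.float32)
scores = Phi @ np.linalg.solve(XtX, phi_t)
print("most influential training examples:", np.argsort(-scores)[:5])
```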
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Machine Learning through the Lens of Data
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

