MIT Libraries: DSpace@MIT

Methods for Latent Space Interpretation via In-the-loop Fine-Tuning

Author(s)
Wen, Collin
Advisor
Lippman, Andrew B.
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
With language models increasing exponentially in scale, the ability to interpret and justify model outputs is an area of increasing interest. Although interaction with AI has focused on improving model performance in chat interfaces, visualizing a model's latent space offers a novel modality for interpreting information. Embedding models have traditionally served as a means of retrieving information relevant to a topic by converting text into high-dimensional vectors. The vector spaces created by embedding encode information in a way that captures similarities and differences between ideas, and visualizing these nuances along meaningful dimensions can offer novel insights into the specific qualities that make two items similar. Leveraging fine-tuning mechanisms, dimension reduction algorithms, and Sparse Autoencoders (SAEs), this work surveys state-of-the-art techniques for visualizing the latent space in highly interpretable dimensions. ConceptAxes, a framework derived from these techniques, produces axes that capture high-level ideas ingrained in embedding models. With highly interpretable dimensions, ConceptAxes allows better justification of the latent space and its clusters. This method of increasing embedding transparency proves valuable in various domains: (1) AI-enhanced creative exploration can be more guided and customized for a particular experience, and (2) high-level insights into vast text datasets can be made more intuitive.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162996
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
