DSpace@MIT, MIT Libraries

Generative Machine Learning Models for RNA Structure Prediction and Design

Author(s)
Rubin, Dana
Download: Thesis PDF (3.948Mb)
Advisor
Jacobson, Joseph
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Ribonucleic acid (RNA) is a fundamental molecule in biology, central to the regulation and execution of life’s most essential processes. Its diverse roles range from encoding genetic information to catalyzing biochemical reactions. Beyond its modern biological functions, RNA is also believed to have played a pivotal role in the origins of life, which underscores its evolutionary significance. Unlocking the full potential of RNA research and design requires a deep understanding of the intricate relationship between RNA’s sequence and its three-dimensional structure. Predicting RNA 3D structures remains challenging because of the complexity of the RNA folding landscape and the limited availability of high-resolution structural data. Inspired by recent advances in deep learning for protein folding and design, this thesis explores novel geometric and generative architectures for modeling RNA. We first present a systematic study of RNA structure prediction using equivariant neural networks within denoising diffusion probabilistic models (DDPMs). Our folding model, named Klotho, captures local atomic interactions and structural features using SO(3)-equivariant message passing layers with a point cloud data representation. Ablation studies confirm that Klotho’s performance scales with higher dimensionality and improves when the input is enriched with secondary structure information and sequence embeddings from RNA foundation models. Building on this foundation, we introduce RiboGen, a multimodal deep learning model that jointly generates RNA sequence and all-atom 3D structure. RiboGen integrates Flow Matching and Discrete Flow Matching within a unified multimodal representation and employs Euclidean Equivariant Neural Networks to learn geometric features.
Our results demonstrate that RiboGen can generate chemically plausible, self-consistent RNA molecules, highlighting the potential of co-generative models to explore the sequence–structure landscape of RNA in a unified, data-driven framework. Together, these contributions advance the field of RNA modeling by offering scalable, symmetry-aware architectures for prediction and design. They lay the groundwork for future generative systems in RNA biology, therapeutic development, and biotechnological innovations.
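
As a rough illustration of the two flow-matching ingredients the abstract names, the sketch below builds one noisy training example for a continuous modality (atom coordinates) and a discrete modality (nucleotide sequence). It is not RiboGen's implementation. The linear interpolation path, the Gaussian prior, the mask-based sequence corruption, and every name in the code (`interpolate_coords`, `corrupt_sequence`, `sample_training_pair`, `MASK`) are assumptions chosen for simplicity, and the equivariant network that would be trained on these pairs is omitted.

```python
import random

MASK = "_"  # hypothetical mask symbol for the discrete (sequence) path


def interpolate_coords(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 with target velocity x1 - x0.

    x0 and x1 are lists of 3D points (lists of floats) of equal length.
    """
    x_t = [[(1 - t) * a + t * b for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]
    velocity = [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]
    return x_t, velocity


def corrupt_sequence(seq, t, rng):
    """Keep each token with probability t; otherwise replace it with MASK."""
    return "".join(c if rng.random() < t else MASK for c in seq)


def sample_training_pair(coords, seq, rng):
    """Draw one time t and return noisy inputs plus the regression target.

    A model would be trained to predict `velocity` from (x_t, seq_t, t)
    and to recover the masked tokens of `seq` from the same inputs.
    """
    t = rng.random()
    # Assumed prior: independent standard Gaussian noise per coordinate.
    noise = [[rng.gauss(0.0, 1.0) for _ in point] for point in coords]
    x_t, velocity = interpolate_coords(noise, coords, t)
    seq_t = corrupt_sequence(seq, t, rng)
    return t, x_t, velocity, seq_t


if __name__ == "__main__":
    rng = random.Random(0)
    coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.5, 0.0]]  # toy data
    t, x_t, velocity, seq_t = sample_training_pair(coords, "GCA", rng)
    print(t, seq_t)
```

Sharing a single time `t` across both modalities is one simple design choice; a joint model could equally draw separate times for sequence and structure.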
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/163021
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.