DSpace@MIT (MIT Libraries)

Geometric representation learning for chemical property prediction, structure elucidation, and molecular design

Author(s)
Adams, Keir Alexander Joseph
Download
Thesis PDF (123.3Mb)
Advisor
Coley, Connor W.
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Molecular representation learning has revolutionized computer-aided chemistry by enabling the automatic extraction of arbitrarily complex patterns from datasets of (potentially labeled) molecular structures via deep neural networks. In predictive chemistry, deep learning is increasingly being used to replace expensive physics-based simulations and even experimental measurements of chemical properties. In generative chemistry, deep generative models are powering molecular design and optimization campaigns across chemical industries. Notably, this paradigm shift has been driven by the development of sophisticated representation learning algorithms that encode and decode molecular structures with increasing geometric detail – from minimal SMILES strings to elaborate atomistic structures. Yet, many aspects of molecular structure remain neglected by leading geometric representation learning models. Accordingly, this thesis advances the geometric representation learning of molecular structure to create new opportunities in chemical property prediction, structure elucidation, and molecular design.

This thesis begins by highlighting surprising failure modes of graph neural networks when predicting properties dependent on chirality and conformational isomerism. A new stereochemistry-tailored model is then developed to imbue graph networks with tetrahedral chiral expressivity while evading pitfalls plaguing preceding 2D and 3D graph networks. This thesis then examines how the geometric quality of structures encoded by 3D networks impacts their accuracy in property prediction tasks requiring the model to reason about conformational flexibility.

Neglecting structural characteristics that are challenging to model is also common in computational chemistry. In nuclear magnetic resonance (NMR) prediction, for example, quantum chemical calculations typically estimate magnetic shieldings from stationary gas-phase geometries – ignoring vibrations and explicit solvent. To advance chemical structure elucidation, this thesis next develops neural surrogates for magnetic shielding calculations that, when integrated with molecular dynamics simulations, provide access to unprecedented accuracy in solvent-sensitive NMR spectra prediction.

Finally, this thesis advances de novo molecular design by explicitly representing 3D shapes, electrostatics, and non-covalent interactions in deep generative models for small molecules. A shape-conditioned variational autoencoder is first developed to design chemically diverse molecules that can adopt desired conformational shapes, like ligand binding poses. This strategy is then generalized into a powerful interaction-aware diffusion modeling framework to comprehensively enable bioisosteric replacement in ligand-based drug design.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/159888
Department
Massachusetts Institute of Technology. Department of Chemical Engineering; Massachusetts Institute of Technology. Center for Computational Science and Engineering
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
