
dc.contributor.author: Qian, Yujie
dc.contributor.author: Guo, Jiang
dc.contributor.author: Tu, Zhengkai
dc.contributor.author: Li, Zhening
dc.contributor.author: Coley, Connor W
dc.contributor.author: Barzilay, Regina
dc.date.accessioned: 2025-02-11T20:24:06Z
dc.date.available: 2025-02-11T20:24:06Z
dc.date.issued: 2023-04-10
dc.identifier.uri: https://hdl.handle.net/1721.1/158192
dc.description.abstract: Molecular structure recognition is the task of translating a molecular image into its graph structure. Significant variation in drawing styles and conventions exhibited in chemical literature poses a significant challenge for automating this task. In this paper, we propose MolScribe, a novel image-to-graph generation model that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular structure. Our model flexibly incorporates symbolic chemistry constraints to recognize chirality and expand abbreviated structures. We further develop data augmentation strategies to enhance the model robustness against domain shifts. In experiments on both synthetic and realistic molecular images, MolScribe significantly outperforms previous models, achieving 76-93% accuracy on public benchmarks. Chemists can also easily verify MolScribe's prediction, informed by its confidence estimation and atom-level alignment with the input image. MolScribe is publicly available through Python and web interfaces: https://github.com/thomas0809/MolScribe. [en_US]
dc.language.iso: en
dc.publisher: American Chemical Society [en_US]
dc.relation.isversionof: 10.1021/acs.jcim.2c01480 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-ShareAlike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: arxiv [en_US]
dc.title: MolScribe: Robust Molecular Structure Recognition with Image-to-Graph Generation [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Yujie Qian, Jiang Guo, Zhengkai Tu, Zhening Li, Connor W. Coley, and Regina Barzilay. Journal of Chemical Information and Modeling 2023 63 (7), 1925-1934. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Chemical Engineering [en_US]
dc.relation.journal: Journal of Chemical Information and Modeling [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2025-02-11T20:12:41Z
dspace.orderedauthors: Qian, Y; Guo, J; Tu, Z; Li, Z; Coley, CW; Barzilay, R [en_US]
dspace.date.submission: 2025-02-11T20:12:42Z
mit.journal.volume: 63 [en_US]
mit.journal.issue: 7 [en_US]
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

