scPhen: Single-Cell Phenotype Predictor for Alzheimer’s Disease
Author(s)
Guo, Sophie J.
DownloadThesis PDF (6.509Mb)
Advisor
Kellis, Manolis
Terms of use
Metadata
Show full item recordAbstract
Advances in artificial intelligence (AI) and generative AI for representation learning have transformed our ability to model complex biological systems. Single-cell RNA sequencing (scRNA-seq) provides unprecedented resolution into cellular heterogeneity, offering a powerful substrate for modeling disease circuitry. However, predicting patient-level phenotypes from scRNA-seq remains challenging due to limited sample sizes, variable cell counts, and the computational burden of modeling long-context dependencies. We present scPhen, a flexible, parametric deep-learning framework for phenotype prediction from single-cell transcriptomic data, applied here to Alzheimer’s disease (AD) as a paradigm of complex, heterogeneous pathology. scPhen consists of a cell embedding module and a patient embedding module, designed to capture both fine-grained molecular patterns and higher-order cell–cell relationships. The framework supports multiple architectural backbones, including Transformers, Graph Neural Networks (GNNs), and state-space models such as Mamba, Mamba2, and BiMamba2, allowing exploration of tunable components for optimized performance. Across classification and regression tasks, state-space models, and in particular BiMamba2, demonstrated superior predictive accuracy and computational efficiency compared to Transformer-based and hybrid approaches. We further integrated attention-based multiple instance learning to enable variable cell counts per patient and to prioritize phenotype-informative cellular subsets. Interpretability analyses using Integrated Gradients and cell-level attention scores revealed gene programs and cell populations associated with AD progression, highlighting known neuroinflammatory signatures and suggesting novel molecular targets. By unifying cutting-edge sequence modeling architectures with scalable single-cell analysis, scPhen provides a generalizable, high-resolution approach to phenotype prediction. While demonstrated here in AD, this framework is readily extensible to other complex diseases and multi-modal cellular datasets, bridging computational innovation and biological discovery.
Date issued
2025-09Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology