Integrating single-cell RNA-seq datasets with substantial batch effects
Author(s)
Hrovatin, Karin; Moinfar, Amir Ali; Zappia, Luke; Parikh, Shrey; Lapuerta, Alejandro T.; Lengerich, Ben; Kellis, Manolis; Theis, Fabian J.
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
Integration of single-cell RNA-sequencing (scRNA-seq) datasets is standard in scRNA-seq analysis. Nevertheless, current computational methods struggle to harmonize datasets across systems such as species, organoids and primary tissue, or different scRNA-seq protocols, including single-cell and single-nuclei. Conditional variational autoencoders (cVAEs) are a popular integration method; however, existing strategies for stronger batch correction have limitations. Increasing the Kullback–Leibler divergence regularization does not improve integration, and adversarial learning removes biological signals. Here, we propose sysVI, a cVAE-based method employing VampPrior and cycle-consistency constraints. We show that sysVI integrates across systems and improves biological signals for downstream interpretation of cell states and conditions.
Date issued
2025-10-30
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Broad Institute of MIT and Harvard
Journal
BMC Genomics
Publisher
BioMed Central
Citation
Hrovatin, K., Moinfar, A., Zappia, L. et al. Integrating single-cell RNA-seq datasets with substantial batch effects. BMC Genomics 26, 974 (2025).
Version: Final published version