scAdapt: virtual adversarial domain adaptation network for single cell RNA-seq data classification across platforms and species

File version
Accepted Manuscript (AM)
Author(s)
Zhou, Xiang
Chai, Hua
Zeng, Yuansong
Zhao, Huiying
Yang, Yuedong
Year published
2021
Abstract
In single cell analyses, cell types are conventionally identified from the expression of known marker genes, whose identification is time-consuming and irreproducible. To address this issue, many supervised approaches have been developed that identify cell types by drawing on the rapidly accumulating public datasets. However, these approaches are sensitive to batch effects and biological variation, because data distributions differ in cross-platform and cross-species predictions. In this study, we developed scAdapt, a virtual adversarial domain adaptation network, to transfer cell labels between datasets with batch effects. scAdapt uses both the labeled source data and the unlabeled target data to train an enhanced classifier, and aligns the labeled source centroids with the pseudo-labeled target centroids to generate a joint embedding. scAdapt outperformed existing methods for classification on simulated, cross-platform, cross-species, spatial transcriptomic and COVID-19 immune datasets. Further quantitative evaluations and visualizations of the aligned embeddings confirm its superior cell mixing and its ability to preserve the discriminative cluster structure present in the original datasets.
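The centroid alignment step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`class_centroids`, `centroid_alignment_loss`), the squared Euclidean distance, and the choice to skip classes missing from either domain are all assumptions made for illustration. The abstract only states that labeled source centroids and pseudo-labeled target centroids are aligned.

```python
import numpy as np

def class_centroids(embeddings, labels, n_classes):
    """Mean embedding per class; rows for classes with no cells stay NaN."""
    centroids = np.full((n_classes, embeddings.shape[1]), np.nan)
    for c in range(n_classes):
        mask = labels == c
        if mask.any():
            centroids[c] = embeddings[mask].mean(axis=0)
    return centroids

def centroid_alignment_loss(src_emb, src_labels, tgt_emb, tgt_pseudo, n_classes):
    """Hypothetical alignment loss: mean squared distance between matching
    source and target class centroids (assumed form, not from the paper)."""
    src_c = class_centroids(src_emb, src_labels, n_classes)
    tgt_c = class_centroids(tgt_emb, tgt_pseudo, n_classes)
    # Compare only classes that appear in both the source and the target.
    shared = ~np.isnan(src_c).any(axis=1) & ~np.isnan(tgt_c).any(axis=1)
    if not shared.any():
        return 0.0
    diff = src_c[shared] - tgt_c[shared]
    return float((diff ** 2).sum(axis=1).mean())

# Toy example: 2-D embeddings, two cell types.
src_emb = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
src_labels = np.array([0, 0, 1])      # known source labels
tgt_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt_pseudo = np.array([0, 1])         # pseudo-labels predicted by a classifier
loss = centroid_alignment_loss(src_emb, src_labels, tgt_emb, tgt_pseudo, 2)
```

In a setup like this, the loss would shrink as cells of the same predicted type from the two datasets move closer together in the shared embedding, which is the intuition behind a joint embedding with good cell mixing.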
Journal Title
Briefings in Bioinformatics
Volume
22
Issue
6
Copyright Statement
© 2021 Oxford University Press. This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Briefings in Bioinformatics following peer review. The definitive publisher-authenticated version scAdapt: virtual adversarial domain adaptation network for single cell RNA-seq data classification across platforms and species, Briefings in Bioinformatics, 2021, 22 (6) is available online at: https://doi.org/10.1093/bib/bbab281.
Subject
Biochemistry and cell biology
Theory of computation
Other information and computing sciences
Science & Technology
Life Sciences & Biomedicine
Biochemical Research Methods
Mathematical & Computational Biology
Biochemistry & Molecular Biology