Show simple item record

dc.contributor.author: Chai, H
dc.contributor.author: Zhou, X
dc.contributor.author: Zhang, Z
dc.contributor.author: Rao, J
dc.contributor.author: Zhao, H
dc.contributor.author: Yang, Y
dc.date.accessioned: 2021-05-31T01:44:51Z
dc.date.available: 2021-05-31T01:44:51Z
dc.date.issued: 2021
dc.identifier.issn: 0010-4825
dc.identifier.doi: 10.1016/j.compbiomed.2021.104481
dc.identifier.uri: http://hdl.handle.net/10072/404766
dc.description.abstract: Background: Genomic information is now widely used for precise cancer treatment. Because any individual type of omics data represents only a single view that suffers from data noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, it is challenging to integrate multi-omics data effectively because of the large number of redundant variables and the relatively small sample size. With recent progress in deep learning techniques, the Autoencoder has been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed the denoising Autoencoder to obtain a robust representation of the multi-omics data, and then used the learned representative features to estimate patients’ risks. Results: Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that the models achieved an average C-index value of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature. Conclusion: Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, it is an effective way to discover cancer prognosis-related genes.
dc.description.peerreviewed: Yes
dc.language: en
dc.publisher: Elsevier BV
dc.relation.ispartofpagefrom: 104481
dc.relation.ispartofjournal: Computers in Biology and Medicine
dc.relation.ispartofvolume: 134
dc.subject.fieldofresearch: Engineering
dc.subject.fieldofresearch: Biomedical and clinical sciences
dc.subject.fieldofresearchcode: 40
dc.subject.fieldofresearchcode: 32
dc.title: Integrating multi-omics data through deep learning for accurate cancer prognosis prediction
dc.type: Journal article
dc.type.description: C1 - Articles
dcterms.bibliographicCitation: Chai, H; Zhou, X; Zhang, Z; Rao, J; Zhao, H; Yang, Y, Integrating multi-omics data through deep learning for accurate cancer prognosis prediction, Computers in Biology and Medicine, 2021, 134, pp. 104481
dcterms.license: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.date.updated: 2021-05-31T01:39:44Z
dc.description.version: Accepted Manuscript (AM)
gro.rights.copyright: © 2021 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
gro.hasfulltext: Full Text
gro.griffith.author: Yang, Yuedong

This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.