Origin of novel coronavirus causing COVID-19: A computational biology study using artificial intelligence
File version
Version of Record (VoR)
Author(s)
Abdelrazek, Mohamed
Nguyen, Dung Tien
Aryal, Sunil
Nguyen, Duc Thanh
Reddy, Sandeep
Nguyen, Quoc Viet Hung
Khatami, Amin
Nguyen, Thanh Tam
Hsu, Edbert B
Yang, Samuel
Abstract
The origin of the COVID-19 virus (SARS-CoV-2) has been intensely debated in the scientific community since the first infected cases were detected in December 2019. The disease has caused a global pandemic, leading to the deaths of thousands of people across the world, and thus finding the origin of this novel coronavirus is important in responding to and controlling the pandemic. Recent research results, based on comparative studies of its genomic sequences, suggest that bats or pangolins might be the hosts for SARS-CoV-2. This paper investigates the origin of SARS-CoV-2 using artificial intelligence (AI)-based unsupervised learning algorithms and raw genomic sequences of the virus. More than 300 genome sequences of COVID-19 infected cases collected from different countries are explored and analysed using unsupervised clustering methods. The results obtained from various AI-enabled experiments using clustering algorithms demonstrate that all examined SARS-CoV-2 genomes belong to a cluster that also contains bat and pangolin coronavirus genomes. This provides evidence strongly supporting the scientific hypotheses that bats and pangolins are probable hosts for SARS-CoV-2. At the whole-genome analysis level, our findings also indicate that bats are more likely than pangolins to be the hosts for the COVID-19 virus.
Journal Title
Machine Learning with Applications
Volume
9
Rights Statement
© 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Subject
Artificial intelligence
Machine learning
AI
Bat
COVID-19
Citation
Nguyen, TT; Abdelrazek, M; Nguyen, DT; Aryal, S; Nguyen, DT; Reddy, S; Nguyen, QVH; Khatami, A; Nguyen, TT; Hsu, EB; Yang, S, Origin of novel coronavirus causing COVID-19: A computational biology study using artificial intelligence, Machine Learning with Applications, 2022, 9, pp. 100328