EVlncRNA-Dpred: improved prediction of experimentally validated lncRNAs by deep learning

File version

Version of Record (VoR)

Author(s)
Zhou, Bailing
Ding, Maolin
Feng, Jing
Ji, Baohua
Huang, Pingping
Zhang, Junye
Yu, Xue
Cao, Zanxia
Yang, Yuedong
Zhou, Yaoqi
Wang, Jihua
Date
2022
Abstract

Long non-coding RNAs (lncRNAs) play essential roles in nearly every biological process and disease. Many algorithms have been developed to distinguish lncRNAs from mRNAs in transcriptomic data, facilitating the discovery of more than 600 000 lncRNAs. However, only a tiny fraction (<1%) of lncRNA transcripts (~4000) have been further validated by low-throughput experiments (EVlncRNAs). Given the cost and labor-intensive nature of experimental validation, computational tools are needed to prioritize potentially functional lncRNAs, because many lncRNAs from high-throughput sequencing (HTlncRNAs) could result from transcriptional noise. Here, we employed deep learning algorithms to separate EVlncRNAs from HTlncRNAs and mRNAs. To overcome the challenge of small datasets, we employed a three-layer deep-learning neural network (DNN) with a K-mer feature as the input and a small convolutional neural network (CNN) with one-hot encoding as the input. Three separate models were trained for human (h), mouse (m) and plant (p). The final concatenated models (EVlncRNA-Dpred (h), EVlncRNA-Dpred (m) and EVlncRNA-Dpred (p)) provided substantial improvement over a previous model based on support-vector machines (EVlncRNA-pred). For example, EVlncRNA-Dpred (h) achieved 0.896 for the area under the receiver-operating characteristic curve, compared with 0.582 given by the sequence-based EVlncRNA-pred model. The models developed here should be useful for screening lncRNA transcripts for experimental validation. EVlncRNA-Dpred is available as a web server at https://www.sdklab-biophysics-dzu.net/EVlncRNA-Dpred/index.html, and the data and source code are freely available along with the web server.
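
The two input representations named in the abstract, a K-mer feature and one-hot encoding, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the choice of k = 3, the four-letter alphabet, the handling of ambiguous bases and the fixed padding length are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet; U is mapped to T below


def kmer_frequencies(seq, k=3):
    """Return normalized counts of every k-mer over ACGT, in lexicographic order.

    The value of k is an illustrative assumption, not taken from the paper.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other ambiguity codes
            counts[window] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]


def one_hot(seq, length):
    """Encode seq as a length x 4 matrix, truncating or zero-padding to `length`.

    Bases outside ACGT become all-zero rows (an assumed convention).
    """
    seq = seq.upper().replace("U", "T")[:length]
    rows = [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]
    rows.extend([0] * len(ALPHABET) for _ in range(length - len(seq)))
    return rows
```

In this sketch, `kmer_frequencies` yields a fixed-length vector of 4^k values regardless of transcript length, which is the kind of input a fully connected network can consume, while `one_hot` yields a position-by-base matrix of the kind a convolutional network typically scans.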

Journal Title

Briefings in Bioinformatics

Volume

24

Issue

1

Rights Statement

© The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Subject

Deep learning

Bioinformatics and computational biology

Science & Technology

Life Sciences & Biomedicine

Biochemical Research Methods

Mathematical & Computational Biology

Biochemistry & Molecular Biology

Citation

Zhou, B; Ding, M; Feng, J; Ji, B; Huang, P; Zhang, J; Yu, X; Cao, Z; Yang, Y; Zhou, Y; Wang, J, EVlncRNA-Dpred: improved prediction of experimentally validated lncRNAs by deep learning, Briefings in Bioinformatics, 2022, 24 (1), pp. bbac583
