EvoStruct-Sub: An accurate Gram-positive protein subcellular localization predictor using evolutionary and structural features

File version
Accepted Manuscript (AM)
Author(s)
Uddin, Md Raihan
Sharma, Alok
Farid, Dewan Md
Rahman, Md Mahmudur
Dehzangi, Abdollah
Shatabda, Swakkhar
Date
2018
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract

Determining the subcellular localization of proteins is considered an important step towards understanding their functions. Previous studies have mainly relied on Gene Ontology (GO) as the main feature source for this problem. However, features extracted from GO have been shown to be hard to use for new proteins whose GO annotations are unknown. At the same time, evolutionary information extracted from the Position Specific Scoring Matrix (PSSM) has been shown to be another effective feature source. Despite tremendous advances using these sources for feature extraction, the problem remains unsolved. In this study, we propose EvoStruct-Sub, which employs predicted structural information in conjunction with evolutionary information extracted directly from the protein sequence. To do this, we use several feature extraction methods that have shown promise in subcellular localization and similar studies to extract effective local and global discriminatory information. We then use a Support Vector Machine (SVM) as our classification technique to build EvoStruct-Sub. As a result, we improve Gram-positive subcellular localization prediction accuracy by up to 5.6% over previous studies, including those that used GO for feature extraction.
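
To make the idea of "global" and "local" PSSM-derived features concrete, the following is a minimal sketch and not the feature set used by EvoStruct-Sub, which the abstract does not specify. It assumes a PSSM given as a list of rows, one per residue, and computes two commonly used descriptors: a column-wise mean (a global composition feature) and sums of products of scores at consecutive positions (a local bigram feature). The function names and the toy matrix are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def pssm_composition(pssm: Matrix) -> List[float]:
    """Column-wise mean of the PSSM: a global, length-independent feature."""
    length = len(pssm)
    width = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(width)]


def pssm_bigram(pssm: Matrix) -> List[float]:
    """Sum over consecutive positions of score products: a local feature.

    Returns a flattened width x width matrix, where entry (m, n) is
    sum_i pssm[i][m] * pssm[i + 1][n].
    """
    width = len(pssm[0])
    features = [0.0] * (width * width)
    for current, following in zip(pssm, pssm[1:]):
        for m in range(width):
            for n in range(width):
                features[m * width + n] += current[m] * following[n]
    return features


def feature_vector(pssm: Matrix) -> List[float]:
    """Concatenate the global and local descriptors into one vector."""
    return pssm_composition(pssm) + pssm_bigram(pssm)


# Toy 3 x 2 matrix for illustration only; a real PSSM has 20 columns
# (one per amino acid) and one row per residue in the sequence.
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vector = feature_vector(toy)
```

For a real 20-column PSSM this yields 20 + 400 = 420 values per protein regardless of sequence length. Such fixed-length vectors could then be passed to an SVM implementation, for example scikit-learn's `SVC`, to train a classifier.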

Journal Title
Journal of Theoretical Biology
Rights Statement
© 2018 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
Subject
Mathematical sciences
Biological sciences
Other biological sciences not elsewhere classified