Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids

Author(s)
Raicar, Gaurav
Saini, Harsh
Dehzangi, Abdollah
Lal, Sunil
Sharma, Alok
Date
2016
Abstract

Predicting the three-dimensional (3-D) structure of a protein is an important task in bioinformatics and the biological sciences. However, predicting the 3-D structure directly from the primary structure is hard to achieve. Predicting the fold or structural class of a protein sequence is therefore generally used as an intermediate step in determining the protein's 3-D structure. Protein fold recognition (PFR) and structural class prediction (SCP) each require two steps: feature extraction and classification. Feature extraction techniques generally draw on syntactical, evolutionary, and physicochemical information. In this study, we explore the importance of the physicochemical properties of amino acids for improving PFR and SCP accuracies. To this end, we propose a Forward Consecutive Search (FCS) scheme that strategically selects physicochemical attributes to supplement existing feature extraction techniques for PFR and SCP. Using the proposed FCS scheme, an exhaustive search is conducted over all 544 existing physicochemical attributes, and a subset of these attributes is identified. Features extracted from the selected attributes are then combined with existing syntactical-based and evolutionary-based features, improving recognition and prediction performance on benchmark datasets.
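
The abstract does not spell out how the Forward Consecutive Search (FCS) scheme works, so the following is only a minimal sketch of a generic greedy forward selection, which is one plausible reading of "strategically select physicochemical attributes" and not the authors' actual procedure. The function name forward_select, the score callback, the max_selected parameter, and the toy attribute names and gain values are all hypothetical and introduced purely for illustration. In practice, score would be something like the cross-validated accuracy of a classifier trained on the existing features plus the features derived from the candidate attributes.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def forward_select(
    candidates: Sequence[Hashable],
    score: Callable[[List[Hashable]], float],
    max_selected: int = 5,
) -> Tuple[List[Hashable], float]:
    """Greedy forward selection (illustrative stand-in, NOT the paper's FCS).

    `score(subset)` is assumed to return a quality measure (higher is better)
    for the existing features augmented with the attributes in `subset`.
    """
    selected: List[Hashable] = []
    best_score = score(selected)  # baseline: no extra attributes added
    remaining = list(candidates)

    while remaining and len(selected) < max_selected:
        # Try adding each remaining attribute to the current subset.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trials, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no remaining attribute improves the score
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score

    return selected, best_score


def toy_score(subset: List[Hashable]) -> float:
    """Made-up scoring function; the gains below are not real results."""
    gains = {"hydrophobicity": 0.05, "polarity": 0.03, "volume": -0.01}
    return 0.70 + sum(gains[a] for a in subset)


chosen, final_score = forward_select(
    ["volume", "polarity", "hydrophobicity"], toy_score
)
print(chosen, final_score)
```

With the toy scores, the search adds the attribute with the largest gain first and stops once the only remaining attribute would lower the score. A real evaluation over several hundred attributes would replace toy_score with a trained classifier and a held-out or cross-validated accuracy estimate.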

Journal Title

Journal of Theoretical Biology

Volume

402

Subject

Mathematical sciences

Biological sciences

Other biological sciences not elsewhere classified
