Ensemble of Diversely Trained Support Vector Machines for Protein Fold Recognition
Author(s)
Sattar, Abdul
Editor(s)
Selamat, A
Nguyen, NT
Haron, H
Size
296568 bytes
File type(s)
application/pdf
Location
Kuala Lumpur, MALAYSIA
Abstract
Protein Fold Recognition (PFR) is defined as assigning a given protein to a fold based on its major secondary structure. PFR is considered an important step toward protein structure prediction and drug design. However, it remains an unsolved problem in biological science and bioinformatics. In this study, we explore the impact of two novel feature extraction methods, overlapped segmented distribution and overlapped segmented autocorrelation, intended to provide more local discriminatory information for PFR than previously proposed methods found in the literature. We study the impact of the proposed feature extraction methods using 15 promising physicochemical attributes of the amino acids. We then propose an ensemble of Support Vector Machines (SVMs) that are diversely trained using features extracted from different physicochemical-based attributes, which improves protein fold prediction accuracy by up to 5% over similar studies found in the literature.
Conference Title
INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2013), PT I
Volume
7802
Issue
PART 1
Rights Statement
© 2013 Springer-Verlag Berlin Heidelberg. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. Please refer to the conference's website for access to the definitive, published version.
Subject
Pattern Recognition and Data Mining