Ensemble of Diversely Trained Support Vector Machines for Protein Fold Recognition
Protein Fold Recognition (PFR) is defined as assigning a given protein to a fold based on its major secondary structure. PFR is considered an important step toward protein structure prediction and drug design. However, it remains an unsolved problem in biological science and bioinformatics. In this study, we explore the impact of two novel feature extraction methods, namely overlapped segmented distribution and overlapped segmented autocorrelation, which provide more local discriminatory information for PFR than previously proposed methods in the literature. We study the impact of the proposed feature extraction methods using 15 promising physicochemical attributes of the amino acids. We then propose an ensemble of Support Vector Machines (SVMs) that are diversely trained on features extracted from different physicochemical attributes, which improves protein fold prediction accuracy by up to 5% over similar studies in the literature.
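The abstract does not give the exact definition of overlapped segmented autocorrelation, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the protein has already been mapped to a list of numeric values for a single physicochemical attribute, splits that list into a fixed number of segments that are widened on both sides so that neighbours overlap, and computes a plain lagged-product autocorrelation inside each segment. The function names and the parameters `n_segments`, `overlap`, and `max_lag` are hypothetical.

```python
from typing import List, Sequence


def overlapped_segments(values: Sequence[float], n_segments: int,
                        overlap: float) -> List[List[float]]:
    """Split values into n_segments pieces, each widened on both sides.

    overlap is the fraction of the base segment length added to each side
    (assumed parameterisation; the paper may define overlap differently).
    """
    if n_segments <= 0 or overlap < 0:
        raise ValueError("n_segments must be positive and overlap non-negative")
    n = len(values)
    extra = int(overlap * n / n_segments)  # residues borrowed from neighbours
    segments = []
    for k in range(n_segments):
        start = max(0, k * n // n_segments - extra)
        end = min(n, (k + 1) * n // n_segments + extra)
        segments.append(list(values[start:end]))
    return segments


def autocorrelation(values: Sequence[float], max_lag: int) -> List[float]:
    """Average product of values that are lag positions apart, for lag 1..max_lag."""
    n = len(values)
    features = []
    for lag in range(1, max_lag + 1):
        if lag >= n:
            features.append(0.0)  # segment too short for this lag
            continue
        total = sum(values[i] * values[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


def segmented_autocorrelation(values: Sequence[float], n_segments: int,
                              overlap: float, max_lag: int) -> List[float]:
    """Concatenate per-segment autocorrelation features into one vector."""
    features: List[float] = []
    for segment in overlapped_segments(values, n_segments, overlap):
        features.extend(autocorrelation(segment, max_lag))
    return features


# Toy attribute profile (made-up numbers, not a real protein).
profile = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
vector = segmented_autocorrelation(profile, n_segments=2, overlap=0.25, max_lag=2)
print(len(vector))
```

Because the number of segments is fixed, the vector always has `n_segments * max_lag` entries regardless of protein length, which is convenient for a classifier such as an SVM that expects fixed-length input. Under the same assumptions, one such vector per physicochemical attribute could be used to train one ensemble member each.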
Lecture Notes in Computer Science: Intelligent Information and Database Systems 5th Asian Conference Proceedings, Part I
© 2013 Springer-Verlag Berlin Heidelberg. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. Please refer to the conference's website for access to the definitive, published version.
Pattern Recognition and Data Mining