A Tri-Gram Based Feature Extraction Technique Using Linear Probabilities of Position Specific Scoring Matrix for Protein Fold Recognition

View/ Open
Author(s)
Paliwal, Kuldip K
Sharma, Alok
Lyons, James
Dehzangi, Abdollah
Griffith University Author(s)
Year published
2014
Metadata
Show full item recordAbstract
In biological sciences, the deciphering of a three dimensional structure of a protein sequence is considered to be an important and challenging task. The identification of protein folds from primary protein sequences is an intermediate step in discovering the three dimensional structure of a protein. This can be done by utilizing feature extraction technique to accurately extract all the relevant information followed by employing a suitable classifier to label an unknown protein. In the past, several feature extraction techniques have been developed but with limited recognition accuracy only. In this study, we have developed ...
View more >In biological sciences, the deciphering of a three dimensional structure of a protein sequence is considered to be an important and challenging task. The identification of protein folds from primary protein sequences is an intermediate step in discovering the three dimensional structure of a protein. This can be done by utilizing feature extraction technique to accurately extract all the relevant information followed by employing a suitable classifier to label an unknown protein. In the past, several feature extraction techniques have been developed but with limited recognition accuracy only. In this study, we have developed a feature extraction technique based on tri-grams computed directly from Position Specific Scoring Matrices. The effectiveness of the feature extraction technique has been shown on two benchmark datasets. The proposed technique exhibits up to 4.4% improvement in protein fold recognition accuracy compared to the state-of-the-art feature extraction techniques.
View less >
View more >In biological sciences, the deciphering of a three dimensional structure of a protein sequence is considered to be an important and challenging task. The identification of protein folds from primary protein sequences is an intermediate step in discovering the three dimensional structure of a protein. This can be done by utilizing feature extraction technique to accurately extract all the relevant information followed by employing a suitable classifier to label an unknown protein. In the past, several feature extraction techniques have been developed but with limited recognition accuracy only. In this study, we have developed a feature extraction technique based on tri-grams computed directly from Position Specific Scoring Matrices. The effectiveness of the feature extraction technique has been shown on two benchmark datasets. The proposed technique exhibits up to 4.4% improvement in protein fold recognition accuracy compared to the state-of-the-art feature extraction techniques.
View less >
Journal Title
IEEE Transactions on NanoBioscience
Volume
13
Issue
1
Copyright Statement
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subject
Biomedical engineering