Enhanced Feature Extraction from Evolutionary Profiles for Protein Fold Recognition
Primary Supervisor
Paliwal, Kuldip
Other Supervisors
So, Stephen
Abstract
Proteins are biological macromolecules that play important roles in almost all biological reactions. The function of a protein depends on the shape it folds into, which in turn depends on the protein's amino acid sequence. Experimental approaches for determining a protein's 3D structure are expensive and time consuming, so computational methods for determining the structure from the amino acid sequence are desired. Methods for directly computing the 3D structure of a protein exist, but they are impractical for large proteins and high-resolution models because of the large search space. Instead of trying to find the 3D structure directly from first principles, the primary structure can be compared to proteins with known 3D structure. A 'fold' is a way of classifying proteins that have the same major secondary structures in the same arrangement and with the same topological connections. Protein Fold Recognition (PFR) is an important step towards determining a protein's structure, because it simplifies the protein structure prediction problem. PFR is a multi-class classification problem that can be solved using machine learning techniques. The PFR problem has been widely studied, with feature extraction approaches that use counts of amino acids and pairs of amino acids, physicochemical information, evolutionary information from the Position Specific Scoring Matrix (PSSM), and structural information from the protein's predicted secondary structure. These approaches work, but with limited success. Current state-of-the-art features use information from the PSSM as well as the predicted secondary structure.
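As an illustration of the simplest feature extraction approach named in the abstract (counts of amino acids and pairs of amino acids), the following is a minimal sketch only, not the method developed in the thesis. The function names, the normalisation by sequence length, and the 20-letter alphabet ordering are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return 20 values: the relative frequency of each amino acid."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    return [counts[a] / len(sequence) for a in AMINO_ACIDS]


def dipeptide_features(sequence):
    """Return 400 values: the relative frequency of each adjacent pair."""
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    pairs = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = len(sequence) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def extract_features(sequence):
    """Concatenate both feature sets into one 420-dimensional vector."""
    return composition_features(sequence) + dipeptide_features(sequence)
```

A fixed-length vector of this kind could then be passed to any multi-class classifier. Features derived from the PSSM or from predicted secondary structure, which the abstract identifies as the current state of the art, are not covered by this sketch.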
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
Griffith School of Engineering
Rights Statement
The author owns the copyright in this thesis, unless stated otherwise.
Item Access Status
Public
Subject
Proteins
protein amino acid sequence
Protein Fold Recognition (PFR)
Position Specific Scoring Matrix (PSSM)