
dc.contributor.advisor: Paliwal, Kuldip
dc.contributor.author: Heffernan, Rhys
dc.date.accessioned: 2018-11-29T01:25:15Z
dc.date.available: 2018-11-29T01:25:15Z
dc.date.issued: 2018-11-17
dc.identifier.uri: http://hdl.handle.net/10072/381401
dc.description.abstract: In this thesis we tackle the protein structure prediction subproblems listed previously by applying state-of-the-art deep learning techniques. The work in Chapter 2 presents the method SPIDER. In this method, state-of-the-art deep learning is applied iteratively to the task of predicting the backbone torsion angles φ and ψ and the dihedral angles θ and τ, using evolutionary-derived sequence profiles and physicochemical properties of amino acid residues. This work is the first method for the sequence-based prediction of θ and τ angles. Chapter 3 presents the method SPIDER2. This method takes the iterative deep learning applied in SPIDER and extends it to the prediction of three-state secondary structure, solvent accessible surface area, and φ, ψ, θ, and τ angles, achieving the best reported prediction accuracies for all of them (at the date of publication). Chapter 4 builds further on the work of the previous chapters by adding the prediction of half sphere exposure (both Cα and Cβ based) and contact numbers to SPIDER2, in a method called SPIDER2-HSE. In Chapter 5, Long Short-Term Memory Bidirectional Recurrent Neural Networks (LSTM-BRNNs) were applied to the prediction of three-state secondary structure, solvent accessible surface area, and φ, ψ, θ, and τ angles, as well as half sphere exposure and contact numbers. Methods previously used for these predictions (including SPIDER2) were typically window based. That is, the input data made available to the model for a given residue comprises information for only that residue and a number of residues on either side of it in the sequence (in the range of 10-20 residues on each side). The use of LSTM-BRNNs in this method allows SPIDER3 to better learn both long- and short-range interactions within proteins. This advancement again led to the best reported accuracies for all predicted structural properties.
In Chapter 6, the LSTM-BRNN model used in SPIDER3 is applied to the prediction of the same structural properties, plus eight-state secondary structure, using only single-sequence inputs. That is, structural properties were predicted without using any evolutionary information. The result is a method that provides not only the best reported single-sequence secondary structure and solvent accessible surface area predictions, but also the first reported single-sequence-based prediction of half sphere exposure, contact numbers, and φ, ψ, θ, and τ angles. This study is important because most proteins have few homologous sequences, and their evolutionary profiles are inaccurate and time-consuming to calculate. This single-sequence-based technique allows for fast genome-scale screening analysis of protein one-dimensional structural properties. [en_US]
dc.language: English
dc.language.iso: en
dc.publisher: Griffith University
dc.publisher.place: Brisbane
dc.subject.keywords: Protein structure [en_US]
dc.subject.keywords: Machine learning techniques [en_US]
dc.subject.keywords: Backbone torsion [en_US]
dc.subject.keywords: Deep learning [en_US]
dc.subject.keywords: Half sphere exposure [en_US]
dc.subject.keywords: Homologous sequences [en_US]
dc.title: Addressing One-Dimensional Protein Structure Prediction Problems with Machine Learning Techniques [en_US]
dc.type: Griffith thesis [en_US]
gro.faculty: Science, Environment, Engineering and Technology [en_US]
gro.rights.copyright: The author owns the copyright in this thesis, unless stated otherwise.
gro.hasfulltext: Full Text
gro.thesis.degreelevel: Thesis (PhD Doctorate) [en_US]
gro.thesis.degreeprogram: Doctor of Philosophy (PhD) [en_US]
gro.department: School of Eng & Built Env [en_US]

