    Protein Structure Prediction by Recurrent and Convolutional Deep Neural Network Architectures

    Author(s)
    Hanson, Jack S.
    Primary Supervisor
    Paliwal, Kuldip
    Other Supervisors
    So, Stephen
    Year published
    2018-11-30
    Abstract
    In this thesis, the application of convolutional and recurrent machine learning techniques to several key structural properties of proteins is explored. Chapter 2 presents the first application of an LSTM-BRNN in structural bioinformatics. The method, called SPOT-Disorder, predicts the per-residue probability of a protein being intrinsically disordered (i.e. unstructured, or flexible). Using this methodology, SPOT-Disorder achieved the highest accuracy in the literature without separating short and long disordered regions during training, as was required in previous models, and was additionally proven capable of indirectly discerning functional sites located in disordered regions.

    Chapter 3 extends the application of an LSTM-BRNN to a two-dimensional problem in the prediction of protein contact maps. Protein contact maps describe the intra-sequence distance between each residue pairing at a distance cutoff, providing key restraints towards the possible conformations of a protein. This work, entitled SPOT-Contact, introduced the coupling of two-dimensional LSTM-BRNNs with ResNets to maximise dependency propagation in order to achieve the highest reported accuracies for contact map precision. Several models of varying architectures were trained and combined as an ensemble predictor in order to minimise incorrect generalisations.

    Chapter 4 discusses the utilisation of an ensemble of LSTM-BRNNs and ResNets to predict local protein one-dimensional structural properties. The method, called SPOT-1D, predicts a wide range of local structural descriptors, including several solvent exposure metrics, secondary structure, and real-valued backbone angles. SPOT-1D was significantly improved by the inclusion of the outputs of SPOT-Contact in the input features. Using this topology led to the best reported accuracy metrics for all predicted properties. The protein structures constructed by the backbone angles predicted by SPOT-1D achieved the lowest average error from their native structures in the literature.

    Chapter 5 presents an update on SPOT-Disorder, as it employs the inputs from SPOT-1D in conjunction with an ensemble of LSTM-BRNNs and Inception Residual Squeeze and Excitation networks to predict protein intrinsic disorder. This model confirmed the enhancement provided by utilising the coupled architectures over the LSTM-BRNN solely, whilst also introducing a new convolutional format to the bioinformatics field.

    The work in Chapter 6 utilises the same topology from SPOT-1D for single-sequence prediction of protein intrinsic disorder in SPOT-Disorder-Single. Single-sequence prediction describes the prediction of a protein's properties without the use of evolutionary information. While evolutionary information generally improves the performance of a computational model, it comes at the expense of a greatly increased computational and time load. Removing this from the model allows for genome-scale protein analysis at a minor drop in accuracy. However, models trained without evolutionary profiles can be more accurate for proteins with limited and therefore unreliable evolutionary information.
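
As a rough illustration of the contact-map representation described for Chapter 3, the sketch below derives a binary contact map from a list of 3D coordinates, one per residue. It is not taken from the thesis and is not SPOT-Contact, which predicts such maps from sequence-derived features. The function name `contact_map`, the `min_separation` parameter, the 8 Å default cutoff, and the toy coordinates are all assumptions made for this example.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Build a symmetric binary contact map from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one representative atom per residue.
    cutoff: distance threshold in angstroms (8.0 is an assumed default).
    min_separation: minimum sequence separation |i - j| for a pair to count.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            # Two residues are "in contact" if they are closer than the cutoff.
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical toy chain: four residues spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

In this toy chain only the first and last residues are farther apart than the cutoff, so every other off-diagonal entry is 1. A predictor of the kind the abstract describes would output a probability for each residue pair rather than reading the map off known coordinates.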
    Thesis Type
    Thesis (PhD Doctorate)
    Degree Program
    Doctor of Philosophy (PhD)
    School
    School of Eng & Built Env
    DOI
    https://doi.org/10.25904/1912/3830
    Copyright Statement
    The author owns the copyright in this thesis, unless stated otherwise.
    Subject
    Protein structure prediction
    Deep neural network architectures
    SPOT-disorder
    SPOT-contact
    SPOT-1D
    SPOT-disorder-single
    Publication URI
    http://hdl.handle.net/10072/382722
    Collection
    • Theses - Higher Degree by Research
