    Protein Function Prediction by Machine Learning

    View/Open
    Taherzadeh,Ghazaleh_Thesis_Redacted.pdf (2.564Mb)
    Author(s)
    Taherzadeh, Ghazaleh
    Primary Supervisor
    Liew, Wee-Chung
    Zhou, Yaoqi
    Other Supervisors
    Yang, Yuedong
    Year published
    2018-05
    Abstract
    With genomic data accumulating at an overwhelming rate, determining the functions of previously unseen proteins is among the most challenging problems. While protein functions can often be inferred from homologous counterparts with known functions in other species, not all proteins have homologs whose functions have been determined. Proteins perform their functional roles through interactions with other biologically active molecules. Thus, the first step in identifying a protein's function through its interactions is to detect its potential binding sites. Moreover, protein functions may change when proteins undergo modifications. Experimental determination of functions for millions of new proteins is not practical because of the vast number of possible functions to be tested. It is therefore highly desirable to have computational tools that prioritize possible functions for new proteins. In this thesis, we proposed machine learning-based methods for predicting putative binding sites of proteins interacting with small molecules, specifically peptides and carbohydrates, as well as putative sites of post-translational modifications (PTMs). The main contributions of our methods lie in three aspects. First, we proposed the first model for predicting protein-peptide binding sites without knowledge of the protein structure (Taherzadeh et al. 2016). The method was further improved by using experimental structures. Its performance remains robust even when unbound structures or quality model structures built from homologs are used, indicating the wide applicability of the method (Taherzadeh et al. 2017). Second, we established the first publicly available tool for predicting carbohydrate-binding sites in the absence of protein structures (Taherzadeh et al. 2016). The accuracy of this method is supported by its prediction of more binding residues in carbohydrate-binding proteins than in non-binding proteins in the human proteome, and by its successful application to the 1000 Genomes Project. Third, we proposed a method for predicting sites of lysine malonylation, a post-translational modification (Taherzadeh et al.). This predictive model, built from M. musculus proteins, achieved comparable performance when tested on H. sapiens proteins. All of these methods were thoroughly assessed by cross-validation and on independent test sets after removing homologous sequences. Consistent performance on the cross-validation and independent datasets confirmed the accuracy and robustness of the predictive methods. All methods significantly outperform existing techniques.
    Thesis Type
    Thesis (PhD Doctorate)
    Degree Program
    Doctor of Philosophy (PhD)
    School
    School of Info & Comm Tech
    DOI
    https://doi.org/10.25904/1912/1478
    Copyright Statement
    The author owns the copyright in this thesis, unless stated otherwise.
    Subject
    Protein function prediction
    Machine learning
    Peptides
    Carbohydrates
    Lysine malonylation
    Publication URI
    http://hdl.handle.net/10072/376837
    Collection
    • Theses - Higher Degree by Research
