Show simple item record

dc.contributor.author: Hanson, Jack
dc.contributor.author: Litfin, Thomas
dc.contributor.author: Paliwal, Kuldip
dc.contributor.author: Zhou, Yaoqi
dc.date.accessioned: 2020-06-15T05:13:34Z
dc.date.available: 2020-06-15T05:13:34Z
dc.date.issued: 2020
dc.identifier.issn: 1367-4803
dc.identifier.doi: 10.1093/bioinformatics/btz691
dc.identifier.uri: http://hdl.handle.net/10072/394650
dc.description.abstract: Motivation: Protein intrinsic disorder describes the tendency of sequence residues to not fold into a rigid three-dimensional shape by themselves. However, some of these disordered regions can transition from disorder to order when interacting with another molecule in segments known as molecular recognition features (MoRFs). Previous analysis has shown that these MoRF regions are indirectly encoded within the prediction of residue disorder as low-confidence predictions [i.e. in a semi-disordered state P(D)≈0.5]. Thus, what has been learned for disorder prediction may be transferable to MoRF prediction. Transferring the internal characterization of protein disorder for the prediction of MoRF residues would allow us to take advantage of the large training set available for disorder prediction, enabling the training of larger analytical models than is currently feasible on the small number of currently available annotated MoRF proteins. In this paper, we propose a new method for MoRF prediction by transfer learning from the SPOT-Disorder2 ensemble models built for disorder prediction. Results: We confirm that directly training on the MoRF set with a randomly initialized model yields substantially poorer performance on independent test sets than by using the transfer-learning-based method SPOT-MoRF, for both deep and simple networks. Its comparison to current state-of-the-art techniques reveals its superior performance in identifying MoRF binding regions in proteins across two independent testing sets, including our new dataset of >800 protein chains. These test chains share <30% sequence similarity to all training and validation proteins used in SPOT-Disorder2 and SPOT-MoRF, and provide a much-needed large-scale update on the performance of current MoRF predictors. The method is expected to be useful in locating functional disordered regions in proteins.
dc.description.peerreviewed: Yes
dc.language: English
dc.language.iso: eng
dc.publisher: Oxford University Press
dc.relation.ispartofpagefrom: 1107
dc.relation.ispartofpageto: 1113
dc.relation.ispartofissue: 4
dc.relation.ispartofjournal: Bioinformatics
dc.relation.ispartofvolume: 36
dc.subject.fieldofresearch: Mathematical sciences
dc.subject.fieldofresearch: Biological sciences
dc.subject.fieldofresearch: Information and computing sciences
dc.subject.fieldofresearchcode: 49
dc.subject.fieldofresearchcode: 31
dc.subject.fieldofresearchcode: 46
dc.subject.keywords: Science & Technology
dc.subject.keywords: Life Sciences & Biomedicine
dc.subject.keywords: Technology
dc.subject.keywords: Physical Sciences
dc.subject.keywords: Biochemical Research Methods
dc.title: Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning
dc.type: Journal article
dc.type.description: C1 - Articles
dcterms.bibliographicCitation: Hanson, J; Litfin, T; Paliwal, K; Zhou, Y, Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning, Bioinformatics, 2020, 36 (4), pp. 1107-1113
dcterms.dateAccepted: 2019-08-31
dc.date.updated: 2020-06-15T05:11:57Z
gro.hasfulltext: No Full Text
gro.griffith.author: Litfin, Tom
gro.griffith.author: Paliwal, Kuldip K.


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.