dc.contributor.author: Singh, Jaspreet
dc.contributor.author: Litfin, Thomas
dc.contributor.author: Paliwal, Kuldip
dc.contributor.author: Singh, Jaswinder
dc.contributor.author: Hanumanthappa, Anil Kumar
dc.contributor.author: Zhou, Yaoqi
dc.date.accessioned: 2021-05-20T06:29:16Z
dc.date.available: 2021-05-20T06:29:16Z
dc.date.issued: 2021
dc.identifier.issn: 1367-4803
dc.identifier.doi: 10.1093/bioinformatics/btab316
dc.identifier.uri: http://hdl.handle.net/10072/404530
dc.description.abstract: MOTIVATION: Knowledge of protein secondary structure and other one-dimensional structural properties is essential for accurate protein structure and function prediction. As a result, many methods have been developed for predicting these one-dimensional structural properties. However, most methods rely on evolutionary information that may not exist for many proteins due to a lack of sequence homologs. Moreover, obtaining evolutionary information is computationally intensive as the library of protein sequences continues to expand exponentially. Here we developed a new single-sequence method called SPOT-1D-Single based on a large training dataset of 39120 proteins deposited prior to 2016 and an ensemble of hybrid Long-Short-Term-Memory bidirectional neural network and convolutional neural network. RESULTS: We showed that SPOT-1D-Single consistently improves over SPIDER3-Single and ProteinUnet for secondary structure, solvent accessibility, contact number, and backbone angle prediction on all seven independent test sets (TEST2018, SPOT-2016, SPOT-2016-HQ, SPOT-2018, SPOT-2018-HQ, CASP12, and CASP13 free-modeling targets). For example, the accuracy of predicted three-state secondary structure ranges from 72.12% to 74.28% for SPOT-1D-Single, compared to 69.1% to 72.6% for SPIDER3-Single and 70.6% to 73% for ProteinUnet. SPOT-1D-Single also predicts SS3 and SS8 with 6.24% and 6.98% better accuracy, respectively, than SPOT-1D on SPOT-2018 proteins with no homologs (Neff=1). The new method's improvement over existing techniques is due to a larger training set combined with ensembled learning. AVAILABILITY: A standalone version of SPOT-1D-Single is available at https://github.com/jas-preet/SPOT-1D-Single. Direct predictions can also be made at https://sparks-lab.org/server/spot-1d-single. The datasets used in this research can also be downloaded from GitHub.
dc.description.peerreviewed: Yes
dc.language: eng
dc.publisher: Oxford University Press (OUP)
dc.relation.ispartofjournal: Bioinformatics
dc.relation.uri: http://purl.org/au-research/grants/ARC/DP210101875
dc.relation.grantID: DP210101875
dc.relation.funders: ARC
dc.subject.fieldofresearch: Mathematical Sciences
dc.subject.fieldofresearch: Biological Sciences
dc.subject.fieldofresearch: Information and Computing Sciences
dc.subject.fieldofresearchcode: 01
dc.subject.fieldofresearchcode: 06
dc.subject.fieldofresearchcode: 08
dc.title: SPOT-1D-Single: Improving the Single-Sequence-Based Prediction of Protein Secondary Structure, Backbone Angles, Solvent Accessibility and Half-Sphere Exposures using a Large Training Set and Ensembled Deep Learning
dc.type: Journal article
dc.type.description: C1 - Articles
dcterms.bibliographicCitation: Singh, J; Litfin, T; Paliwal, K; Singh, J; Hanumanthappa, AK; Zhou, Y, SPOT-1D-Single: Improving the Single-Sequence-Based Prediction of Protein Secondary Structure, Backbone Angles, Solvent Accessibility and Half-Sphere Exposures using a Large Training Set and Ensembled Deep Learning, Bioinformatics, 2021
dcterms.dateAccepted: 2021-04-26
dc.date.updated: 2021-05-20T03:26:51Z
dc.description.version: Accepted Manuscript (AM)
gro.description.notepublic: This publication has been entered as an advanced online version in Griffith Research Online.
gro.rights.copyright: © 2021 Oxford University Press. This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version SPOT-1D-Single: Improving the Single-Sequence-Based Prediction of Protein Secondary Structure, Backbone Angles, Solvent Accessibility and Half-Sphere Exposures using a Large Training Set and Ensembled Deep Learning, Bioinformatics, 2021 is available online at: https://doi.org/10.1093/bioinformatics/btab316.
gro.hasfulltext: Full Text
gro.griffith.author: Singh, Jaswinder
gro.griffith.author: Litfin, Tom
gro.griffith.author: Singh, Jaspreet
gro.griffith.author: Paliwal, Kuldip K.
gro.griffith.author: Hanumanthappa, Anil Kumar
gro.griffith.author: Zhou, Yaoqi


This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
