Single-sequence-based prediction of protein secondary structures and solvent accessibility by deep whole-sequence learning
Author(s)
Heffernan, Rhys
Paliwal, Kuldip
Lyons, James
Singh, Jaswinder
Yang, Yuedong
Zhou, Yaoqi
Year published
2018
Abstract
Predicting protein structure from sequence alone is challenging. Thus, the majority of methods for protein structure prediction rely on evolutionary information from multiple sequence alignments. In previous work we showed that Long Short-Term Memory Bidirectional Recurrent Neural Networks (LSTM-BRNNs) improved over regular neural networks by better capturing intra-sequence dependencies. Here we present a single-sequence-based prediction method employing LSTM-BRNNs (SPIDER3-Single) that consistently achieves a three-state secondary structure (Q3) accuracy of 72.5% and a correlation coefficient of 0.67 between predicted and actual solvent accessible surface area. Moreover, it yields reasonably accurate predictions of eight-state secondary structure, main-chain angles (backbone ϕ and ψ torsion angles and Cα-atom-based θ and τ angles), half-sphere exposure, and contact number. The method is more accurate than the corresponding evolutionary-based method for proteins with few sequence homologs, and is computationally efficient for large-scale screening of protein structural properties. It is available as an option in the SPIDER3 server, and as a standalone version for download, at http://sparks-lab.org. © 2018 Wiley Periodicals, Inc.
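The two headline figures in the abstract, Q3 accuracy and the correlation coefficient for solvent accessible surface area, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the eight-to-three-state mapping is a commonly used convention that the abstract does not confirm, the correlation is assumed to be Pearson's, and the input values are made up for illustration.

```python
import math

# Assumed reduction of the eight DSSP states to three classes (helix, strand,
# coil). This is a common convention; the abstract does not state the mapping.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "C": "C",   # turns, bends, and coil
}


def to_three_state(ss8):
    """Convert an eight-state secondary structure string to three states."""
    return "".join(DSSP8_TO_3[state] for state in ss8)


def q3_accuracy(predicted, actual):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(actual)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up toy data, not taken from the paper.
actual_ss3 = to_three_state("CHHGTEEBSC")      # becomes "CHHHCEEECC"
predicted_ss3 = "CHHHHEEECC"
toy_q3 = q3_accuracy(predicted_ss3, actual_ss3)

actual_asa = [10.0, 55.0, 120.0, 80.0, 5.0]     # hypothetical values in Å^2
predicted_asa = [20.0, 40.0, 100.0, 90.0, 15.0]
toy_r = pearson(predicted_asa, actual_asa)
```

In this toy case nine of ten residues match, so `toy_q3` is 90.0. The reported 72.5% and 0.67 are computed over the authors' test sets, which this sketch does not reproduce.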
Journal Title
Journal of Computational Chemistry
Volume
39
Issue
26
Subject
Physical chemistry
Theoretical and computational chemistry
Theoretical and computational chemistry not elsewhere classified
Nanotechnology
Backbone angles
Protein structure prediction
Contact prediction
Solvent accessibility prediction
Secondary structure prediction