Direct prediction of the profile of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles
Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or a profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because it lacks tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7 percentage points (from 23.6% to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low-complexity regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at the protein surface. The accuracy of the sequence profiles obtained is comparable to that of profiles generated by the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding the experimental design of sequence libraries for screening and directed evolution of designed sequences.
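As an illustration of the sequence-identity measure quoted above, the following minimal Python sketch compares a predicted sequence with a wild-type sequence. It assumes, purely for illustration, that the predicted sequence is taken as the highest-scoring residue at each position of a 20-column profile; the abstract does not state how SPIN derives the compared sequence. The names `consensus_from_profile`, `sequence_identity`, `one_hot`, and the toy data are hypothetical and not part of SPIN.

```python
# Hypothetical sketch: sequence identity between a profile-derived sequence
# and a wild-type sequence. Not the SPIN implementation.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def consensus_from_profile(profile):
    """Pick the highest-scoring residue at each position (an assumption).

    `profile` is a list of rows, one per position, each holding 20 scores
    ordered as in AMINO_ACIDS.
    """
    sequence = []
    for row in profile:
        if len(row) != len(AMINO_ACIDS):
            raise ValueError("each profile row must have 20 scores")
        best_index = max(range(len(row)), key=row.__getitem__)
        sequence.append(AMINO_ACIDS[best_index])
    return "".join(sequence)


def sequence_identity(predicted, wild_type):
    """Return the percentage of positions where the two sequences agree."""
    if len(predicted) != len(wild_type) or not wild_type:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == w for p, w in zip(predicted, wild_type))
    return 100.0 * matches / len(wild_type)


def one_hot(residue):
    """Build a toy profile row that puts all weight on one residue."""
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


# Toy example: a 4-residue profile whose top residues spell "ACDF".
toy_profile = [one_hot(r) for r in "ACDF"]
predicted = consensus_from_profile(toy_profile)
identity = sequence_identity(predicted, "ACDE")  # 3 of 4 positions match
```

On this reading, the reported gain is the difference between two such percentages averaged over a test set: 30.3% for SPIN against 23.6% for the fragment-derived profile.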
© 2014 Wiley Periodicals, Inc. This is the author-manuscript version of the following article: Direct prediction of the profile of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles, Proteins: Structure, Function, and Bioinformatics, Vol. 82(10), 2014, pp. 2565-2573, which has been published in final form at dx.doi.org/10.1002/prot.24620.
Pattern Recognition and Data Mining
Structural Biology (incl. Macromolecular Modelling)