De Novo protein design by an energy function based on series expansion in distance and orientation dependence

File version
Accepted Manuscript (AM)
Author(s)
Liang, Shide
Li, Zhixiu
Zhan, Jian
Zhou, Yaoqi
Year published
2021
Abstract
MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys, and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing.
RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38 to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design.
AVAILABILITY: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
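The abstract states that the energy function was tested with Monte Carlo simulated annealing. As a rough illustration of that general technique only, the sketch below anneals a sequence under a toy energy. Everything in it is an assumption made for illustration: the names `toy_energy` and `anneal`, the geometric cooling schedule, the step count and temperatures, and the single-residue mutation move. It does not reproduce the OSCAR-Design energy function or the authors' sampling protocol.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types


def toy_energy(sequence, target):
    """Hypothetical stand-in energy: count of positions that differ from a target.

    A real design energy would score residue interactions in a 3D structure;
    this placeholder only makes the sampling loop runnable.
    """
    return sum(1 for a, b in zip(sequence, target) if a != b)


def anneal(energy, length, steps=20000, t_start=2.0, t_end=0.01, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (assumed settings)."""
    rng = random.Random(seed)
    current = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current_e = energy(current)
    best, best_e = current[:], current_e
    for step in range(steps):
        # Temperature decays geometrically from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose a single-residue mutation at a random position.
        pos = rng.randrange(length)
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)
        new_e = energy(current)
        delta = new_e - current_e
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current_e = new_e
            if current_e < best_e:
                best, best_e = current[:], current_e
        else:
            current[pos] = old  # reject the move and restore the residue
    return "".join(best), best_e
```

A possible call, using an arbitrary ten-residue target string chosen only for this example, is `anneal(lambda s: toy_energy(s, "MKTAYIAKQR"), 10)`, which returns the lowest-energy sequence visited together with its energy. The high starting temperature lets the search accept unfavourable mutations early on, and the cooling gradually restricts it to moves that lower the energy.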
Journal Title
Bioinformatics
Copyright Statement
© 2021 Oxford University Press. This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version De Novo protein design by an energy function based on series expansion in distance and orientation dependence, Bioinformatics, 2021 is available online at: https://doi.org/10.1093/bioinformatics/btab598.
Note
This publication has been entered in Griffith Research Online as an advanced online version.
Subject
Mathematical sciences
Biological sciences
Information and computing sciences