Sequence-Based Prediction of Protein-Carbohydrate Binding Sites Using Support Vector Machines

No Thumbnail Available
File version
Author(s)
Taherzadeh, Ghazaleh
Zhou, Yaoqi
Liew, Alan Wee-Chung
Yang, Yuedong
Griffith University Author(s)
Primary Supervisor
Other Supervisors
Editor(s)
Date
2016
Size
File type(s)
Location
License
Abstract

Carbohydrate-binding proteins play significant roles in many diseases including cancer. Here, we established a machine-learning-based method (called sequence-based prediction of residue-level interaction sites of carbohydrates, SPRINT-CBH) to predict carbohydrate-binding sites in proteins using support vector machines (SVMs). We found that integrating evolution-derived sequence profiles with additional information on sequence and predicted solvent accessible surface area leads to a reasonably accurate, robust, and predictive method, with area under receiver operating characteristic curve (AUC) of 0.78 and 0.77 and Matthew’s correlation coefficient of 0.34 and 0.29, respectively for 10-fold cross validation and independent test without balancing binding and nonbinding residues. The quality of the method is further demonstrated by having statistically significantly more binding residues predicted for carbohydrate-binding proteins than presumptive nonbinding proteins in the human proteome, and by the bias of rare alleles toward predicted carbohydrate-binding sites for nonsynonymous mutations from the 1000 genome project. SPRINT-CBH is available as an online server at http://sparks-lab.org/server/SPRINT-CBH.

Journal Title

Journal of Chemical Information and Modeling

Conference Title
Book Title
Edition
Volume

56

Issue

10

Thesis Type
Degree Program
School
Publisher link
Patent number
Funder(s)
Grant identifier(s)
Rights Statement
Rights Statement
Item Access Status
Note
Access the data
Related item(s)
Subject

Medicinal and biomolecular chemistry

Theoretical and computational chemistry

Theoretical and computational chemistry not elsewhere classified

Persistent link to this record
Citation
Collections