Ensembling methods for protein-ligand binding affinity prediction
File version
Version of Record (VoR)
Author(s)
Newton, MAH
Rahman, J
Mohamed Abdul Cader, AJ
Sattar, A
Abstract
Protein-ligand binding affinity prediction is a key element of computer-aided drug discovery. Most existing deep learning methods for this task use single models and suffer from low accuracy and poor generalization. In this paper, we train 13 deep learning models from combinations of 5 input features, then explore all possible ensembles of the trained models to find the best ones. Our deep learning models use cross-attention and self-attention layers to extract short- and long-range interactions. We name our method Ensemble Binding Affinity (EBA). EBA combines models trained on different combinations of input features, using simple 1D sequential and structural features of the protein-ligand complexes rather than 3D complex features, to accurately predict the binding affinity of a protein-ligand complex. One of our ensembles achieves the highest Pearson correlation coefficient (R) of 0.914 and the lowest root mean square error (RMSE) of 0.957 on the well-known benchmark test set CASF2016. On both CSAR-HiQ benchmark test sets, our ensembles improve on the second-best predictor, CAPLA, by more than 15% in R and 19% in RMSE. Furthermore, the ensembles outperform existing state-of-the-art protein-ligand binding affinity prediction methods across all metrics on all five benchmark test datasets, which demonstrates the effectiveness and robustness of our approach. Our approach to improving binding affinity prediction between proteins and ligands can therefore help raise the success rate of potential drugs and accelerate the drug development process.
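The exhaustive ensemble search described in the abstract can be illustrated with a minimal sketch. The abstract does not state how member predictions are combined or how ensembles are ranked, so the code below assumes unweighted averaging and ranks by RMSE, with Pearson R as a tie-breaker. The function names, the model names and the toy affinity values are all hypothetical and are not taken from the paper. For 13 models there are 2^13 - 1 = 8191 non-empty subsets, so a brute-force search of this kind stays cheap.

```python
from itertools import combinations

import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (R) between labels and predictions."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def rmse(y_true, y_pred):
    """Root mean square error between labels and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def best_ensemble(predictions, y_true, min_size=2):
    """Search every subset of models with at least `min_size` members.

    `predictions` maps a model name to its predicted affinities.
    Assumption: an ensemble prediction is the unweighted mean of its members.
    Returns (member names, RMSE, R) for the subset with the lowest RMSE.
    """
    names = sorted(predictions)
    best = None
    for k in range(min_size, len(names) + 1):
        for subset in combinations(names, k):
            avg = np.mean([predictions[n] for n in subset], axis=0)
            # Lower RMSE wins; higher R breaks ties.
            score = (rmse(y_true, avg), -pearson_r(y_true, avg))
            if best is None or score < best[0]:
                best = (score, subset)
    (err, neg_r), subset = best
    return subset, err, -neg_r


# Hypothetical toy data: four complexes, three models.
y_val = np.array([4.0, 5.0, 6.0, 7.0])
preds = {
    "model_a": y_val + 0.5,                    # consistently too high
    "model_b": y_val - 0.5,                    # consistently too low
    "model_c": np.array([5.0, 5.0, 7.0, 7.0]),
}
members, err, r = best_ensemble(preds, y_val)
print(members, err, r)
```

In this toy case the errors of `model_a` and `model_b` cancel, so their average recovers the labels exactly. In practice the subset would normally be selected on held-out validation data rather than on the benchmark test sets, so that the reported test metrics are not biased by the selection step.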
Journal Title
Scientific Reports
Volume
14
Funder(s)
ARC
Grant identifier(s)
DP180102727
Rights Statement
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Subject
Deep learning
Pharmacology and pharmaceutical sciences
Proteomics and metabolomics
Bioinformatics and computational biology
Citation
Mohamed Abdul Cader, J; Newton, MAH; Rahman, J; Mohamed Abdul Cader, AJ; Sattar, A, Ensembling methods for protein-ligand binding affinity prediction, Scientific Reports, 2024, 14, pp. 24447