Molecular weight information from protein-ligand complexes to probe natural product interactions
Primary Supervisor
Quinn, Ronald J
Zhou, Yaoqi
Other Supervisors
Almo, Steve
Loa-Kum-Cheung, Wendy
Abstract
This project presented a strategy to investigate natural product binders of protein targets based on their molecular weights. The molecular weights of the binders were determined from non-covalent protein-natural product complexes detected by native mass spectrometry screening of natural product libraries, including extracts, fractions and pure compounds. Human (Homo sapiens) calcium binding protein S100A4 (apo and calcium bound forms), mouse (Mus musculus) T cell/transmembrane, Ig, and mucin (TIM) protein 3 and human T-cell immunoreceptor with Ig and ITIM domains precursor protein (TIGIT) were investigated in this project. For functional association analysis of the proteins, protein-protein interaction (PPI) networks were constructed. The PPI network of S100A4 demonstrated that it is an important target in different types of cancer, including breast cancer, colorectal cancer, bladder cancer, esophageal cancer, non-small cell lung cancer, gastric cancer, medulloblastoma, pancreatic cancer, prostate cancer, thyroid cancer and colon cancer. It is also a potential target in osteoarthritis, rheumatoid arthritis, pannus formation and joint destruction. TIM3 is functionally associated with regulatory immune processes, such as the regulation of autoimmunity and anti-tumour immunity. TIGIT is an important target in autoimmune diseases caused by viral, bacterial and protozoal infections, and in macrophage-mediated inflammatory diseases. It is also a good target for the development of immuno-oncology combination therapies. The SiteMap program was used for structure-based identification of druggable binding sites in S100A4 (apo state), S100A4-Ca2+ (calcium bound state), TIM3 and TIGIT. The SiteMap scoring function, Dscore, provides a quantitative estimate of the ‘druggability’ of a binding site. Based on Dscore, 6 druggable binding sites were identified in apo S100A4, 6 sites in calcium bound S100A4 (S100A4-Ca2+), 2 sites in TIM3 and 5 sites in TIGIT.
The druggability of a binding site was estimated as the sum of contributions from the pocket enclosure, the pocket size, and the balance between the hydrophobic and hydrophilic character of the binding site. The results showed that the size and enclosure of a pocket have a direct, proportional correlation with druggability. However, the influence of pocket enclosure on druggability was less significant than that of pocket size. The druggability of a binding site was also found to be highly correlated with the hydrophobicity of the pocket. An electrospray ionization Fourier transform mass spectrometer (ESI-FTMS) was used for direct screening of natural products. The molecular weights of hits were determined from protein-natural product complexes. Based on molecular weight, natural product binders were classified into three chemical subspaces: lead-like compounds with molecular weight <300 Da (RO3), drug-like compounds with molecular weight <500 Da (RO5), and compounds beyond the ‘rule of 5’ (bRO5) with molecular weight >500 Da. In this project, natural product extracts and fractions were screened against four proteins: S100A4, S100A4-Ca2+, TIM3 and TIGIT. With molecular weight as the identifier, natural products were categorised as unique hits (binding to one protein) or common hits (binding to more than one protein). A hit detected in multiple extracts or fractions could be the same compound or different compounds with the same mass. For classification, hits with identical molecular weight detected in different biota were considered to be the same compound. In native MS screening of extracts, 75 unique hits were detected in 86 extracts, obtained from 42 genera. Some unique hits were detected in multiple extracts. Twelve unique hits were lead-like (MW<300 Da), 36 hits were drug-like (MW<500 Da) and 41 hits were beyond the rule of five (MW>500 Da). Eight unique hits were detected binding to S100A4, 38 hits to S100A4-Ca2+, 22 hits to TIM3 and 21 hits to TIGIT.
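The molecular-weight workflow described above can be illustrated with a minimal Python sketch. It is not the thesis's own software. It assumes that the ligand mass is read from the m/z shift between the free protein and the complex at the same charge state, that the three subspaces are treated as non-overlapping bins (with exactly 500 Da placed in bRO5, a boundary the abstract does not define), and that hits are matched by nominal (integer-rounded) mass. All function names and numeric values are hypothetical.

```python
from collections import defaultdict


def ligand_mass(mz_complex, mz_protein, charge):
    """Neutral ligand mass (Da) from the m/z shift between the complex
    and the free protein, both observed at the same charge state."""
    return (mz_complex - mz_protein) * charge


def mw_class(mw):
    """Assign a molecular weight (Da) to one of three non-overlapping bins.
    Assumption: exactly 500 Da falls in bRO5."""
    if mw < 300:
        return "lead-like (RO3)"
    if mw < 500:
        return "drug-like (RO5)"
    return "bRO5"


def split_hits(observations):
    """Split (protein, ligand_mass) observations into unique hits
    (one protein) and common hits (more than one protein).
    Assumption: hits are matched by nominal, integer-rounded mass."""
    proteins_by_mass = defaultdict(set)
    for protein, mass in observations:
        proteins_by_mass[round(mass)].add(protein)
    unique = {m: sorted(p) for m, p in proteins_by_mass.items() if len(p) == 1}
    common = {m: sorted(p) for m, p in proteins_by_mass.items() if len(p) > 1}
    return unique, common


if __name__ == "__main__":
    # Hypothetical peaks: free protein and complex, both at charge 6+.
    mass = ligand_mass(2044.0, 1950.0, 6)
    print(mass, mw_class(mass))
    print(split_hits([("S100A4", 564.2), ("TIGIT", 564.1), ("TIM3", 204.1)]))
```

Matching by nominal mass mirrors the caveat in the abstract: two different compounds with the same mass would be merged into one hit, which is why the active extracts were subsequently checked by retention time and molecular formula.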
Eighteen common hits were detected in 73 extracts, obtained from 37 genera. Among them, 1 common hit was lead-like, 6 hits were drug-like and 12 hits were beyond the rule of five. Ten common hits showed binding to S100A4, 11 hits to S100A4-Ca2+, 11 hits to TIM3 and 12 hits to TIGIT. In fraction screening, 46 hits were detected in the extracts, from 15 genera. The highest number of hits was detected in fraction 3 (16 hits) and the lowest in fraction 5 (5 hits). Six hits were detected in fraction 1, 11 hits in fraction 2 and 9 hits in fraction 4. Twenty-three unique hits were detected in 24 extracts, obtained from 11 genera. Among them, 1 unique hit was lead-like, 13 hits were drug-like and 9 hits were beyond the rule of five. Four unique hits were detected binding to S100A4, 4 hits to S100A4-Ca2+, 7 hits to TIM3 and 8 hits to TIGIT. Fourteen common hits were detected in 24 extracts, obtained from 11 genera. Among them, 4 common hits were drug-like and 10 hits were beyond the rule of five. The active extracts were analysed by liquid chromatography high resolution mass spectrometry. Common hits from different biota showed similar or identical retention times on a C18 column. Molecular formula analysis showed that a given common hit detected in different biota had the same molecular formula. Following mass-guided isolation and NMR-guided structure elucidation, the common hits NP_564, NP_358, NP_594, NP_376, NP_434, NP_592 and NP_610 were identified as apigenin 6-C-β-D-glucoside 8-C-α-L-arabinoside, sweroside, 4',5-dihydroxy-7-methoxyflavanone-6-C-rutinoside, loganic acid, 6-C-glucosylnaringenin, biochanin A 7-O-rutinoside and quercetin 3-O-rutinoside, respectively. Schrödinger's extra precision docking program, Glide-XP, was used for molecular docking of the protein-natural product complexes. The binding modes of the common hits in the druggable binding sites of the proteins varied with the structural features of the compounds and the binding pockets.
The binding affinities of four flavonoid glycosides (structural analogues) to the proteins were estimated from the standard state free energy of complex formation. All of the compounds showed the highest affinity to TIGIT and the lowest affinity to S100A4-Ca2+. S100A4-Ca2+ and TIM3 showed similar affinity for the compounds. For rapid detection of selective natural product binders, a combined screening strategy was used, consisting of a structure-based homology search by Spot-ligand 2 and native mass spectrometry by ESI-FTMS. NP_204 (2-(5-methoxy-1H-indole-3-yl)ethylamine) showed selective binding to S100A4, NP_217 (4-hydroxy-1-(1H-indol-3-yl)-3-methylbutan-1-one) to S100A4-Ca2+ and NP_162 (3-(1-methylpyrrolidin-2-yl)pyridine) to TIGIT.
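The link between a standard state free energy of complex formation and binding affinity, mentioned above for the flavonoid glycosides, can be sketched with the textbook relation ΔG° = RT ln(Kd / c°), with c° = 1 M. The abstract does not state how the free energies were obtained, so this Python sketch only illustrates the general conversion. The function names, the default temperature of 298.15 K and the example Kd values are assumptions.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_from_kd(kd_molar, temp_k=298.15):
    """Standard free energy of complex formation (kJ/mol) from a
    dissociation constant in mol/L, relative to a 1 M standard state."""
    return R_KJ * temp_k * math.log(kd_molar)


def kd_from_delta_g(delta_g_kj, temp_k=298.15):
    """Dissociation constant (mol/L) from a standard free energy of
    complex formation in kJ/mol."""
    return math.exp(delta_g_kj / (R_KJ * temp_k))


if __name__ == "__main__":
    # Hypothetical dissociation constants, for illustration only.
    for kd in (1e-4, 1e-6, 1e-9):
        print(f"Kd = {kd:.0e} M  ->  dG = {delta_g_from_kd(kd):.2f} kJ/mol")
```

A more negative ΔG° corresponds to a smaller Kd and therefore to tighter binding, which is the sense in which affinities can be ranked across proteins.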
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
School of Natural Sciences
Rights Statement
The author owns the copyright in this thesis, unless stated otherwise.
Subject
Molecular weight
Protein-ligand complexes
Natural product interactions