Protein structure prediction from inaccurate and sparse NMR data using an enhanced genetic algorithm

Author(s)
Islam, Md Lisul
Shatabda, Swakkhar
Rashid, Mahmood A
Khan, MGM
Rahman, M Sohel
Date
2019
Abstract

Nuclear Magnetic Resonance Spectroscopy (commonly known as NMR Spectroscopy) is used to generate approximate and partial distances between pairs of atoms of the native structure of a protein. By solving the Euclidean distance geometry problem on the partial distances obtained from NMR Spectroscopy, we can predict the three-dimensional (3D) structure of a protein. In this paper, a new genetic algorithm is proposed to efficiently address the Euclidean distance geometry problem and build the 3D structure of a given protein from sparse NMR data. Our genetic algorithm uses (i) a greedy mutation and crossover operator to intensify the search; (ii) a twin removal technique to diversify the population; (iii) a random restart method to recover from stagnation; and (iv) a compaction factor to reduce the search space. By drastically reducing the search space, our approach improves the quality of the search. We tested our algorithm on a set of standard benchmarks. Experimentally, we show that our enhanced genetic algorithm significantly outperforms traditional genetic algorithms and a previously proposed state-of-the-art method. Our method is capable of producing structures that are very close to the native structures, and hence experimental biologists could adopt it to determine more accurate protein structures from NMR data.
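
As a rough illustration of the ideas named in the abstract, the sketch below scores a candidate structure against sparse distance constraints and shows simplified versions of twin removal and greedy mutation. It is not the authors' implementation. The bound-violation error function, the representation of each sparse distance as a (lower, upper) interval, and all names and parameters (`distance_error`, `remove_twins`, `greedy_mutation`, `step`, `tries`, `tol`) are assumptions made for this example.

```python
import math
import random

def distance_error(coords, bounds):
    """Sum of bound violations over the known (sparse) atom pairs.

    coords: list of (x, y, z) tuples, one per atom.
    bounds: dict {(i, j): (lo, hi)}, an assumed encoding of inexact distances.
    """
    total = 0.0
    for (i, j), (lo, hi) in bounds.items():
        d = math.dist(coords[i], coords[j])
        if d < lo:
            total += lo - d
        elif d > hi:
            total += d - hi
    return total

def remove_twins(population, tol=1e-9):
    """Keep one copy of each individual; drop coordinate-wise duplicates."""
    unique = []
    for ind in population:
        is_twin = any(
            all(math.dist(a, b) <= tol for a, b in zip(ind, other))
            for other in unique
        )
        if not is_twin:
            unique.append(ind)
    return unique

def greedy_mutation(coords, bounds, rng, step=0.5, tries=10):
    """Perturb one random atom several times; keep only an improving move."""
    best, best_err = coords, distance_error(coords, bounds)
    k = rng.randrange(len(coords))
    for _ in range(tries):
        cand = list(coords)
        cand[k] = tuple(c + rng.uniform(-step, step) for c in coords[k])
        err = distance_error(cand, bounds)
        if err < best_err:
            best, best_err = cand, err
    return best

# Toy usage: three atoms, two known distance intervals (hypothetical values).
coords = [(0, 0, 0), (1, 0, 0), (0, 2, 0)]
bounds = {(0, 1): (0.9, 1.1), (0, 2): (2.5, 3.0)}
rng = random.Random(0)
mutated = greedy_mutation(coords, bounds, rng)
```

Because the mutation only accepts a perturbation that lowers the error, the result is never worse than its input. That is the sense in which such an operator intensifies the search, while twin removal keeps identical individuals from crowding the population.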

Journal Title

Computational Biology and Chemistry

Volume

79

Rights Statement

© 2019 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.

Subject

Chemical sciences

Biological sciences

Protein structure prediction

Sparse data

Molecular distance geometry

Nuclear magnetic resonance spectroscopy

Genetic algorithms
