Efficient and Diverse De Novo Protein Backbone Design with SE(3)-Equivariant Diffusion
Author(s)
Zhou, R
Yang, M
Li, Y
Zheng, X
Liew, AWC
Pan, S
Guo, Y
Location
Sydney, Australia
Abstract
AI-assisted de novo protein design aims to efficiently explore the vast protein structure space and generate rationally designable proteins. Despite the impressive performance of generative models such as denoising diffusion models, customizing them for de novo protein design remains challenging. The challenges include (a) ensuring simple and effective modeling of structures; (b) enabling reasonable expression of protein sequences; and (c) capturing complex relationships among sequences. To address these challenges, we propose a novel de novo Protein backbone design model with SE(3)-Equivariant Diffusion, dubbed ProSEED, to enable efficient and diverse protein structure design. Specifically, ProSEED contains three submodules: (1) a residue information interaction network; (2) an SE(3)-equivariant neural network; and (3) an improved denoising diffusion probabilistic model, which together precisely learn and capture the distribution of various protein backbone structures. Starting from Gaussian noise, ProSEED generates new protein backbones through reverse diffusion on residue backbones. Extensive experiments show that ProSEED achieves high protein structure designability, diversity, and novelty while requiring less data. These results highlight ProSEED’s exceptional ability to comprehend protein structures and to enable efficient information transfer and interaction between its modules, paving the way for successful and innovative protein design with a high success rate.
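The sampling procedure mentioned in the abstract (start from Gaussian noise, then apply reverse diffusion) can be illustrated with a minimal sketch of standard DDPM ancestral sampling. This is not the ProSEED implementation. It assumes one 3D point per residue, a linear noise schedule, and zero-centroid noise, none of which are stated in this record. The function `toy_noise_predictor` is a hypothetical stand-in for the trained SE(3)-equivariant network, and all function and parameter names are illustrative.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def center(coords):
    """Remove the centroid so the point cloud has zero mean."""
    n = len(coords)
    mean = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - mean[k] for k in range(3)] for p in coords]


def gaussian_cloud(rng, num_points):
    """Draw a zero-centroid cloud of isotropic Gaussian 3D points."""
    return center([[rng.gauss(0.0, 1.0) for _ in range(3)]
                   for _ in range(num_points)])


def toy_noise_predictor(coords, one_minus_alpha_bar):
    """Hypothetical stand-in for a trained SE(3)-equivariant denoiser.

    It is the exact noise predictor for a dataset whose only structure is
    the origin, so it is rotation-equivariant but models no real protein.
    """
    scale = 1.0 / math.sqrt(one_minus_alpha_bar)
    return [[scale * v for v in p] for p in coords]


def sample_backbone(num_residues, num_steps=100, seed=0):
    """Run DDPM ancestral sampling from Gaussian noise to 3D coordinates."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta  # cumulative product of (1 - beta_t)
        alpha_bars.append(running)

    x = gaussian_cloud(rng, num_residues)  # x_T: pure noise
    for t in reversed(range(num_steps)):
        beta, one_minus_ab = betas[t], 1.0 - alpha_bars[t]
        eps_hat = toy_noise_predictor(x, one_minus_ab)
        coef = beta / math.sqrt(one_minus_ab)
        inv_sqrt_alpha = 1.0 / math.sqrt(1.0 - beta)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        x = [[inv_sqrt_alpha * (p[k] - coef * e[k]) for k in range(3)]
             for p, e in zip(x, eps_hat)]
        if t > 0:  # no noise is added on the final step
            sigma = math.sqrt(beta)
            noise = gaussian_cloud(rng, num_residues)
            x = [[p[k] + sigma * z[k] for k in range(3)]
                 for p, z in zip(x, noise)]
    return center(x)
```

Calling `sample_backbone(num_residues=64)` returns a list of 64 `[x, y, z]` coordinates. With the toy predictor the result collapses numerically to the origin; a trained equivariant network in its place is what would make the loop produce protein-like backbones.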
Conference Title
Advances in Knowledge Discovery and Data Mining 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, NSW, Australia, June 10–13, 2025, Proceedings, Part III
Volume
15872
Subject
Proteomics and metabolomics
Medical biochemistry - proteins and peptides (incl. medical proteomics)
Information and computing sciences
Citation
Zhou, R; Yang, M; Li, Y; Zheng, X; Liew, AWC; Pan, S; Guo, Y, Efficient and Diverse De Novo Protein Backbone Design with SE(3)-Equivariant Diffusion, Advances in Knowledge Discovery and Data Mining 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, NSW, Australia, June 10–13, 2025, Proceedings, Part III, 2025, 15872, pp. 408-420