Efficient and Diverse De Novo Protein Backbone Design with SE(3)-Equivariant Diffusion

Author(s)
Zhou, R
Yang, M
Li, Y
Zheng, X
Liew, AWC
Pan, S
Guo, Y
Date
2025
Location

Sydney, Australia

Abstract

AI-assisted de novo protein design aims to efficiently explore the vast protein structure space and generate rationally designable proteins. Despite the impressive performance of generative models such as denoising diffusion models, tailoring them to de novo protein design remains challenging. The challenges include (a) ensuring simple and effective modeling of structures; (b) enabling reasonable expression of protein sequences; and (c) capturing complex relationships among sequences. To address these challenges, we propose a novel de novo Protein backbone design model with SE(3)-Equivariant Diffusion, dubbed ProSEED, to enable efficient and diverse protein structure design. Specifically, ProSEED contains three submodules that together learn and capture the distribution of various protein backbone structures: (1) a residue information interaction network; (2) an SE(3)-equivariant neural network; and (3) an improved denoising diffusion probabilistic model. Starting from samples of Gaussian noise, ProSEED can generate new protein backbones through reverse diffusion on residue backbones. Extensive experiments show that ProSEED achieves high protein structure designability, diversity, and novelty while requiring less data. These results highlight ProSEED’s exceptional ability to comprehend protein structures and to enable efficient information transfer and interaction between its modules, paving the way for innovative protein design with a high success rate.
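
For readers unfamiliar with reverse diffusion, the following is a minimal sketch of generic denoising diffusion ancestral sampling, not ProSEED's actual implementation. It assumes one 3D coordinate per residue, a linear noise schedule, and a hypothetical `toy_noise_predictor` that stands in for a trained SE(3)-equivariant network; none of these names or settings come from the paper.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced noise variances (an assumed schedule, not the paper's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def toy_noise_predictor(coords, t):
    """Hypothetical stand-in for a trained denoiser: always predicts zero noise."""
    return [[0.0, 0.0, 0.0] for _ in coords]


def reverse_diffusion(noise_predictor, num_residues, num_steps=100, seed=0):
    """Start from Gaussian noise and apply the standard DDPM reverse update."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = [1.0 - b for b in betas]

    # Cumulative products alpha_bar_t = alpha_0 * ... * alpha_t.
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # x_T: one random 3D point per residue.
    coords = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(num_residues)]

    for t in range(num_steps - 1, -1, -1):
        eps_hat = noise_predictor(coords, t)
        scale = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        # No fresh noise is added on the final step.
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        coords = [
            [(x - scale * e) / math.sqrt(alphas[t]) + sigma * rng.gauss(0.0, 1.0)
             for x, e in zip(point, eps)]
            for point, eps in zip(coords, eps_hat)
        ]
    return coords


backbone = reverse_diffusion(toy_noise_predictor, num_residues=8)
print(len(backbone), "residues, first coordinate:", backbone[0])
```

With the placeholder predictor the output is only structured noise; a realistic backbone would require a trained network and, most likely, a richer per-residue representation than a single point.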

Conference Title

Advances in Knowledge Discovery and Data Mining 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, NSW, Australia, June 10–13, 2025, Proceedings, Part III

Volume

15872

Subject

Proteomics and metabolomics

Medical biochemistry - proteins and peptides (incl. medical proteomics)

Information and computing sciences

Citation

Zhou, R; Yang, M; Li, Y; Zheng, X; Liew, AWC; Pan, S; Guo, Y, Efficient and Diverse De Novo Protein Backbone Design with SE(3)-Equivariant Diffusion, Advances in Knowledge Discovery and Data Mining 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, NSW, Australia, June 10–13, 2025, Proceedings, Part III, 2025, 15872, pp. 408-420