ProT-VAE: Protein Transformer Variational AutoEncoder for functional protein design

  • Emre Sevgen
  • Joshua Moller
  • Adrian Lange
  • John Parker
  • Sean Quigley
  • Jeff Mayer
  • Poonam Srivastava
  • Sitaram Gayatri
  • David Hosfield
  • Clayton Dilks
  • Claire Buchanan
  • Thomas Speltz
  • Maria Korshunova
  • Micha Livne
  • Michelle Gill
  • Rama Ranganathan
  • Anthony B. Costa
  • Andrew L. Ferguson

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Deep generative models have demonstrated success in learning the protein sequence-to-function relationship and designing synthetic sequences with engineered functionality. We introduce the Protein Transformer Variational AutoEncoder (ProT-VAE) as an accurate, generative, fast, and transferable model for data-driven protein design that blends the merits of variational autoencoders to learn interpretable, low-dimensional latent embeddings for conditional sequence design with the expressive, alignment-free featurization offered by transformer-based protein language models. We implement the model using NVIDIA’s BioNeMo framework and validate its performance in retrospective functional prediction and prospective functional design. The model identifies a phenylalanine hydroxylase enzyme with 2.5× catalytic activity over wild-type, and a γ-carbonic anhydrase enzyme with a melting temperature elevation of ΔTm = +61 °C relative to the most thermostable sequence reported to date and activity in 23% v/v methyl diethanolamine at pH 11.25 and 93 °C, corresponding to industrially relevant conditions for enzymatic carbon capture technologies. The ProT-VAE model presents a powerful and experimentally validated platform for machine learning-guided directed evolution campaigns to discover synthetic proteins with engineered function.
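
As a reading aid only: the abstract does not describe the architecture in detail, so the following is a minimal, generic sketch of the variational bottleneck that any variational autoencoder uses to obtain a low-dimensional latent embedding. It is not the ProT-VAE implementation and does not use the BioNeMo API. The function names (`reparameterize`, `kl_to_standard_normal`), the three-dimensional latent space, and the assumption that a transformer encoder has already produced a mean and log-variance per latent dimension are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical illustration of a generic VAE bottleneck; not taken from the paper.
# We assume some upstream encoder (for example a transformer) has already mapped
# a protein sequence to a mean vector `mu` and a log-variance vector `logvar`.

def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I), summed over dims."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, logvar))

rng = random.Random(0)           # seeded so the sketch is reproducible
mu = [0.0, 0.0, 0.0]             # assumed 3-dimensional latent space
logvar = [0.0, 0.0, 0.0]         # log-variance 0 means unit variance

z = reparameterize(mu, logvar, rng)                  # one latent sample
kl_zero = kl_to_standard_normal(mu, logvar)          # posterior equals prior
kl_shifted = kl_to_standard_normal([1.0, 0.0, 0.0], logvar)  # mean shifted by 1
```

In this sketch the KL term is zero when the approximate posterior matches the standard normal prior and grows as the mean moves away from the origin, which is the regularization pressure that keeps a latent space compact enough to sample from for sequence design.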

Original language: English
Article number: e2408737122
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 122
Issue number: 41
State: Published - Oct 2025
Externally published: Yes

Keywords

  • generative modeling
  • protein design
  • protein language models
  • transformers
  • variational autoencoders
