Funded Project (2022-2024)
Designing Proteins via Machine-Learned Physical Simulation

Proteins, such as enzymes, have evolved to perform precise cellular tasks efficiently. Their efficiency and specificity make them promising for industrial applications. Unfortunately, naturally occurring proteins are sensitive to their chemical and thermodynamic environment, and the conditions typical of industrial applications often make their efficacy plummet.

These shortcomings of naturally occurring proteins do not rule out non-natural, as-yet-undiscovered proteins that function efficiently in extreme environments; however, the design space of possible proteins is massive (>20^50), making it difficult to locate promising candidates for a particular application. Protein design is the task of finding these novel, optimal proteins.
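To give a sense of scale for the figure quoted above, here is a minimal sketch of the arithmetic. It assumes the bound refers to chains built from the 20 standard amino acids, with each position chosen independently, and takes a length of 50 residues as the illustrative case; the helper name `sequence_space_size` is hypothetical and not part of the project.

```python
# Assumption: 20 standard amino acids, each position chosen independently.
NUM_AMINO_ACIDS = 20


def sequence_space_size(length: int) -> int:
    """Count the distinct sequences of the given length (hypothetical helper)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return NUM_AMINO_ACIDS ** length


# Illustrative case: a 50-residue chain.
size_50 = sequence_space_size(50)
print(f"{size_50:.3e}")  # 1.126e+65
```

Even at this modest length the count is far beyond what exhaustive evaluation could cover, and it grows by a factor of 20 with every additional residue.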
 

The abundance of data connecting existing proteins to their properties has motivated a surge in machine-learned (ML) strategies for protein design: algorithms are trained on known proteins and used to predict the behavior of as-yet-unseen counterparts. However, specialized domains have struggled to use these approaches because reference data for training is limited. Traditional physics-based computation (e.g., atomistic molecular dynamics) appears to be a promising alternative because it requires less calibration data, but it is too computationally expensive to explore many candidate proteins.

A diagram showing the atomistic representation of a miniprotein typical of physics-based simulation on the left, and a coarse-grained representation used by physics-aware machine-learned models on the right.