Large Language Models for proteins, namely protein Language Models (pLMs), have begun to provide an important alternative for capturing the information encoded in a protein sequence in computers. Arguably, pLMs have made substantial progress toward understanding aspects of the language of life as written in proteins, and through this understanding they are becoming an increasingly powerful means of advancing protein prediction, e.g.
Protein language models (pLMs) capture some aspects of the grammar of the language of life as written in protein sequences. The so-called pLM embeddings implicitly contain this information. Therefore, embeddings can serve as the exclusive input into downstream supervised methods for protein prediction.
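As an illustration of this embedding-based pipeline, the sketch below extracts per-residue and per-protein embeddings from a publicly available pLM and hands them to a downstream supervised method. The choice of model (the ProtT5 encoder Rostlab/prot_t5_xl_half_uniref50-enc on HuggingFace), the mean-pooling step, and the downstream classifier are illustrative assumptions, not prescribed by the text above.

```python
import re
import torch
from transformers import T5Tokenizer, T5EncoderModel

# Load a public pLM encoder (assumption: ProtT5 via HuggingFace transformers).
model_name = "Rostlab/prot_t5_xl_half_uniref50-enc"
tokenizer = T5Tokenizer.from_pretrained(model_name, do_lower_case=False)
encoder = T5EncoderModel.from_pretrained(model_name).eval()

seq = "MSEQVLKAAG"                      # toy protein sequence
seq = re.sub(r"[UZOB]", "X", seq)       # map rare amino acids to X
spaced = " ".join(seq)                  # ProtT5 expects space-separated residues

inputs = tokenizer(spaced, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state   # (1, L+1, 1024); last token is </s>

per_residue = hidden[0, : len(seq)]      # (L, 1024) per-residue embeddings
per_protein = per_residue.mean(dim=0)    # (1024,) fixed-size per-protein vector

# Downstream supervised method: any simple classifier/regressor trained on
# these vectors, e.g. sklearn.linear_model.LogisticRegression().fit(X, y),
# where X stacks per-protein embeddings and y holds the labels of interest.
```

Per-residue embeddings support residue-level tasks (e.g. binding or variant effects), while the pooled per-protein vector suits protein-level tasks such as subcellular location prediction.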
Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences.
Motivation: Exhaustive experimental annotation of the effect of all known protein variants remains daunting and expensive, stressing the need for scalable effect predictions. We introduce VespaG, a blazingly fast missense amino acid variant effect predictor, leveraging protein language model (pLM) embeddings as input to a minimal deep learning model.
Results: To overcome the sparsity of experimental training data, we created a dataset of 39 million single amino acid variants from the human proteome by applying the multiple sequence alignment-based effect predictor GEMME as a pseudo standard-of-truth.
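To make the "pLM embeddings into a minimal deep learning model" idea concrete, the sketch below maps a per-residue embedding to 20 substitution scores and regresses them against GEMME-derived pseudo-labels. The layer sizes, loss, optimizer, and training loop are assumptions chosen for illustration; they are not the actual VespaG architecture or training setup.

```python
import torch
import torch.nn as nn

# Assumed dimensions: 1024-d per-residue pLM embedding in, one effect score
# per possible substituting amino acid out. Hidden size is a placeholder.
EMB_DIM, HIDDEN, N_AA = 1024, 256, 20

# Minimal feed-forward head on top of frozen pLM embeddings.
model = nn.Sequential(
    nn.Linear(EMB_DIM, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, N_AA),
)
loss_fn = nn.MSELoss()                      # regress against GEMME pseudo-labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(embeddings: torch.Tensor, gemme_scores: torch.Tensor) -> float:
    """One optimization step.

    embeddings:   (batch, EMB_DIM) per-residue pLM embeddings
    gemme_scores: (batch, N_AA) pseudo standard-of-truth effect scores
    """
    optimizer.zero_grad()
    pred = model(embeddings)
    loss = loss_fn(pred, gemme_scores)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the pLM stays frozen and only this small head is trained, inference per variant reduces to one embedding lookup plus a tiny forward pass, which is what makes such predictors fast enough to score proteome-scale variant sets.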