Publications by authors named "Rost B"

Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences.


Motivation: Exhaustive experimental annotation of the effect of all known protein variants remains daunting and expensive, stressing the need for scalable effect predictions. We introduce VespaG, a blazingly fast missense amino acid variant effect predictor, leveraging protein language model (pLM) embeddings as input to a minimal deep learning model.

Results: To overcome the sparsity of experimental training data, we created a dataset of 39 million single amino acid variants from the human proteome applying the multiple sequence alignment-based effect predictor GEMME as a pseudo standard-of-truth.

Article Synopsis
  • The study introduces a new method called SAGES, which combines gene expression data with structural features of proteins to better understand protein evolution and function.
  • Using SAGES and machine learning, researchers analyzed tissue samples from healthy individuals and breast cancer patients, focusing on gene expression and protein profiles.
  • Key findings include the detection of intrinsically disordered regions in breast cancer proteins and potential links between drug responses and cancer signatures, indicating SAGES' broad applicability for studying biological processes.

The biophysical characterization and engineering of optogenetic tools and photobiological systems have been hampered by the lack of efficient methods for spectral illumination of microplates for high-throughput analysis of action spectra. Current methods to determine action spectra only allow the sequential spectral illumination of individual wells. Here we present the open-source RainbowCap-system, which combines LEDs and optical filters in a standard 96-well microplate format for simultaneous and spectrally defined illumination.


Embeddings from protein language models (pLMs) are replacing evolutionary information from multiple sequence alignments (MSAs) as the most successful input for protein prediction. Is this because embeddings capture evolutionary information? We tested various approaches to explicitly incorporate evolutionary information into embeddings on various protein prediction tasks. While older pLMs (SeqVec, ProtBert) significantly improved through MSAs, the more recent pLM ProtT5 did not benefit.


Photosynthesis is one of the most important biological processes on Earth, providing the main source of bioavailable energy, carbon, and oxygen via the use of sunlight. Despite this importance, the minimum light level sustaining photosynthesis and net growth of primary producers in the global ocean is still unknown. Here, we present measurements from the MOSAiC field campaign in the central Arctic Ocean that reveal the resumption of photosynthetic growth and algal biomass buildup under the ice pack at a daily average irradiance of not more than 0.

Article Synopsis
  • Skin inflammation and conditions like moist epitheliolysis and edema are common acute side effects of breast radiotherapy (RT).
  • The study aimed to evaluate the effectiveness of tissue-derived radiomics features compared to total breast volume (TBV) in predicting these side effects.
  • The best predictive model used a LASSO classifier based on TBV, achieving an AUROC of 0.74, similar to the AUROC of 0.75 for TBV alone, with mammary tissue showing greater predictive power than glandular tissue.

Prediction methods inputting embeddings from protein language models have reached or even surpassed state-of-the-art performance on many protein prediction tasks. In natural language processing, fine-tuning large language models has become the de facto standard. In contrast, most protein language model-based protein predictions do not back-propagate to the language model.

Article Synopsis
  • Mutations in transcription factors related to congenital central hypoventilation disorders lead to issues like severe hypoventilation and decreased sensitivity to high carbon dioxide levels in the blood.
  • The study identifies specific groups of medullary neurons, called dB2 neurons, that play key roles in various respiratory functions such as controlling tidal volumes and the body's response to elevated carbon dioxide.
  • The research highlights the importance of these dB2 neurons for proper neonatal breathing and survival, showing that dysfunction in these neurons may result in respiratory problems associated with congenital hypoventilation.

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction.


The identification of protein binding residues helps to understand their biological processes as protein function is often defined through ligand binding, such as to other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we present a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs.


From over to , the recent decade of exponential advances in artificial intelligence (AI) has been altering life. In parallel, advances in computational biology are beginning to decode the language of life: leaped forward in protein structure prediction, and protein language models (pLMs) replaced expertise and evolutionary information from multiple sequence alignments with information learned from reoccurring patterns in databases of billions of proteins without experimental annotations other than the amino acid sequences. None of those tools could have been developed 10 years ago; all will increase the wealth of experimental data and speed up the cycle from idea to proof.

Article Synopsis
  • Information in the brain is transmitted via neurotransmitters released from long-range axons, and understanding this activity is crucial for linking brain function to behavior.
  • Current chemogenetic and optogenetic tools for manipulating these connections have limitations in sensitivity and precision.
  • The study identifies the ciliary opsin from Platynereis dumerilii (PdCO) as a highly effective tool for optogenetics, allowing precise control and reversible loss-of-function experiments in mammalian neurons and enabling detailed mapping of brain circuits in live animals.

Marine heatwaves are increasing in frequency and intensity as climate change progresses, especially in the highly productive Arctic regions. Although their effects on primary producers will largely determine the impacts on ecosystem services, mechanistic understanding of phytoplankton responses to these extreme events is still very limited. We experimentally exposed Arctic phytoplankton assemblages to stable warming, as well as to repeated heatwaves, and measured temporally resolved productivity, physiology, and composition.


We propose and demonstrate a unified hierarchical method to measure n-point correlation functions that can be applied to driven, dissipative, or otherwise open or nonequilibrium quantum systems. In this method, the time evolution of the system is repeatedly interrupted by interacting an ancilla qubit with the system through a controlled operation, and measuring the ancilla immediately afterward. We discuss the robustness of this method as compared to other ancilla-based interferometric techniques (such as the Hadamard test), and highlight its advantages for near-term quantum simulations of open quantum systems.


The complex neuromuscular network that controls body movements is the target of severe diseases that result in paralysis and death. Here, we report the development of a robust and efficient self-organizing neuromuscular junction (soNMJ) model from human pluripotent stem cells that can be maintained long-term in simple adherent conditions. The timely application of specific patterning signals instructs the simultaneous development and differentiation of position-specific brachial spinal neurons, skeletal muscles, and terminal Schwann cells.


Background: The success of AlphaFold2 in reliable protein three-dimensional (3D) structure prediction assists the move of structural biology toward studies of protein dynamics and mutational impact on structure and function. This transition needs tools that qualitatively assess alternative 3D conformations.

Results: We introduce MutAmore, a bioinformatics tool that renders individual images of protein 3D structures for, e.


The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2.

Article Synopsis
  • Venoms are a great example of how similar traits can evolve independently in different animal groups, but there's limited research on toxin genes in most species, especially in hymenopteran insects like bees.
  • A study examined the origins of 11 toxin genes across 32 hymenopteran genomes, finding that most venom genes developed from single gene co-option and further diversified through gene duplication.
  • The research revealed that most venom genes are common to all hymenopterans, with only a few like melittin and anthophilin1 being exclusive to bees, suggesting these venom proteins existed before the significant diversification of this insect group.

Patients suffering from painful spinal bone metastases (PSBMs) often undergo palliative radiation therapy (RT), which is effective in approximately two thirds of patients. In this exploratory investigation, we assessed the effectiveness of machine learning (ML) models trained on radiomics, semantic and clinical features to estimate complete pain response. Gross tumour volumes (GTV) and clinical target volumes (CTV) of 261 PSBMs were segmented on planning computed tomography (CT) scans.


Three-finger toxins (3FTXs) are a functionally diverse family of toxins, apparently unique to venoms of caenophidian snakes. Although the ancestral function of 3FTXs is antagonism of nicotinic acetylcholine receptors, redundancy conferred by the accumulation of duplicate genes has facilitated extensive neofunctionalization, such that derived members of the family interact with a range of targets. 3FTXs are members of the LY6/UPAR family, but their non-toxin ancestor remains unknown.


Information is transmitted between brain regions through the release of neurotransmitters from long-range projecting axons. Understanding how the activity of such long-range connections contributes to behavior requires efficient methods for reversibly manipulating their function. Chemogenetic and optogenetic tools, acting through endogenous G-protein-coupled receptor (GPCR) pathways, can be used to modulate synaptic transmission, but existing tools are limited in sensitivity, spatiotemporal precision, or spectral multiplexing capabilities.


Phytoplankton growth is controlled by multiple environmental drivers, which are all modified by climate change. While numerous experimental studies identify interactive effects between drivers, large-scale ocean biogeochemistry models mostly account for growth responses to each driver separately and leave the results of these experimental multiple-driver studies largely unused. Here, we amend phytoplankton growth functions in a biogeochemical model by dual-driver interactions (CO₂ and temperature, CO₂ and light), based on data of a published meta-analysis on multiple-driver laboratory experiments.


Growth rates and other biomass traits of phytoplankton are strongly affected by temperature. We hypothesized that resulting phenotypes originate from deviating temperature sensitivities of underlying physiological processes. We used membrane-inlet mass spectrometry to assess photosynthetic and respiratory O₂ and CO₂ fluxes in response to abrupt temperature changes as well as after acclimation periods in the diatom Phaeodactylum tricornutum.
