Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences.
Motivation: Exhaustive experimental annotation of the effect of all known protein variants remains daunting and expensive, stressing the need for scalable effect predictions. We introduce VespaG, a blazingly fast missense amino acid variant effect predictor, leveraging protein language model (pLM) embeddings as input to a minimal deep learning model.
Results: To overcome the sparsity of experimental training data, we created a dataset of 39 million single amino acid variants from the human proteome applying the multiple sequence alignment-based effect predictor GEMME as a pseudo standard-of-truth.
The biophysical characterization and engineering of optogenetic tools and photobiological systems has been hampered by the lack of efficient methods for spectral illumination of microplates for high-throughput analysis of action spectra. Current methods to determine action spectra only allow the sequential spectral illumination of individual wells. Here we present the open-source RainbowCap-system, which combines LEDs and optical filters in a standard 96-well microplate format for simultaneous and spectrally defined illumination.
Embeddings from protein Language Models (pLMs) are replacing evolutionary information from multiple sequence alignments (MSAs) as the most successful input for protein prediction. Is this because embeddings capture evolutionary information? We tested various approaches to explicitly incorporate evolutionary information into embeddings on various protein prediction tasks. While older pLMs (SeqVec, ProtBert) significantly improved through MSAs, the more recent pLM ProtT5 did not benefit.
Photosynthesis is one of the most important biological processes on Earth, providing the main source of bioavailable energy, carbon, and oxygen via the use of sunlight. Despite this importance, the minimum light level sustaining photosynthesis and net growth of primary producers in the global ocean is still unknown. Here, we present measurements from the MOSAiC field campaign in the central Arctic Ocean that reveal the resumption of photosynthetic growth and algal biomass buildup under the ice pack at a daily average irradiance of not more than 0.
Prediction methods inputting embeddings from protein language models have reached or even surpassed state-of-the-art performance on many protein prediction tasks. In natural language processing, fine-tuning large language models has become the de facto standard. In contrast, most protein language model-based protein predictions do not back-propagate to the language model.
Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction.
The identification of protein binding residues helps to understand their biological processes as protein function is often defined through ligand binding, such as to other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we presented a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs.
From over to , the recent decade of exponential advances in artificial intelligence (AI) has been altering life. In parallel, advances in computational biology are beginning to decode the language of life: leaped forward in protein structure prediction, and protein language models (pLMs) replaced expertise and evolutionary information from multiple sequence alignments with information learned from reoccurring patterns in databases of billions of proteins without experimental annotations other than the amino acid sequences. None of those tools could have been developed 10 years ago; all will increase the wealth of experimental data and speed up the cycle from idea to proof.
Marine heatwaves are increasing in frequency and intensity as climate change progresses, especially in the highly productive Arctic regions. Although their effects on primary producers will largely determine the impacts on ecosystem services, mechanistic understanding of phytoplankton responses to these extreme events is still very limited. We experimentally exposed Arctic phytoplankton assemblages to stable warming, as well as to repeated heatwaves, and measured temporally resolved productivity, physiology, and composition.
We propose and demonstrate a unified hierarchical method to measure n-point correlation functions that can be applied to driven, dissipative, or otherwise open or nonequilibrium quantum systems. In this method, the time evolution of the system is repeatedly interrupted by interacting an ancilla qubit with the system through a controlled operation, and measuring the ancilla immediately afterward. We discuss the robustness of this method as compared to other ancilla-based interferometric techniques (such as the Hadamard test), and highlight its advantages for near-term quantum simulations of open quantum systems.
The complex neuromuscular network that controls body movements is the target of severe diseases that result in paralysis and death. Here, we report the development of a robust and efficient self-organizing neuromuscular junction (soNMJ) model from human pluripotent stem cells that can be maintained long-term in simple adherent conditions. The timely application of specific patterning signals instructs the simultaneous development and differentiation of position-specific brachial spinal neurons, skeletal muscles, and terminal Schwann cells.
Background: The success of AlphaFold2 in reliable protein three-dimensional (3D) structure prediction assists the move of structural biology toward studies of protein dynamics and mutational impact on structure and function. This transition needs tools that qualitatively assess alternative 3D conformations.
Results: We introduce MutAmore, a bioinformatics tool that renders individual images of protein 3D structures for, e.
The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2.
Patients suffering from painful spinal bone metastases (PSBMs) often undergo palliative radiation therapy (RT), which is effective in approximately two thirds of patients. In this exploratory investigation, we assessed the effectiveness of machine learning (ML) models trained on radiomics, semantic and clinical features to estimate complete pain response. Gross tumour volumes (GTV) and clinical target volumes (CTV) of 261 PSBMs were segmented on planning computed tomography (CT) scans.
Three-finger toxins (3FTXs) are a functionally diverse family of toxins, apparently unique to venoms of caenophidian snakes. Although the ancestral function of 3FTXs is antagonism of nicotinic acetylcholine receptors, redundancy conferred by the accumulation of duplicate genes has facilitated extensive neofunctionalization, such that derived members of the family interact with a range of targets. 3FTXs are members of the LY6/UPAR family, but their non-toxin ancestor remains unknown.
Information is transmitted between brain regions through the release of neurotransmitters from long-range projecting axons. Understanding how the activity of such long-range connections contributes to behavior requires efficient methods for reversibly manipulating their function. Chemogenetic and optogenetic tools, acting through endogenous G-protein-coupled receptor (GPCR) pathways, can be used to modulate synaptic transmission, but existing tools are limited in sensitivity, spatiotemporal precision, or spectral multiplexing capabilities.
Phytoplankton growth is controlled by multiple environmental drivers, which are all modified by climate change. While numerous experimental studies identify interactive effects between drivers, large-scale ocean biogeochemistry models mostly account for growth responses to each driver separately and leave the results of these experimental multiple-driver studies largely unused. Here, we amend phytoplankton growth functions in a biogeochemical model by dual-driver interactions (CO₂ and temperature, CO₂ and light), based on data of a published meta-analysis on multiple-driver laboratory experiments.
Growth rates and other biomass traits of phytoplankton are strongly affected by temperature. We hypothesized that resulting phenotypes originate from deviating temperature sensitivities of underlying physiological processes. We used membrane-inlet mass spectrometry to assess photosynthetic and respiratory O₂ and CO₂ fluxes in response to abrupt temperature changes as well as after acclimation periods in the diatom Phaeodactylum tricornutum.