Publications by Celine Marquet

Meta-Learning Enables Complex Cluster-Specific Few-Shot Binding Affinity Prediction for Protein-Protein Interactions.

Yang Yue, Yihua Cheng, Céline Marquet, Chenguang Xiao, Jingjing Guo

J Chem Inf Model

January 2025

Predicting protein-protein interaction (PPI) binding affinities in unseen protein complex clusters is essential for elucidating complex protein interactions and for the targeted screening of peptide- or protein-based drugs. We introduce MCGLPPI++, a meta-learning framework designed to improve the adaptability of pretrained geometric models in such scenarios. To boost the meta-learning optimization by injecting prior knowledge of the intersample distribution, we introduce three specially designed patterns for splitting training samples into clusters based on protein interaction interfaces.

Expert-guided protein language models enable accurate and blazingly fast fitness prediction.

Céline Marquet, Julius Schlensok, Marina Abakarova, Burkhard Rost, Elodie Laine

Bioinformatics

November 2024

Motivation: Exhaustive experimental annotation of the effect of all known protein variants remains daunting and expensive, stressing the need for scalable effect predictions. We introduce VespaG, a blazingly fast missense amino acid variant effect predictor, leveraging protein language model (pLM) embeddings as input to a minimal deep learning model.

Results: To overcome the sparsity of experimental training data, we created a dataset of 39 million single amino acid variants from the human proteome by applying the multiple sequence alignment-based effect predictor GEMME as a pseudo standard-of-truth.

Critical assessment of missense variant effect predictors on disease-relevant variant data.

Ruchir Rastogi, Ryan Chung, Sindy Li, Chang Li, Kyoungyeul Lee, Céline Marquet

bioRxiv

June 2024

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction.

Protein embeddings predict binding residues in disordered regions.

Laura R Jahn, Céline Marquet, Michael Heinzinger, Burkhard Rost

Sci Rep

June 2024

The identification of protein binding residues helps to understand the biological processes proteins take part in, as protein function is often defined through binding to ligands such as other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we present a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs.

Alignment-based Protein Mutational Landscape Prediction: Doing More with Less.

Marina Abakarova, Céline Marquet, Michael Rera, Burkhard Rost, Elodie Laine

Genome Biol Evol

November 2023

The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2.

LambdaPP: Fast and accessible protein-specific phenotype predictions.

Tobias Olenyi, Céline Marquet, Michael Heinzinger, Benjamin Kröger, Tiha Nikolova

Protein Sci

January 2023

The availability of accurate and fast artificial intelligence (AI) solutions predicting aspects of proteins is revolutionizing experimental and computational molecular biology. The webserver LambdaPP aspires to supersede PredictProtein, the first internet server making AI protein predictions available in 1992. Given a protein sequence as input, LambdaPP provides, in seconds, easily accessible visualizations of protein 3D structure, along with predictions at the protein level (GeneOntology, subcellular location) and the residue level (binding to metal ions, small molecules, and nucleotides; conservation; intrinsic disorder; secondary structure; alpha-helical and beta-barrel transmembrane segments; signal-peptides; variant effect).

Embeddings from protein language models predict conservation and variant effects.

Céline Marquet, Michael Heinzinger, Tobias Olenyi, Christian Dallago, Kyra Erckert

Hum Genet

October 2022

The emergence of SARS-CoV-2 variants stressed the demand for tools that allow interpreting the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences.
