Publications by authors named "Michael Bernhofer"

The availability of accurate and fast artificial intelligence (AI) solutions predicting aspects of proteins are revolutionizing experimental and computational molecular biology. The webserver LambdaPP aspires to supersede PredictProtein, the first internet server making AI protein predictions available in 1992. Given a protein sequence as input, LambdaPP provides easily accessible visualizations of protein 3D structure, along with predictions at the protein level (GeneOntology, subcellular location), and the residue level (binding to metal ions, small molecules, and nucleotides; conservation; intrinsic disorder; secondary structure; alpha-helical and beta-barrel transmembrane segments; signal-peptides; variant effect) in seconds.

View Article and Find Full Text PDF

Background: Despite the immense importance of transmembrane proteins (TMP) for molecular biology and medicine, experimental 3D structures for TMPs remain about 4-5 times underrepresented compared to non-TMPs. Today's top methods such as AlphaFold2 accurately predict 3D structures for many TMPs, but annotating transmembrane regions remains a limiting step for proteome-wide predictions.

Results: Here, we present TMbed, a novel method inputting embeddings from protein Language Models (pLMs, here ProtT5), to predict for each residue one of four classes: transmembrane helix (TMH), transmembrane strand (TMB), signal peptide, or other.

View Article and Find Full Text PDF

The emergence of SARS-CoV-2 variants stressed the demand for tools allowing to interpret the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences.

View Article and Find Full Text PDF

Since 1992 PredictProtein (https://predictprotein.org) is a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions.

View Article and Find Full Text PDF

The intricate details of how proteins bind to proteins, DNA, and RNA are crucial for the understanding of almost all biological processes. Disease-causing sequence variants often affect binding residues. Here, we described a new, comprehensive system of in silico methods that take only protein sequence as input to predict binding of protein to DNA, RNA, and other proteins.

View Article and Find Full Text PDF

Purpose: In soft tissue sarcoma (STS) patients systemic progression and survival remain comparably low despite low local recurrence rates. In this work, we investigated whether quantitative imaging features ("radiomics") of radiotherapy planning CT-scans carry a prognostic value for pre-therapeutic risk assessment.

Methods: CT-scans, tumor grade, and clinical information were collected from three independent retrospective cohorts of 83 (TUM), 87 (UW) and 51 (McGill) STS patients, respectively.

View Article and Find Full Text PDF

Background: For Glioblastoma (GBM), various prognostic nomograms have been proposed. This study aims to evaluate machine learning models to predict patients' overall survival (OS) and progression-free survival (PFS) on the basis of clinical, pathological, semantic MRI-based, and FET-PET/CT-derived information. Finally, the value of adding treatment features was evaluated.

View Article and Find Full Text PDF

Motivation: Many applications monitor predictions of a whole range of features for biological datasets, e.g. the fraction of secreted human proteins in the human proteome.

View Article and Find Full Text PDF

Purpose: Noticing the fast growing translation of artificial intelligence (AI) technologies to medical image analysis this paper emphasizes the future role of the medical physicist in this evolving field. Specific challenges are addressed when implementing big data concepts with high-throughput image data processing like radiomics and machine learning in a radiooncology environment to support clinical decisions.

Methods: Based on the experience of our interdisciplinary radiomics working group, techniques for processing minable data, extracting radiomics features and associating this information with clinical, physical and biological data for the development of prediction models are described.

View Article and Find Full Text PDF

Background And Purpose: Current prognostic models for soft tissue sarcoma (STS) patients are solely based on staging information. Treatment-related data have not been included to date. Including such information, however, could help to improve these models.

View Article and Find Full Text PDF

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES.

View Article and Find Full Text PDF

In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure.

View Article and Find Full Text PDF

Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains.

View Article and Find Full Text PDF

Experimental structure determination continues to be challenging for membrane proteins. Computational prediction methods are therefore needed and widely used to supplement experimental data. Here, we re-examined the state of the art in transmembrane helix prediction based on a nonredundant dataset with 190 high-resolution structures.

View Article and Find Full Text PDF

The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction.

View Article and Find Full Text PDF

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2).

View Article and Find Full Text PDF