Publications by authors named "Nadav Brandes"

PWAS (proteome-wide association study) is an innovative genetic association approach that complements widely used methods like GWAS (genome-wide association study). The PWAS approach involves consecutive phases. Initially, machine learning modeling and probabilistic considerations quantify the impact of genetic variants on protein-coding genes' biochemical functions.

Article Synopsis
  • The research identified two main categories of harmful variants in the polycystin-1 protein: those that prevent it from reaching the cell surface and those that impair its ion channel activity.
  • A small molecule was found to potentially rescue the surface localization of defective polycystin channels, suggesting that improving channel function through small-molecule therapies could be a promising treatment for ADPKD.
Article Synopsis
  • CRISPR-Cas9 genome editing is advancing T cell therapies, but there's a concern about the loss of targeted chromosomes, which could impact safety.
  • A study showed that chromosome loss is widespread in primary human T cells and that chromosomes can be lost either partially or completely, even in preclinical therapies.
  • The researchers developed a modified manufacturing process that reduces chromosome loss while maintaining the effectiveness of genome editing, finding that p53 expression might help protect against this issue in clinical applications.

Over three percent of people carry a dominant pathogenic variant, yet only a fraction of carriers develop disease. Disease phenotypes from carriers of variants in the same gene range from mild to severe. Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (marginal epistasis).

Predicting the effects of coding variants is a major challenge. While recent deep-learning models have improved variant effect prediction accuracy, they cannot analyze all coding variants due to dependency on close homologs or software limitations. Here we developed a workflow using ESM1b, a 650-million-parameter protein language model, to predict all ~450 million possible missense variant effects in the human genome, and made all predictions available on a web portal.
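
One common way to turn a protein language model's per-position output into a variant effect score is a log-likelihood ratio between the mutant and wild-type residues. The sketch below illustrates that idea only. The abstract above does not spell out the scoring rule, so the ratio is an assumption here, the function names are hypothetical, and the probabilities are made-up toy numbers rather than real ESM1b output.

```python
import math


def llr_score(position_probs, wt_aa, mut_aa):
    """Log-likelihood ratio of the mutant vs. the wild-type residue.

    position_probs: dict mapping amino acid -> model probability at one
    position (hypothetical input; any language model could supply it).
    Negative values mean the model finds the mutant less likely.
    """
    return math.log(position_probs[mut_aa]) - math.log(position_probs[wt_aa])


def score_all_substitutions(sequence, probs_per_position):
    """Score every single-residue substitution present in the probability tables.

    Returns a dict keyed by variant name, e.g. "A1G" for Ala -> Gly at
    position 1 (1-based positions).
    """
    scores = {}
    for i, wt_aa in enumerate(sequence):
        for mut_aa in probs_per_position[i]:
            if mut_aa != wt_aa:
                name = f"{wt_aa}{i + 1}{mut_aa}"
                scores[name] = llr_score(probs_per_position[i], wt_aa, mut_aa)
    return scores


# Toy example with a two-residue "protein" and a two-letter alphabet per position.
toy_probs = [{"A": 0.8, "G": 0.2}, {"K": 0.5, "R": 0.5}]
toy_scores = score_all_substitutions("AK", toy_probs)
```

In this toy run, the substitution at position 1 gets a negative score because the model assigns the mutant residue a lower probability than the wild type, while the substitution at position 2 scores zero because both residues are equally likely.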

Article Synopsis
  • CRISPR-Cas9 genome editing has advanced T cell therapies but poses safety concerns due to chromosome loss during the editing process.
  • A systematic analysis showed that chromosome loss is a widespread issue in primary human T cells, impacting both partial and full chromosomes, including in engineered T cells for clinical use.
  • The study's modified manufacturing process significantly reduced chromosome loss while maintaining effectiveness, with the expression of the p53 protein offering potential protection against this genetic damage.

Genetic studies of human traits have revolutionized our understanding of the variation between individuals, and yet, the genetics of most traits is still poorly understood. In this review, we highlight the major open problems that need to be solved, and by discussing these challenges provide a primer to the field. We cover general issues such as population structure, epistasis and gene-environment interactions, data-related issues such as ancestry diversity and rare genetic variants, and specific challenges related to heritability estimates, genetic association studies, and polygenic risk scores.

Summary: Self-supervised deep language modeling has shown unprecedented success across natural language tasks, and has recently been repurposed to biological sequences. However, existing models and pretraining methods are designed and optimized for text analysis. We introduce ProteinBERT, a deep language model specifically designed for proteins.

Human genetic variation in coding regions is fundamental to the study of protein structure and function. Most methods for interpreting missense variants consider substitution measures derived from homologous proteins across different species. In this study, we introduce human-specific amino acid (AA) substitution matrices that are based on genetic variations in the modern human population.

The characterization of germline genetic variation affecting cancer risk, known as cancer predisposition, is fundamental to preventive and personalized medicine. Studies of genetic cancer predisposition typically identify significant genomic regions based on family-based cohorts or genome-wide association studies (GWAS). However, the results of such studies rarely provide biological insight or functional interpretation.

One of the major challenges in the post-genomic era is elucidating the genetic basis of human diseases. In recent years, studies have shown that polygenic risk scores (PRS), based on aggregated information from millions of variants across the human genome, can estimate individual risk for common diseases. Current medical practice, however, still relies predominantly on physiological and clinical indicators to assess personal disease risk.
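
The aggregation behind a polygenic risk score is, in its simplest textbook form, a weighted sum of allele dosages. The sketch below shows that form only. The function name, the dosages, and the weights are hypothetical illustration values, and real scores involve further steps (variant selection, normalization, ancestry adjustment) that are not shown.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages: number of effect alleles carried at each variant (0, 1, or 2).
    weights: per-variant effect sizes, e.g. taken from an association study.
    Both inputs here are hypothetical toy values.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example with three variants instead of millions.
toy_score = polygenic_risk_score([0, 1, 2], [0.1, 0.2, -0.05])
```

A variant with a negative weight lowers the score, so carrying two copies of the third variant above partly offsets the contribution of the second.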

Natural language processing (NLP) is a field of computer science concerned with automated text and language analysis. In recent years, following a series of breakthroughs in deep and machine learning, NLP methods have shown overwhelming progress. Here, we review the success, promise and pitfalls of applying NLP algorithms to the study of proteins.

Contemporary catalogues of cancer driver genes rely primarily on high mutation rates as evidence for gene selection in tumors. Here, we present The Functional Alteration Bias Recovery In Coding-regions Cancer Portal, a comprehensive catalogue of gene selection in cancer based purely on the biochemical functional effects of mutations at the protein level. Gene selection in the portal is quantified by combining genomics data with rich proteomic annotations.

We introduce Proteome-Wide Association Study (PWAS), a new method for detecting gene-phenotype associations mediated by protein function alterations. PWAS aggregates the signal of all variants jointly affecting a protein-coding gene and assesses their overall impact on the protein's function using machine learning and probabilistic models. Subsequently, it tests whether the gene exhibits functional variability between individuals that correlates with the phenotype of interest.
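
The two steps described above (aggregate the variants of a gene into a per-individual functional score, then test that score against the phenotype) can be sketched roughly as follows. This is a minimal illustration, not the published PWAS implementation: the per-variant effect scores, the multiplicative aggregation rule, and the permutation test are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import math
import random


def gene_score(variant_effects):
    """Aggregate per-variant effects into one gene-level score.

    variant_effects: hypothetical predicted probabilities in [0, 1] that
    the protein keeps its function despite each variant. Assuming the
    variants act independently, the product is the probability that the
    protein stays functional. An individual with no variants scores 1.0.
    """
    score = 1.0
    for p in variant_effects:
        score *= p
    return score


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(scores, phenotypes, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the score-phenotype correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(scores, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(scores, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy cohort: each inner list holds one individual's variant effects for a gene.
cohort_variants = [[], [0.9], [0.8], [0.9, 0.7], [0.5], [0.6, 0.6], [0.3], [0.2, 0.5]]
cohort_scores = [gene_score(v) for v in cohort_variants]
cohort_phenotypes = [1.0, 1.2, 1.9, 3.1, 4.2, 5.5, 6.8, 9.0]
p_value = permutation_p_value(cohort_scores, cohort_phenotypes)
```

A permutation test is used here only because it needs no distributional assumptions and no external libraries; a regression with covariates would be a more typical choice in a real association study.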

Background: In recent years, research on cancer predisposition germline variants has emerged as a prominent field. The identification of somatic mutations depends on a reliable mapping of the patient's germline variants. In addition, the frequencies of germline variants in healthy individuals and cancer patients are the basis for seeking candidate cancer predisposition genes.

Compiling the catalogue of genes actively involved in cancer is an ongoing endeavor, with profound implications for the understanding and treatment of the disease. An abundance of computational methods have been developed to screen the genome for candidate driver genes based on genomic data of somatic mutations in tumors. Existing methods make many implicit and explicit assumptions about the distribution of random mutations.

Viruses are the most prevalent infectious agents, populating almost every ecosystem on earth. Most viruses carry only a handful of genes supporting their replication and the production of capsids. It came as a great surprise in 2003 when the first giant virus was discovered and found to have a >1 Mbp genome encoding almost a thousand proteins.

Determining residue-level protein properties, such as sites of post-translational modifications (PTMs), is vital to understanding protein function. Experimental methods are costly and time-consuming, while traditional rule-based computational methods fail to annotate sites lacking substantial similarity. Machine Learning (ML) methods are becoming fundamental in annotating unknown proteins and their heterogeneous properties.

Background: Viruses are the simplest replicating units, characterized by a limited number of coding genes and an exceptionally high rate of overlapping genes. We sought a unified evolutionary explanation that accounts for their genome sizes, gene overlapping and capsid properties.

Results: We performed an unbiased statistical analysis of ~100 families within ~400 genera that comprise the currently known viral world.
