Innovative and easy-to-implement strategies are needed to improve the pathogenicity assessment of rare germline missense variants. Somatic cancer driver mutations identified through large-scale tumor sequencing studies often impact genes that are also associated with rare Mendelian disorders. The use of cancer mutation data to aid in the interpretation of germline missense variants, regardless of whether the gene is associated with a hereditary cancer predisposition syndrome or a non-cancer-related developmental disorder, has not been systematically assessed.
Computational variant effect predictors (VEPs) are providing increasingly strong evidence to classify the pathogenicity of missense variants. Precision vs. recall analysis is useful in evaluating VEP performance, especially when adjusted for imbalanced test sets.
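As an illustration of what adjusting precision for an imbalanced test set can mean, the sketch below recomputes precision at a chosen prior probability of pathogenicity from the true-positive and false-positive rates. This is a generic Bayes-rule rescaling, not necessarily the procedure used in the study; the function name `prior_adjusted_precision` and the example counts are hypothetical.

```python
import math


def prior_adjusted_precision(tp, fp, n_pos, n_neg, prior=0.5):
    """Precision a predictor would show if positives made up `prior` of the test set.

    tp, fp  : true-positive and false-positive counts at some score threshold
    n_pos   : number of pathogenic (positive) variants in the test set
    n_neg   : number of benign (negative) variants in the test set
    prior   : assumed fraction of positives (0.5 gives a "balanced" precision)
    """
    tpr = tp / n_pos  # recall, which does not depend on class balance
    fpr = fp / n_neg  # false-positive rate, also independent of class balance
    denom = prior * tpr + (1.0 - prior) * fpr
    if denom == 0:
        return math.nan  # nothing was called positive at this threshold
    return prior * tpr / denom


# Hypothetical counts: 100 pathogenic and 1000 benign variants.
raw = 80 / (80 + 10)
balanced = prior_adjusted_precision(80, 10, 100, 1000)
print(f"raw precision = {raw:.3f}, balanced precision = {balanced:.3f}")
```

Setting `prior` to the observed fraction of positives, `n_pos / (n_pos + n_neg)`, recovers the ordinary precision, so the adjustment only changes the assumed class balance and leaves recall untouched.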
The tumor suppressor CHEK2 encodes the serine/threonine protein kinase CHK2 which, upon DNA damage, is important for pausing the cell cycle, initiating DNA repair, and inducing apoptosis. CHK2 phosphorylation of the tumor suppressor BRCA1 is also important for mitotic spindle assembly and chromosomal stability. Consistent with its cell-cycle checkpoint role, both germline and somatic variants in CHEK2 have been linked to breast and other cancers.
Unlabelled: uses over 300 translocated effector proteins to rewire host cells during infection and create a replicative niche for intracellular growth. To date, several studies have identified effectors that indirectly and directly regulate the activity of other effectors, providing an additional layer of regulatory complexity. Among these are "metaeffectors," a special class of effectors that regulate the activity of other effectors once inside the host.
Malate is an important dicarboxylic acid produced from fumarate in the tricarboxylic acid cycle. Deficiencies of fumarate hydratase (FH) and malate dehydrogenase (MDH), responsible for malate formation and metabolism, respectively, are known to cause recessive forms of neurodevelopmental disorders (NDDs). The malic enzyme isoforms, malic enzyme 1 (ME1) and 2 (ME2), are required for the conversion of malate to pyruvate.
Background: Long QT syndrome is a lethal arrhythmia syndrome, frequently caused by rare loss-of-function variants in the potassium channel encoded by . Variant classification is difficult, often because of lack of functional data. Moreover, variant-based risk stratification is also complicated by heterogeneous clinical data and incomplete penetrance.
Background: Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts.
Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them.
Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Background: Amino acid substitutions can perturb protein activity in multiple ways. Understanding their mechanistic basis may pinpoint how residues contribute to protein function. Here, we characterize the mechanisms underlying variant effects in human glucokinase (GCK) variants, building on our previous comprehensive study on GCK variant activity.
Motivation: Long-read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g.
Clinical classification of genomic variants identified on sequencing is often challenging, with many variants classified as Variants of Uncertain Significance (VUS) on account of insufficient evidence. Advances in sequencing and gene synthesis have made feasible multiplexed assays of variant effect (MAVEs), which quantify the functional impact of many thousands of genomic variants in a single experiment. These assays and the functional evidence they generate have the potential to empower more accurate clinical variant classification.
Background: Long QT syndrome (LQTS) is a lethal arrhythmia syndrome, frequently caused by rare loss-of-function variants in the potassium channel encoded by . Variant classification is difficult, often owing to lack of functional data. Moreover, variant-based risk stratification is also complicated by heterogeneous clinical data and incomplete penetrance.
Widespread sequencing has yielded thousands of missense variants predicted or confirmed as disease-causing. This creates a new bottleneck: determining the functional impact of each variant, largely a painstaking, customized process undertaken one or a few genes or variants at a time. Here, we established a high-throughput imaging platform to assay the impact of coding variation on protein localization, evaluating 3,547 missense variants of over 1,000 genes and phenotypes.
Defects in hydroxymethylbilane synthase (HMBS) can cause acute intermittent porphyria (AIP), an acute neurological disease. Although sequencing-based diagnosis can be definitive, ∼⅓ of clinical HMBS variants are missense variants, and most clinically reported HMBS missense variants are designated as "variants of uncertain significance" (VUSs). Using saturation mutagenesis, en masse selection, and sequencing, we applied a multiplexed validated assay to both the erythroid-specific and ubiquitous isoforms of HMBS, obtaining confident functional impact scores for >84% of all possible amino acid substitutions.
To maintain genome integrity, cells must accurately duplicate their genome and repair DNA lesions when they occur. To uncover genes that suppress DNA damage in human cells, we undertook flow-cytometry-based CRISPR-Cas9 screens that monitored DNA damage. We identified 160 genes whose mutation caused spontaneous DNA damage, a list enriched in essential genes, highlighting the importance of genomic integrity for cellular fitness.
Multiplexed Assays of Variant Effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact.
Amino acid substitutions can perturb protein activity in multiple ways. Understanding their mechanistic basis may pinpoint how residues contribute to protein function. Here, we characterize the mechanisms of human glucokinase (GCK) variants, building on our previous comprehensive study on GCK variant activity.
The COVID-19 pandemic has catalyzed unprecedented scientific data and reagent sharing and collaboration, which enabled understanding the virology of the SARS-CoV-2 virus and vaccine development at record speed. The pandemic, however, has also raised awareness of the danger posed by the family of coronaviruses, of which 7 are known to infect humans and dozens have been identified in reservoir species, such as bats, rodents, or livestock. To facilitate understanding the commonalities and specifics of coronavirus infections and aspects of viral biology that determine their level of lethality to the human host, we have generated a collection of freely available clones encoding nearly all human coronavirus proteins known to date.
Background: Glucokinase (GCK) regulates insulin secretion to maintain appropriate blood glucose levels. Sequence variants can alter GCK activity to cause hyperinsulinemic hypoglycemia or hyperglycemia associated with GCK-maturity-onset diabetes of the young (GCK-MODY), collectively affecting up to 10 million people worldwide. Patients with GCK-MODY are frequently misdiagnosed and treated unnecessarily.
The impact of millions of individual genetic variants on molecular phenotypes in coding sequences remains unknown. Multiplexed assays of variant effect (MAVEs) are scalable methods to annotate relevant variants, but existing software lacks standardization, requires cumbersome configuration, and does not scale to large targets. We present satmut_utils as a flexible solution for simulation and variant quantification.