Background: Mutations arise in the human genome in two major settings: the germline and the soma. These settings involve different inheritance patterns, time scales, chromatin structures, and environmental exposures, all of which impact the resulting distribution of substitutions. Nonetheless, many of the same single nucleotide variants (SNVs) are shared between germline and somatic mutation databases, such as between the gnomAD database of 120,000 germline exomes and the TCGA database of 10,000 somatic exomes. Here, we sought to explain this overlap.

Results: After strict filtering to exclude common germline polymorphisms and sites with poor coverage or mappability, we found 336,987 variants shared between the somatic and germline databases. A uniform statistical model explains 34% of these shared variants; a model that incorporates the varying mutation rates of the basic mutation types explains another 50% of shared variants; and a model that includes extended nucleotide contexts (e.g., the surrounding 3 bases on either side) explains an additional 4% of shared variants. Analysis of read depth finds mixed evidence that up to 4% of the shared variants may represent germline variants leaked into somatic call sets. A further 9% of the shared variants are not explained by any model, and neither sequencing errors nor convergent evolution accounted for them. We surveyed other factors as well: cancers driven by endogenous mutational processes share a greater fraction of variants with the germline, and recently derived germline variants were more likely to be somatically shared than were ancient germline ones.
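The abstract does not specify how the models are constructed. As an illustration only, the difference between a uniform model and one stratified by mutation type can be sketched as follows, assuming each call set is a random draw without replacement from a pool of possible variants, so that the expected overlap is the hypergeometric mean. The function names, stratum labels, and counts below are hypothetical toy values, not figures or methods from the study.

```python
from typing import Dict, Tuple


def expected_shared_uniform(n_possible: int, n_germline: int, n_somatic: int) -> float:
    """Expected overlap of two sets drawn uniformly at random, without
    replacement, from the same pool of n_possible variants
    (the mean of a hypergeometric distribution)."""
    return n_germline * n_somatic / n_possible


def expected_shared_stratified(strata: Dict[str, Tuple[int, int, int]]) -> float:
    """Sum the uniform expectation within each mutation-type stratum.

    Each value is (n_possible, n_germline, n_somatic) for that stratum.
    """
    return sum(expected_shared_uniform(n, g, s) for n, g, s in strata.values())


# Toy counts (assumed for illustration): a small, highly mutable stratum
# and a large, less mutable one.
toy_strata = {
    "highly_mutable": (100, 40, 20),
    "other": (900, 60, 30),
}

# Pool the strata to get the inputs for the uniform model.
total_possible = sum(n for n, _, _ in toy_strata.values())
total_germline = sum(g for _, g, _ in toy_strata.values())
total_somatic = sum(s for _, _, s in toy_strata.values())

uniform_expectation = expected_shared_uniform(total_possible, total_germline, total_somatic)
stratified_expectation = expected_shared_stratified(toy_strata)

print(uniform_expectation, stratified_expectation)
```

With these toy counts, the stratified expectation is larger than the uniform one because both call sets concentrate in the same highly mutable stratum. This is the kind of effect that lets a mutation-type-aware model account for shared variants that a uniform model does not.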

Conclusions: Overall, we find that shared variants largely represent bona fide biological occurrences of the same variant in the germline and somatic settings and arise primarily because DNA has some of the same basic chemical vulnerabilities in either setting. Moreover, we find mixed evidence that somatic call sets leak appreciable numbers of germline variants, which is relevant to genomic privacy regulations. In future studies, the similar chemical vulnerability of DNA between the somatic and germline settings might be used to help identify disease-related genes by guiding the development of background-mutation models that are informed by both somatic and germline patterns of variation.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7273669
DOI: http://dx.doi.org/10.1186/s12859-020-3508-8
