The 1000 Genomes Project (TGP) is a foundational resource that serves the biomedical community as a standard reference cohort for human genetic variation. There are now seven public versions of these genomes. The TGP Consortium produced the first by mapping its final data release against human reference sequence GRCh37, then "lifted over" these genomes to the improved reference sequence (GRCh38) when it was released, and remapped the original data to GRCh38 with two similar pipelines. As best-practice quality validation, the pipelines that generated these versions were benchmarked against the Genome In A Bottle Consortium's "platinum quality" genome (NA12878). The New York Genome Center recently released the results of independently resequencing the cohort at greater depth (30×), a phased version informed by the inclusion of related individuals, and independently remapped the original variant calls to GRCh38. We performed a cross-comparison evaluation of all seven versions using genome fingerprinting, which supports ultrafast genome comparison even across reference versions. We noted multiple issues, including discrepancies in cohort membership, disagreement on the overall level of variation, evidence of substandard pipeline performance on specific genomes and in specific regions of the genome, cryptic relationships between individuals, inconsistent phasing, and annotation distortions caused by the history of the reference genome itself. We therefore recommend global quality assessment by rapid genome comparisons, alongside benchmarking as part of best-practice quality assessment of large genome datasets. Our observations also help inform the decision of which version to use, to support analyses by individual researchers.
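The claim that fingerprint comparison works across reference versions can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a simplified fingerprint that counts consecutive single-nucleotide variant pairs by their allele pattern and by their distance modulo a small vector length. The function names `fingerprint` and `similarity`, the parameter `L`, and the use of cosine similarity as the comparison metric are all assumptions made for illustration. The fingerprint relies only on distances between neighbouring variants, so a uniform coordinate shift, which is a crude stand-in for a change of reference build, leaves it unchanged.

```python
from collections import Counter
from math import sqrt


def fingerprint(snvs, L=20):
    """Summarise a genome as counts of consecutive SNV pairs.

    snvs: list of (chrom, pos, ref, alt) tuples, sorted by chrom then pos.
    L:    hypothetical vector length; distances are binned modulo L.
    Returns a Counter keyed by (allele pattern, distance mod L).
    """
    fp = Counter()
    for (c1, p1, r1, a1), (c2, p2, r2, a2) in zip(snvs, snvs[1:]):
        if c1 != c2:
            continue  # do not pair variants across chromosomes
        key = r1 + a1 + r2 + a2      # e.g. "AGCT"
        fp[(key, (p2 - p1) % L)] += 1
    return fp


def similarity(fp_a, fp_b):
    """Cosine similarity between two fingerprints (assumed metric)."""
    keys = set(fp_a) | set(fp_b)
    dot = sum(fp_a[k] * fp_b[k] for k in keys)
    norm_a = sqrt(sum(v * v for v in fp_a.values()))
    norm_b = sqrt(sum(v * v for v in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy genome and the same genome with every coordinate shifted by 1000.
genome = [
    ("chr1", 100, "A", "G"),
    ("chr1", 130, "C", "T"),
    ("chr1", 175, "G", "A"),
    ("chr2", 50, "T", "C"),
    ("chr2", 90, "A", "C"),
]
shifted = [(c, p + 1000, r, a) for c, p, r, a in genome]
score = similarity(fingerprint(genome), fingerprint(shifted))
```

Each genome is reduced to a small fixed-size summary, so comparing two genomes costs a pass over that summary rather than a variant-by-variant alignment. That is the property that makes all-against-all comparison of a large cohort feasible.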
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9250042 | PMC |
| http://dx.doi.org/10.1016/j.xhgg.2022.100123 | DOI Listing |
Mol Plant Pathol
January 2025
Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China.
Tomato yellow leaf curl Guangdong virus (TYLCGdV), a monopartite begomovirus first identified in 2004, remains poorly characterised. In this study, we demonstrate that TYLCGdV associates with a betasatellite, TYLCGdB, and the βC1 protein encoded by TYLCGdB is essential for symptom development. We also explore the role of TYLCGdV C4 protein by generating a C4-deficient infectious clone (TYLCGdV), revealing a dynamic role for TYLCGdV C4.
Hum Genomics
January 2025
Department of Endocrine and Metabolic Diseases, Children's Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China.
Background: The molecular genetic diagnosis of congenital adrenal hyperplasia (CAH) is very challenging due to the high homology between the CYP21A2 gene and its pseudogene CYP21A1P.
Methodology: This study aims to assess the clinical efficacy of targeted long-read sequencing (T-LRS) by comparing it with a control method based on the combined assay (NGS, Multiplex ligation-dependent probe amplification and Sanger sequencing) and to introduce T-LRS as a first-tier diagnostic test for suspected CAH patients to improve the precise diagnosis of CAH.
Results: A large cohort of 562 participants, including 322 probands and 240 family members, was enrolled for the retrospective (96 probands) and prospective (226 probands) studies.
Trop Med Health
January 2025
Department of Vector Biology and Control of Diseases, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran.
Background: The Anopheles culicifacies complex is one of the most important malaria vectors in Southeast Asia and Southeastern Iran. Although the sibling species within this complex are morphologically indistinguishable, they differ significantly in their disease transmission potential, blood-feeding behaviour, and other biological traits. Cytogenetic and chromosomal studies have identified five sibling species within this complex: A, B, C, D, and E.
Microbiome
January 2025
Department of Microbiome Dynamics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Beutenbergstraße 11A, Jena, 07745, Germany.
Background: The pathogenesis of non-alcoholic fatty liver disease (NAFLD) with a global prevalence of 30% is multifactorial and the involvement of gut bacteria has been recently proposed. However, finding robust bacterial signatures of NAFLD has been a great challenge, mainly due to its co-occurrence with other metabolic diseases.
Results: Here, we collected public metagenomic data and integrated the taxonomy profiles with in silico generated community metabolic outputs and detailed clinical data from 1206 Chinese subjects with or without metabolic diseases, including NAFLD (obese and lean), obesity, T2D, hypertension, and atherosclerosis.
Alzheimers Res Ther
January 2025
Department of Neuroscience "Rita Levi Montalcini", University of Turin, Via Cherasco 15, Turin, 10126, Italy.
Background: Alzheimer's disease (AD) is a progressive neurodegenerative disorder with both genetic and environmental factors contributing to its pathogenesis. While early-onset AD has well-established genetic determinants, the genetic basis for late-onset AD remains less clear. This study investigates a large Italian family with late-onset autosomal dominant AD, identifying a novel rare missense variant in GRIN2C gene associated with the disease, and evaluates the functional impact of this variant.