AI Article Synopsis

  • Transcriptomics is crucial for understanding the effects of genetic variants and diagnosing diseases, but the choice of genome build significantly affects these analyses.
  • In a study of 386 rare disease and familial control samples, researchers found 2,800 genes whose expression quantification depended on the genome build used (hg19, hg38, or CHM13).
  • The authors provide a database of genes affected by build choice and recommend cross-referencing transcriptomics-guided analyses and diagnoses against it to improve robustness.

Article Abstract

Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and for disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network (UDN) and Genomics Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium. We identified 2,800 genes with build-dependent quantification across six routinely collected biospecimens, including 1,391 protein-coding genes and 341 known rare disease genes. We further observed multiple genes that only have detectable expression in a subset of genome builds. Finally, we characterized how genome build impacts the detection of outlier transcriptomic events. Combined, we provide a database of genes impacted by build choice, and recommend that transcriptomics-guided analyses and diagnoses be cross-referenced with these data for robustness.
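To make the idea of "build-dependent quantification" concrete, the following is a minimal sketch of how one might flag genes whose expression estimate shifts between builds. It is not the authors' method, which the abstract does not describe. The function name `build_dependent_genes`, the log2 fold-change threshold, the pseudocount, and the toy gene names and values are all assumptions made for illustration.

```python
import math


def build_dependent_genes(expr_by_build, min_log2_spread=1.0, pseudocount=1.0):
    """Flag genes whose expression differs across genome builds.

    expr_by_build maps a build name to {gene: expression value, e.g. TPM}.
    A gene absent from a build is treated as having zero expression there.
    Returns {gene: spread}, where spread is the largest minus the smallest
    log2(expression + pseudocount) across builds.
    NOTE: threshold and pseudocount are illustrative assumptions.
    """
    builds = sorted(expr_by_build)
    genes = set()
    for values in expr_by_build.values():
        genes.update(values)

    flagged = {}
    for gene in sorted(genes):
        logs = [
            math.log2(expr_by_build[build].get(gene, 0.0) + pseudocount)
            for build in builds
        ]
        spread = max(logs) - min(logs)
        if spread >= min_log2_spread:
            flagged[gene] = spread
    return flagged


# Hypothetical toy values; not data from the study.
toy = {
    "hg19": {"GENE_A": 10.0, "GENE_B": 7.0, "GENE_C": 0.0},
    "hg38": {"GENE_A": 10.5, "GENE_B": 1.0, "GENE_C": 0.0},
    "CHM13": {"GENE_A": 9.8, "GENE_B": 7.0, "GENE_C": 3.0},
}
flagged = build_dependent_genes(toy)
print(sorted(flagged))
```

In this toy input, GENE_C mimics a gene with detectable expression in only one build, while GENE_A is stable across all three. A real analysis would work on many samples per build and use a statistical test rather than a fixed threshold.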

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10802764
DOI: http://dx.doi.org/10.1101/2024.01.11.24301165

Publication Analysis

Top Keywords

genome build: 16
build impacts: 12
rare disease: 12
genome builds: 8
build: 5
genome: 5
genes: 5
impact genome: 4
build rna-seq: 4
rna-seq interpretation: 4

Similar Publications

The increasing utilization of deep learning models in drug repositioning has proven to be highly efficient and effective. In this study, we employed an integrated deep-learning model followed by a traditional drug screening approach to screen a library of FDA-approved drugs, aiming to identify novel inhibitors targeting the TNF-α converting enzyme (TACE). TACE, also known as ADAM17, plays a crucial role in the inflammatory response by converting pro-TNF-α to its active soluble form and cleaving other inflammatory mediators, making it a promising target for therapeutic intervention in diseases such as rheumatoid arthritis.

Is increased mutation driving genetic diversity in dogs within the Chornobyl exclusion zone?

PLoS One

December 2024

Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, United States of America.

Environmental contamination can have lasting impacts on surrounding communities, though the long-term impacts can be difficult to ascertain. The disaster at the Chornobyl Nuclear Power Plant in 1986 and subsequent remediation efforts resulted in contamination of the local environment with radioactive material, heavy metals, and additional environmental toxicants. Many of these are mutagenic in nature, and the full effect of these exposures on local flora and fauna has yet to be understood.

GeniePool 2.0: advancing variant analysis through CHM13-T2T, AlphaMissense, gnomAD V4 integration, and variant co-occurrence queries.

Database (Oxford)

December 2024

The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel.

Originally developed to meet the challenges of the genomic data deluge, GeniePool emerged as a pioneering platform whose data lake architecture enables efficient storage, accessibility, and analysis of vast genomic datasets. Building on this foundation, GeniePool 2.0 advances genomic analysis through the integration of cutting-edge variant databases, such as CHM13-T2T, AlphaMissense, and gnomAD V4, coupled with the capability for variant co-occurrence queries.

The white poplar (Populus alba) is a dioecious woody plant with significant potential for the phytoremediation of soils. To realize this potential, it is necessary to utilize growth-promoting microorganisms. One potential source of such beneficial microorganisms is the rhizosphere community of wild-growing trees.

Polyethylene (PE) is the most-produced polyolefin, and consequently, it is the most widely found plastic waste worldwide. PE biodegradation is under study by applying different (micro)organisms in order to understand the biodegradative mechanism in the majority of microbes. This study aims to identify novel bacterial species with compelling metabolic potential and strategic genetic repertoires for PE biodegradation.
