Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities.

Matthew N Bainbridge, Min Wang, Yuanqing Wu, Irene Newsham, Donna M Muzny, John L Jefferies, Thomas J Albert, Daniel L Burgess, Richard A Gibbs

Genome Biol

Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.

Published: July 2011

Background: Enrichment of loci by DNA hybridization-capture, followed by high-throughput sequencing, is an important tool in modern genetics. Currently, the most common targets for enrichment are the protein coding exons represented by the consensus coding DNA sequence (CCDS). The CCDS, however, excludes many actual or computationally predicted coding exons present in other databases, such as RefSeq and Vega, and non-coding functional elements such as untranslated and regulatory regions. The number of variants per base pair (variant density) in regions outside of the CCDS, and our ability to interrogate those regions, are consequently less well understood.

Results: We examine capture sequence data from outside of the CCDS regions and find that extremes of GC content that are present in different subregions of the genome can reduce the local capture sequence coverage to less than 50% relative to the CCDS. This effect is due to biases inherent in both the Illumina and SOLiD sequencing platforms that are exacerbated by the capture process. Interestingly, for two subregion types, microRNA and predicted exons, the capture process yields higher than expected coverage when compared to whole genome sequencing. Lastly, we examine the variation present in non-CCDS regions and find that predicted exons, as well as exonic regions specific to RefSeq and Vega, show much higher variant densities than the CCDS.
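The two quantities discussed above, GC content and coverage relative to the CCDS, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the choice to ignore ambiguous bases, and the use of mean per-base depth as the coverage summary are all assumptions made for illustration.

```python
from typing import Sequence


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases among unambiguous A/C/G/T bases (assumed definition)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases to evaluate
    return (seq.count("G") + seq.count("C")) / acgt


def relative_coverage(region_depths: Sequence[float],
                      reference_depths: Sequence[float]) -> float:
    """Mean depth of a subregion divided by mean depth of a reference set
    (for example, CCDS targets). Inputs are hypothetical per-base depths."""
    if not region_depths or not reference_depths:
        raise ValueError("depth lists must be non-empty")
    region_mean = sum(region_depths) / len(region_depths)
    reference_mean = sum(reference_depths) / len(reference_depths)
    if reference_mean == 0:
        raise ValueError("reference mean depth is zero")
    return region_mean / reference_mean
```

Under these assumptions, a subregion whose `relative_coverage` falls below 0.5 would correspond to the "less than 50% relative to the CCDS" case described in the Results.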

Conclusions: We show that regions outside of the CCDS perform less efficiently in capture sequence experiments. Further, we show that the variant density in computationally predicted exons is more than 2.5 times higher than that observed in the CCDS.
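Variant density, defined in the Background as the number of variants per base pair, and the fold comparison against the CCDS can be sketched as follows. The function names are hypothetical, and any counts passed in are the caller's own; no figures from the paper are reproduced here.

```python
def variant_density(n_variants: int, n_bases: int) -> float:
    """Variants per base pair over a set of regions."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    if n_variants < 0:
        raise ValueError("n_variants must be non-negative")
    return n_variants / n_bases


def fold_difference(density: float, baseline_density: float) -> float:
    """Ratio of a region's variant density to a baseline (for example, CCDS)."""
    if baseline_density <= 0:
        raise ValueError("baseline_density must be positive")
    return density / baseline_density
```

With made-up inputs, a region set with 50 variants in 20,000 bp compared against a baseline with 100 variants in 100,000 bp gives a fold difference of about 2.5, which is the scale of difference the Conclusions report for predicted exons.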

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218830
DOI: http://dx.doi.org/10.1186/gb-2011-12-7-r68

Publication Analysis

Top Keywords

capture sequence

predicted exons

consensus coding

coding dna

dna sequence

higher variant

variant densities

coding exons

computationally predicted

refseq vega

Similar Publications

AiGPro: a multi-tasks model for profiling of GPCRs for agonist and antagonist.

J Cheminform

January 2025

School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea.

Rahul Brahma, Sunghyun Moon, Jae-Min Shin, Kwang-Hwi Cho

G protein-coupled receptors (GPCRs) play vital roles in various physiological processes, making them attractive drug discovery targets. Meanwhile, deep learning techniques have revolutionized drug discovery by providing efficient tools for expediting the identification and optimization of ligands. However, existing models for GPCRs often focus on a single target or a small subset of GPCRs, or employ binary classification, which constrains their applicability to high-throughput virtual screening.

scSMD: a deep learning method for accurate clustering of single cells based on auto-encoder.

BMC Bioinformatics

January 2025

Department of Surgery, Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Xiaoxu Cui, Renkai Wu, Yinghao Liu, Peizhan Chen, Qing Chang

Background: Single-cell RNA sequencing (scRNA-seq) has transformed biological research by offering new insights into cellular heterogeneity, developmental processes, and disease mechanisms. As scRNA-seq technology advances, its role in modern biology has become increasingly vital. This study explores the application of deep learning to single-cell data clustering, with a particular focus on managing sparse, high-dimensional data.

Expanding the human gut microbiome atlas of Africa.

Nature

January 2025

Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa.

Dylan G Maghini, Ovokeraye H Oduaran, Luicer A Ingasia Olubayo, Jane A Cook, Natalie Smyth

Population studies provide insights into the interplay between the gut microbiome and geographical, lifestyle, genetic and environmental factors. However, low- and middle-income countries, in which approximately 84% of the world's population lives, are not equitably represented in large-scale gut microbiome research. Here we present the AWI-Gen 2 Microbiome Project, a cross-sectional gut microbiome study sampling 1,801 women from Burkina Faso, Ghana, Kenya and South Africa.

A hybrid deep learning air pollution prediction approach based on neighborhood selection and spatio-temporal attention.

Sci Rep

January 2025

School of Architecture and Urban Planning, Beijing University of Civil Engineering and Architecture, Beijing, 100055, China.

Gang Chen, Shen Chen, Dong Li, Cai Chen

Air pollution is a critical global environmental issue, further exacerbated by rapid industrialization and urbanization. Accurate prediction of air pollutant concentrations is essential for effective pollution prevention and control measures. The complex nature of pollutant data, which is influenced by fluctuating meteorological conditions, diverse pollution sources, and propagation processes, underscores the crucial importance of spatial and temporal feature extraction for accurately predicting air pollutant concentrations.

New reporters for monitoring cellular NMD.

RNA

January 2025

Department of Chemistry and Molecular Biology, University of Gothenburg.

Hanna Alalam, Monika Safhauzer, Per Sunnerhagen

Nonsense-mediated decay (NMD) is a eukaryotic surveillance pathway that controls degradation of cytoplasmic transcripts with aberrant features. NMD-controlled RNA degradation acts to regulate a large fraction of the mRNA population. It has been implicated in cellular responses to infections and environmental stress, as well as in deregulation of tumor-promoting genes.
