npInv: accurate detection and genotyping of inversions using long read sub-alignment.

BMC Bioinformatics

Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia.

Published: July 2018

Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored.

Result: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using sub-alignment of long read sequencing data. We benchmark npInv against other tools on both simulated and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR-mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near-linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats.

Conclusion: npInv shows high accuracy on both simulated and real data. The results give deeper insight into inversions.
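The idea of using long read sub-alignments to detect and genotype inversions can be illustrated with a minimal sketch. This is not npInv's implementation: the `SubAlignment` record, the `strand_switches` and `genotype` helpers, and the 0.2/0.8 allele-fraction thresholds are all hypothetical choices made for illustration. The sketch assumes that a read spanning an inversion breakpoint splits into segments mapped to opposite strands of the same chromosome, and that a genotype can be roughly called from the fraction of spanning reads that carry this signal.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubAlignment:
    """One aligned segment of a long read (hypothetical record, not npInv's data model)."""
    chrom: str
    ref_start: int   # 0-based start on the reference
    ref_end: int     # end on the reference, exclusive
    read_start: int  # offset of this segment within the read
    strand: str      # '+' or '-'


def strand_switches(subs: List[SubAlignment]) -> List[Tuple[SubAlignment, SubAlignment]]:
    """Return pairs of segments, adjacent in read order, that map to the
    same chromosome on opposite strands (a candidate inversion breakpoint)."""
    ordered = sorted(subs, key=lambda s: s.read_start)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if a.chrom == b.chrom and a.strand != b.strand
    ]


def genotype(n_inverted: int, n_reference: int,
             het_low: float = 0.2, het_high: float = 0.8) -> str:
    """Call a genotype from read counts; the thresholds are illustrative assumptions."""
    total = n_inverted + n_reference
    if total == 0:
        return "./."  # no spanning reads, no call
    frac = n_inverted / total
    if frac < het_low:
        return "0/0"
    if frac > het_high:
        return "1/1"
    return "0/1"


# A read whose middle segment maps to the reverse strand: two strand switches.
read = [
    SubAlignment("chr1", 1000, 3000, 0, "+"),
    SubAlignment("chr1", 3000, 5000, 2000, "-"),
    SubAlignment("chr1", 5000, 7000, 4000, "+"),
]
switches = strand_switches(read)
call = genotype(n_inverted=5, n_reference=5)
```

In this toy read, both junctions of the reverse-strand segment are reported as candidate breakpoints. A real caller would also need to handle repetitive flanking sequence, mapping quality, and breakpoint clustering across reads, none of which is modeled here.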


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6044046
http://dx.doi.org/10.1186/s12859-018-2252-9


