Boosting variant-calling performance with multi-platform sequencing data using Clair3-MP.

BMC Bioinformatics

Department of Computer Science, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR, China.

Published: August 2023

Background: With the continuous advances in third-generation sequencing technology and the increasing affordability of next-generation sequencing technology, sequencing data from different sequencing technology platforms is becoming more common. While numerous benchmarking studies have been conducted to compare variant-calling performance across different platforms and approaches, little attention has been paid to the potential of leveraging the strengths of different platforms to optimize overall performance, especially integrating Oxford Nanopore and Illumina sequencing data.

Results: We investigated the impact of multi-platform data on the performance of variant calling through carefully designed experiments with a deep learning-based variant caller named Clair3-MP (Multi-Platform). We not only demonstrated the capability of ONT-Illumina data for improved variant calling, but also identified the optimal scenarios for utilizing ONT-Illumina data. In addition, we revealed that the improvement in variant calling using ONT-Illumina data comes from an improvement in difficult genomic regions, such as large low-complexity regions and segmental and collapsed duplication regions. Moreover, Clair3-MP can incorporate reference genome stratification information to achieve a small but measurable improvement in variant calling. Clair3-MP is accessible as an open-source project at https://github.com/HKU-BAL/Clair3-MP.

Conclusions: These insights have important implications for researchers and practitioners alike, providing valuable guidance for improving the reliability and efficiency of genomic analysis in diverse applications.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10401749
DOI: http://dx.doi.org/10.1186/s12859-023-05434-6

Similar Publications

Goats typically have double coats, with the outermost coarse hairs providing protection against mechanical and radiation damage. While much attention has been paid to cashmere due to its status as a high-end textile material, there is limited information available on coarse hair. This study aimed to identify genomic variants, such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels), associated with coarse hair diameter using a genome-wide association study (GWAS).

This paper proposes a detailed process for structural variant (SV) calling that permits a data-driven assessment of multiple SV callers using both genome assemblies and long reads. The process is implemented as a software pipeline named Structural Variant - Jaccard Index Measure, or SV-JIM, using the Snakemake [20] workflow management system. Like most state-of-the-art SV callers, SV-JIM detects the presence of variations between pairs of genomes, but it streamlines the numerous SV calling stages into a single process for user convenience and evaluates the multiple SV sets produced using the Jaccard index measure to identify those with the highest consistency among the included SV callers.
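
The Jaccard index measure mentioned above can be illustrated with a minimal sketch. This is not SV-JIM's implementation: the function names, the caller names, and the representation of an SV as an exactly matched `(chromosome, position, type)` tuple are all assumptions made for illustration. Real pipelines typically match SVs with positional tolerance rather than exact equality.

```python
def jaccard_index(calls_a, calls_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of SV calls."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        # Assumption: two empty call sets are scored 0.0 (no evidence of agreement).
        return 0.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(call_sets):
    """For each caller, average its Jaccard index against every other caller.

    `call_sets` maps a (hypothetical) caller name to its set of SV calls.
    """
    scores = {}
    for name, calls in call_sets.items():
        others = [
            jaccard_index(calls, other_calls)
            for other_name, other_calls in call_sets.items()
            if other_name != name
        ]
        scores[name] = sum(others) / len(others) if others else 0.0
    return scores


# Hypothetical call sets; each SV is an exact-match (chrom, pos, type) tuple.
example_calls = {
    "caller_a": {("chr1", 100, "DEL"), ("chr1", 500, "INS"), ("chr2", 900, "DEL")},
    "caller_b": {("chr1", 100, "DEL"), ("chr1", 500, "INS")},
    "caller_c": {("chr1", 100, "DEL"), ("chr3", 50, "DUP")},
}

example_scores = mean_pairwise_jaccard(example_calls)
most_consistent = max(example_scores, key=example_scores.get)
print(most_consistent, round(example_scores[most_consistent], 3))
```

Averaging the pairwise index per caller is only one way to rank call sets by consistency; it is used here for brevity and is not taken from the paper.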

Background: This study aimed to develop and validate a targeted next-generation sequencing (NGS) panel along with a data analysis algorithm capable of detecting single-nucleotide variants (SNVs) and copy number variations (CNVs) within the beta-globin gene cluster. The aim was to reduce the turnaround time in conventional genotyping methods and provide a rapid and comprehensive solution for prenatal diagnosis, carrier screening, and genotyping of β-thalassemia patients.

Methods And Results: We devised a targeted NGS panel spanning an 80.

Protocol for mitochondrial variant enrichment from single-cell RNA sequencing using MAESTER.

STAR Protoc

January 2025

Division of Hematology, Brigham and Women's Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA.

Single-cell RNA sequencing (scRNA-seq) enables detailed characterization of cell states but often lacks insights into tissue clonal structures. Here, we present a protocol to probe cell states and clonal information simultaneously by enriching mitochondrial DNA (mtDNA) variants from 3'-barcoded full-length cDNA. We describe steps for input library preparation, mtDNA enrichment, PCR product cleanup, and paired-end sequencing.

Background: Pacific Biosciences (PacBio) circular consensus sequencing (CCS), also known as high fidelity (HiFi) technology, has revolutionized modern genomics by producing long (10+ kb) and highly accurate reads. This is achieved by sequencing circularized DNA molecules multiple times and combining them into a consensus sequence. Currently, the accuracy and quality value estimation provided by HiFi technology are more than sufficient for applications such as genome assembly and germline variant calling.
