RawHash2: mapping raw nanopore signals using hash-based seeding and adaptive quantization.

Bioinformatics

Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland.

Published: August 2024

Summary: Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including more sensitive quantization and chaining algorithms, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and the POD5 and SLOW5 file formats. Compared to RawHash, RawHash2 provides better F1 accuracy (on average by 10.57% and up to 20.25%) and better throughput (on average by 4.0× and up to 9.9×).
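The ideas named in the summary (quantization, hash-based seeds, minimizers, and a frequency filter) can be illustrated with a toy sketch. This is not RawHash2's implementation: the function names, the parameters (`n_buckets`, `k`, `w`, `max_occ`), and the fixed-width uniform quantizer are all assumptions made for illustration, and the uniform quantizer stands in for the adaptive quantization the summary mentions.

```python
from collections import Counter

def quantize(events, n_buckets=16, lo=-3.0, hi=3.0):
    """Map each (assumed normalized) event value to a bucket in [0, n_buckets).

    Uniform fixed-width buckets: a simplification, not adaptive quantization.
    """
    width = (hi - lo) / n_buckets
    out = []
    for v in events:
        clipped = min(max(v, lo), hi)
        out.append(min(int((clipped - lo) / width), n_buckets - 1))
    return out

def seed_hashes(buckets, k=4, n_buckets=16):
    """Pack every run of k consecutive buckets into one integer seed.

    Returns (hash, start_position) pairs.
    """
    seeds = []
    for i in range(len(buckets) - k + 1):
        h = 0
        for b in buckets[i:i + k]:
            h = h * n_buckets + b
        seeds.append((h, i))
    return seeds

def minimizers(seeds, w=3):
    """Keep the smallest seed of each window of w consecutive seeds."""
    picked = []
    for i in range(len(seeds) - w + 1):
        m = min(seeds[i:i + w])  # ties broken by leftmost position
        if not picked or picked[-1] != m:
            picked.append(m)
    return picked

def frequency_filter(seeds, max_occ=2):
    """Drop seeds whose hash occurs more than max_occ times (ambiguous hits)."""
    counts = Counter(h for h, _ in seeds)
    return [(h, p) for h, p in seeds if counts[h] <= max_occ]
```

In this sketch, the same pipeline would be applied to both the reference-derived signal and the read signal, so that matching seed hashes indicate candidate mapping positions; chaining and the weighted mapping decision are not modeled here.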

Availability And Implementation: RawHash2 is available at https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully reproduce our results on our GitHub page.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11333567
DOI: http://dx.doi.org/10.1093/bioinformatics/btae478

Similar Publications

Nanopore Decoding with Speed and Versatility for Data Storage.

Bioinformatics

January 2025

Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, USA.

Motivation: As nanopore technology reaches ever higher throughput and accuracy, it becomes an increasingly viable candidate for reading out DNA data storage. Nanopore sequencing offers considerable flexibility by allowing long reads, real-time signal analysis, and the ability to read both DNA and RNA. We need flexible and efficient designs that match nanopore's capabilities, but relatively few designs have been explored and many have significant inefficiency in read density, error rate, or compute time.


(previously known as ) infection is not common in Hong Kong. is a fish-borne cestode parasite that infects humans after consuming raw or insufficiently cooked fish containing plerocercoids. We reported a case of infection in a 40-year-old woman who presented with a complaint of epigastric pain and diarrhea.

Article Synopsis
  • The study of taxonomic composition has shifted from traditional methods to advanced DNA sequencing techniques, particularly metabarcoding, which uses targeted genome portions for high-throughput sequencing.
  • Recent innovations in Oxford Nanopore Technologies have made sequencing more accessible and effective while presenting specific errors and a need for refined bioinformatics tools to handle long-read data.
  • PRONAME, a new open-source pipeline designed for Nanopore data, enhances sequence accuracy and supports custom database integration, achieving over 99.5% accuracy in tests, thus providing a reliable method for analyzing complex biological communities.
Article Synopsis
  • Researchers developed a radiative cooling textile (PAC@T) inspired by flamingo feathers, using polyacrylonitrile and alumina particles to enhance cooling and comfort.
  • PAC@T achieves high solar reflectance (95%) and mid-infrared emissivity (91.8%), resulting in effective cooling that is 6.1°C cooler than traditional textiles.
  • The textile is made from common materials and offers advantages like durability and energy-free operation, posing significant potential for future industrial applications in personal thermoregulation.

The great variety of antimicrobial resistance (AMR) profiles among tuberculosis (TB) patients necessitates a comprehensive detection method. This study developed culture-independent, long amplicon-based targeted next-generation sequencing (tNGS) methods for predicting AMR across 16 drugs within the Mycobacterium tuberculosis complex (MTBC). Multiplex PCR amplification was employed to enrich 20 gene regions, with sequencing performed on either the Oxford Nanopore Technologies (ONT) or Illumina platforms.
