PFP-FM: An Accelerated FM-index.

Res Sq

Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada.

Published: October 2023

FM-indexes are a crucial data structure in DNA alignment, but searching with them usually takes at least one random access per character in the query pattern. Ferragina and Fischer [1] observed in 2007 that word-based indexes often use fewer random accesses than character-based indexes, and thus support faster searches. Since DNA lacks natural word boundaries, however, it must be parsed into phrases before word-based FM-indexing can be applied. Last year, Deng et al. [2] proposed parsing genomic data by induced suffix sorting, and showed that the resulting word-based FM-indexes support faster counting queries than standard FM-indexes when patterns are a few thousand characters or longer. In this paper we show that using prefix-free parsing (which takes parameters that let us tune the average length of the phrases) instead of induced suffix sorting gives a significant speedup for patterns of only a few hundred characters. We implement our method and demonstrate that it is between 3 and 18 times faster than competing methods on queries to GRCh38, and is consistently faster on queries made to 25,000, 50,000 and 100,000 SARS-CoV-2 genomes. Hence, it seems our method accelerates count queries relative to all state-of-the-art methods, at the cost of a minor increase in memory. The source code for PFP-FM is available at https://github.com/marco-oliva/afm.
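To make the "one random access per character" cost concrete, the following is a minimal sketch of a counting query on a standard character-based FM-index using backward search. It is not the PFP-FM implementation: the function names `build_fm_index` and `count`, the naive suffix sorting, and the uncompressed occurrence table are assumptions chosen for readability, and real indexes use compressed rank structures instead.

```python
def build_fm_index(text):
    """Build a naive, uncompressed FM-index for text (illustrative only)."""
    text += "$"  # sentinel, assumed absent from text and smaller than all its characters
    # Suffix array by direct sorting of suffixes; fine for toy inputs only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around at i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)


def count(index, pattern):
    """Return (number of occurrences, number of backward-search steps)."""
    C, occ, n = index
    lo, hi = 0, n  # half-open range of suffixes matching the processed part
    steps = 0
    for ch in reversed(pattern):  # the pattern is consumed right to left
        if ch not in C:
            return 0, steps
        # One step per character: two lookups into the occurrence table.
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        steps += 1
        if lo >= hi:
            return 0, steps
    return hi - lo, steps


index = build_fm_index("GATTACAGATTACA")
occurrences, steps = count(index, "GATTACA")
print(occurrences, steps)
```

In this sketch the step counter grows by one for every pattern character. A word-based index in the sense described above would instead take one step per phrase of the parsed pattern, which is where the reduction in random accesses comes from.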


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635359
DOI: http://dx.doi.org/10.21203/rs.3.rs-3487536/v1

