External memory BWT and LCP computation for sequence collections with applications.

Algorithms Mol Biol

Institute of Computing, University of Campinas, Av. Albert Einstein, 1251, 13083-852 Campinas, Brazil.

Published: March 2019

Background: Sequencing technologies produce ever larger collections of biosequences that have to be stored in compressed indices supporting fast search operations. Many compressed indices are based on the Burrows-Wheeler Transform (BWT) and the longest common prefix (LCP) array. Because of the sheer size of the input, it is important to build these data structures in external memory, using the available RAM in the best possible way.
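As a concrete illustration of the two data structures, the following minimal Python sketch computes the BWT and LCP array of a small collection by explicitly sorting all suffixes. It is a naive in-memory construction for illustration only, not the algorithm of the paper. The function name `bwt_lcp` and the conventions used (each string gets its own end marker, printed as `$`, and markers are ordered by string index) are assumptions made here.

```python
def bwt_lcp(collection):
    """Naive BWT and LCP array of a string collection (illustration only)."""
    rows = []
    for i, s in enumerate(collection):
        # Encode ordinary characters as (1, c) and the end marker of
        # string i as (0, i): markers sort below every character and
        # are ordered by string index.
        text = [(1, c) for c in s] + [(0, i)]
        for p in range(len(text)):
            # BWT symbol: the character preceding the suffix; the suffix
            # that is the whole string is preceded (circularly) by its marker.
            prev = s[p - 1] if p > 0 else "$"
            rows.append((text[p:], prev))

    # Markers are unique, so no suffix is a prefix of another.
    rows.sort(key=lambda r: r[0])
    bwt = "".join(prev for _, prev in rows)

    # LCP[k] = length of the common prefix of the k-th and (k-1)-th suffixes.
    lcp = [0] * len(rows)
    for k in range(1, len(rows)):
        a, b = rows[k - 1][0], rows[k][0]
        h = 0
        while h < min(len(a), len(b)) and a[h] == b[h]:
            h += 1
        lcp[k] = h
    return bwt, lcp


if __name__ == "__main__":
    print(bwt_lcp(["banana"]))    # ('annb$aa', [0, 0, 1, 3, 0, 0, 2])
    print(bwt_lcp(["ab", "ab"]))  # ('bb$$aa', [0, 0, 0, 2, 0, 1])
```

Sorting every suffix explicitly takes quadratic space in the worst case, which is why a construction of this kind does not scale to the large collections discussed above.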

Results: We propose a space-efficient algorithm to compute the BWT and LCP array for a collection of sequences in the external or semi-external memory setting. Our algorithm splits the input collection into subcollections small enough that their BWTs can be computed in RAM using an optimal linear-time algorithm. Next, it merges the partial BWTs in external or semi-external memory, computing the LCP values in the process. Our algorithm can be modified to output two additional arrays that, combined with the BWT and LCP array, provide simple, scan-based, external memory algorithms for three well-known problems in bioinformatics: the computation of maximal repeats, the computation of all-pairs suffix-prefix overlaps, and the construction of succinct de Bruijn graphs.
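The split-and-merge idea can be sketched in a few lines of Python. This is a simplified in-memory illustration under stated assumptions, not the authors' implementation: `naive_bwt` stands in for the linear-time in-RAM construction, `merge_bwts` merges two partial BWTs by repeatedly refining a 0/1 interleave vector (in the style of the Holt-McMillan merging procedure), the LCP computation is omitted, and the input strings are assumed not to contain `$`. All function names are hypothetical.

```python
from collections import Counter


def naive_bwt(collection):
    """Stand-in for the in-RAM step: BWT of a subcollection by suffix sorting."""
    rows = []
    for i, s in enumerate(collection):
        # (0, i) is the end marker of string i; it sorts below all characters.
        text = [(1, c) for c in s] + [(0, i)]
        for p in range(len(text)):
            rows.append((text[p:], s[p - 1] if p else "$"))
    rows.sort(key=lambda r: r[0])
    return "".join(prev for _, prev in rows)


def merge_bwts(bwt0, bwt1):
    """Merge two partial BWTs; markers of bwt0 are assumed to sort first."""
    n0, n1 = len(bwt0), len(bwt1)
    m0, m1 = bwt0.count("$"), bwt1.count("$")
    parts = (bwt0, bwt1)

    # Start of each character bucket in the merged suffix order ('$' first).
    counts = Counter(bwt0 + bwt1)
    starts, total = {}, 0
    for c in sorted(counts, key=lambda ch: (ch != "$", ch)):
        starts[c] = total
        total += counts[c]

    # z[k] tells which partial BWT supplies the k-th symbol of the merge.
    z = [0] * n0 + [1] * n1
    while True:
        # Suffixes that start with a marker keep a fixed order.
        nxt = [0] * m0 + [1] * m1 + [None] * (n0 + n1 - m0 - m1)
        free = dict(starts)
        pos = [0, 0]
        for b in z:  # one sequential scan of z and of both BWTs
            c = parts[b][pos[b]]
            pos[b] += 1
            if c != "$":
                nxt[free[c]] = b
                free[c] += 1
        if nxt == z:  # fixed point: the interleave is final
            break
        z = nxt

    pos, merged = [0, 0], []
    for b in z:
        merged.append(parts[b][pos[b]])
        pos[b] += 1
    return "".join(merged)


if __name__ == "__main__":
    whole = ["banana", "ananas", "bandana"]
    partial0, partial1 = naive_bwt(whole[:2]), naive_bwt(whole[2:])
    print(merge_bwts(partial0, partial1) == naive_bwt(whole))
```

Each refinement round reads the interleave vector and both partial BWTs strictly left to right, which is the access pattern that makes this style of merging suitable for external memory.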

Conclusions: We prove that our algorithm performs O(n maxlcp) sequential I/Os, where n is the total length of the collection and maxlcp is the maximum LCP value. The experimental results show that our algorithm is only slightly slower than the state of the art for short sequences, but it is up to 40 times faster for longer sequences or when the available RAM is at least equal to the size of the input.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6408864
DOI: http://dx.doi.org/10.1186/s13015-019-0140-0
