Fast gap-affine pairwise alignment using the wavefront algorithm.

Bioinformatics

Departament d'Arquitectura de Computadors i Sistemes Operatius, Universitat Autònoma de Barcelona, Barcelona 08193, Spain.

Published: May 2021

Motivation: Pairwise alignment of sequences is a fundamental method in modern molecular biology, implemented within multiple bioinformatics tools and libraries. Current advances in sequencing technologies press for the development of faster pairwise alignment algorithms that can scale with increasing read lengths and production yields.

Results: In this article, we present the wavefront alignment algorithm (WFA), an exact gap-affine algorithm that takes advantage of homologous regions between the sequences to accelerate the alignment process. As opposed to traditional dynamic programming algorithms that run in quadratic time, the WFA runs in time O(ns), proportional to the read length n and the alignment score s, using O(s^2) memory. Furthermore, our algorithm exhibits simple data dependencies that can be easily vectorized, even by the automatic features of modern compilers, for different architectures, without the need to adapt the code. We evaluate the performance of our algorithm, together with other state-of-the-art implementations. As a result, we demonstrate that the WFA runs 20-300× faster than other methods aligning short Illumina-like sequences, and 10-100× faster using long noisy reads like those produced by Oxford Nanopore Technologies.
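To make the wavefront idea concrete, the following is a minimal score-only sketch in Python, not the wavefront-aligner library's code or API. The function name `wfa_affine_score`, the helper `_best`, and the default penalties (mismatch x=4, gap open o=6, gap extend e=2, match 0) are assumptions chosen for illustration. For each score s it stores, per diagonal k, the furthest text offset reachable, and it extends matching runs for free; this is why similar sequences finish after few wavefronts.

```python
def _best(*cands):
    """Return the largest candidate offset, ignoring missing (None) ones."""
    vals = [c for c in cands if c is not None]
    return max(vals) if vals else None


def wfa_affine_score(pattern, text, x=4, o=6, e=2):
    """Gap-affine alignment score (lower is better) via wavefronts.

    Illustrative sketch: penalties and names are assumptions.
    Diagonal k = h - v, where h indexes text and v indexes pattern;
    each wavefront maps k -> furthest offset h reached with score s.
    """
    n, m = len(pattern), len(text)
    k_end = m - n                      # diagonal holding the end cell (n, m)
    M = {0: {0: 0}}                    # wavefronts ending in match/mismatch
    I, D = {}, {}                      # wavefronts ending in insertion/deletion
    s = 0
    while True:
        # Extend: slide along each diagonal while characters match (cost 0).
        wf = M.get(s, {})
        for k in list(wf):
            h = wf[k]
            v = h - k
            while v < n and h < m and pattern[v] == text[h]:
                v += 1
                h += 1
            wf[k] = h
        if wf.get(k_end) == m:         # reached the end of both sequences
            return s

        # Next: build the wavefronts for score s + 1 from earlier scores.
        s += 1
        m_open = M.get(s - o - e, {})  # source for opening a gap
        i_ext = I.get(s - e, {})       # source for extending an insertion
        d_ext = D.get(s - e, {})       # source for extending a deletion
        m_sub = M.get(s - x, {})       # source for a mismatch
        ks = set(m_sub)
        for k in m_open:
            ks.update((k - 1, k + 1))
        ks.update(k + 1 for k in i_ext)
        ks.update(k - 1 for k in d_ext)

        Is, Ds, Ms = {}, {}, {}
        for k in ks:
            # Insertion consumes one text character: comes from diagonal k-1.
            ins = _best(m_open.get(k - 1), i_ext.get(k - 1))
            if ins is not None:
                ins += 1
                if ins > m:
                    ins = None
            # Deletion consumes one pattern character: comes from diagonal k+1.
            dele = _best(m_open.get(k + 1), d_ext.get(k + 1))
            if dele is not None and dele - k > n:
                dele = None
            # Mismatch consumes one character of each on the same diagonal.
            sub = m_sub.get(k)
            if sub is not None:
                sub += 1
                if sub > m or sub - k > n:
                    sub = None
            if ins is not None:
                Is[k] = ins
            if dele is not None:
                Ds[k] = dele
            best = _best(ins, dele, sub)
            if best is not None:
                Ms[k] = best
        M[s], I[s], D[s] = Ms, Is, Ds
```

Under these assumed penalties, `wfa_affine_score("ACGT", "ACGT")` returns 0 after a single extension step, while a single inserted base costs o + e = 8. The sketch keeps every wavefront in memory, which mirrors the O(s^2) memory bound stated above; recovering the alignment itself would require a backtrace over the stored wavefronts, which is omitted here.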

Availability and Implementation: The WFA is implemented within the wavefront-aligner library, which is publicly available at https://github.com/smarco/WFA.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8355039
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa777
