Publications by Enrico Seiler

Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries.

Svenja Mehringer, Enrico Seiler, Felix Droop, Mitra Darvish, René Rahn

Genome Biol

May 2023

We present a novel data structure for searching sequences in large databases: the Hierarchical Interleaved Bloom Filter (HIBF). It is extremely fast and space efficient, yet so general that it could serve as the underlying engine for many applications. We show that the HIBF is superior to other state-of-the-art tools in build time, index size, and search time while achieving comparable or better accuracy.

Needle: a fast and space-efficient prefilter for estimating the quantification of very large collections of expression experiments.

Mitra Darvish, Enrico Seiler, Svenja Mehringer, René Rahn, Knut Reinert

Bioinformatics

September 2022

Motivation: The ever-growing size of sequencing data is a major bottleneck in bioinformatics, as advances in hardware development cannot keep up with the data growth. As a result, an enormous amount of data is collected but rarely reused, because it is nearly impossible to find meaningful experiments in the stream of raw data.

Results: As a solution, we propose Needle, a fast and space-efficient index which can be built for thousands of experiments in <2 h and can estimate the quantification of a transcript in these experiments in seconds, thereby outperforming its competitors.

Critical Assessment of Metagenome Interpretation: the second round of challenges.

Fernando Meyer, Adrian Fritz, Zhi-Luo Deng, David Koslicki, Till Robin Lesker, Enrico Seiler

Nat Methods

April 2022

Article Synopsis

Evaluating metagenomic software is crucial for enhancing the interpretation of metagenomes, and the CAMI II challenge focused on this by using complex datasets from numerous genomes and plasmids.
The analysis of 5,002 results from 76 software versions showed significant advancements in assembly, especially with long-read data, although challenges remained with related strains and genome recovery.
Findings indicated that while taxon profilers improved, they struggled with viruses and Archaea, highlighting the need for better reproducibility in clinical pathogen detection and guiding researchers in method selection based on efficiency and performance metrics.

Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences.

Enrico Seiler, Svenja Mehringer, Mitra Darvish, Etienne Turc, Knut Reinert

iScience

July 2021

We present Raptor, a system for approximately searching many queries, such as next-generation sequencing reads or transcripts, in large collections of nucleotide sequences. Raptor uses winnowing minimizers to define a set of representative k-mers, an extension of the interleaved Bloom filters (IBFs) as a set membership data structure, and probabilistic thresholding for minimizers. Our approach allows compression and partitioning of the IBF to enable the effective use of secondary memory.
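
The two ideas named in this abstract, winnowing minimizers and an interleaved Bloom filter, can be illustrated with a toy sketch. This is not Raptor's implementation or API: the names `minimizers`, `ToyIBF`, and `query`, the lexicographic k-mer ordering, the window defined as w consecutive k-mers, the hash function, and the fixed hit fraction (standing in for the probabilistic threshold) are all assumptions chosen for illustration.

```python
import hashlib


def minimizers(seq, k, w):
    """Return the set of smallest k-mers over every window of w consecutive k-mers.

    Assumption: k-mers are ordered lexicographically; real tools typically
    order them by a hash value instead.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()
    if len(kmers) < w:
        return {min(kmers)}
    return {min(kmers[i:i + w]) for i in range(len(kmers) - w + 1)}


class ToyIBF:
    """Toy interleaved Bloom filter: one Bloom filter per bin, stored row-wise.

    Each slot holds an integer bitmask; bit b of a slot belongs to bin b, so a
    single lookup of the hashed slots answers the query for all bins at once.
    """

    def __init__(self, n_bins, n_slots, n_hashes=2):
        self.n_bins = n_bins
        self.n_slots = n_slots
        self.n_hashes = n_hashes
        self.rows = [0] * n_slots

    def _slots(self, item):
        # Derive n_hashes slot indices by salting one hash function.
        for seed in range(self.n_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.n_slots

    def insert(self, item, bin_id):
        for slot in self._slots(item):
            self.rows[slot] |= 1 << bin_id

    def bins_containing(self, item):
        # AND the rows of all hashed slots; surviving bits are candidate bins.
        mask = (1 << self.n_bins) - 1
        for slot in self._slots(item):
            mask &= self.rows[slot]
        return [b for b in range(self.n_bins) if (mask >> b) & 1]


def query(ibf, read, k, w, min_fraction=0.7):
    """Report bins that contain at least min_fraction of the read's minimizers.

    Assumption: a fixed fraction replaces a proper probabilistic threshold.
    """
    mins = minimizers(read, k, w)
    counts = [0] * ibf.n_bins
    for m in mins:
        for b in ibf.bins_containing(m):
            counts[b] += 1
    needed = min_fraction * len(mins)
    return [b for b, c in enumerate(counts) if mins and c >= needed]


# Usage: index two reference sequences in separate bins, then query a read.
references = ["ACGTACGTTGCA", "TTTTGGGGCCCC"]
ibf = ToyIBF(n_bins=len(references), n_slots=1024)
for bin_id, ref in enumerate(references):
    for m in minimizers(ref, k=3, w=2):
        ibf.insert(m, bin_id)

hits = query(ibf, "ACGTACGT", k=3, w=2)
```

Because a Bloom filter never produces false negatives, bin 0 is guaranteed to appear in `hits` for this read; other bins may appear as false positives, which is why a hit threshold is applied at all.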

ganon: precise metagenomics classification against large and up-to-date sets of reference sequences.

Vitor C Piro, Temesgen H Dadi, Enrico Seiler, Knut Reinert, Bernhard Y Renard

Bioinformatics

July 2020

Motivation: The exponential growth of assembled genome sequences greatly benefits metagenomics studies. However, currently available methods struggle to manage the increasing amount of sequences and their frequent updates. Indexing the current RefSeq can take days and hundreds of GB of memory on large servers.

Where did you come from, where did you go: Refining metagenomic analysis tools for horizontal gene transfer characterisation.

Enrico Seiler, Kathrin Trappe, Bernhard Y Renard

PLoS Comput Biol

July 2019

Horizontal gene transfer (HGT) has changed the way we regard evolution. Instead of waiting for the next generation to establish new traits, bacteria in particular can take a shortcut via HGT, which enables them to pass genes from one individual to another, even across species boundaries. The tool Daisy offers the first HGT detection approach based on read mapping, providing evidence complementary to that of existing methods.

DREAM-Yara: an exact read mapper for very large databases with short update time.

Temesgen Hailemariam Dadi, Enrico Siragusa, Vitor C Piro, Andreas Andrusch, Enrico Seiler

Bioinformatics

September 2018

Motivation: Mapping-based approaches have become limited in their application to very large sets of references since computing an FM-index for very large databases (e.g. >10 GB) has become a bottleneck.
