Publications by Gianluca Roscigno

Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms.

Umberto Ferraro Petrillo, Gianluca Roscigno, Giuseppe Cattaneo, Raffaele Giancarlo

Bioinformatics

June 2018

Motivation: Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are also becoming very valuable for epigenomic and metagenomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e.

FASTdoop: a versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications.

Umberto Ferraro Petrillo, Gianluca Roscigno, Giuseppe Cattaneo, Raffaele Giancarlo

Bioinformatics

May 2017

Summary: MapReduce Hadoop bioinformatics applications require special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats, such as FASTA or BAM. Moreover, developing these routines is not easy, both because of the diversity of these formats and because of the need to efficiently manage sequence datasets that may contain billions of characters.
