Publications by Umberto Ferraro Petrillo

Data-Driven Thyroglobulin Cutoffs for Low- and Intermediate-Risk Thyroid Cancer Follow-Up: ITCO Real-World Analysis.

Giorgio Grani, Silvia D'Elia, Efisio Puxeddu, Silvia Morelli, Emanuela Arvat, Umberto Ferraro Petrillo

J Clin Endocrinol Metab

August 2024

Context: The utility of thyroglobulin (Tg) in the follow-up of differentiated thyroid cancer (DTC) patients has been well-documented. Although third-generation immunoassays have improved accuracy, limitations persist (interfering anti-Tg antibodies and measurement variability). Evolving treatment strategies require a reevaluation of Tg thresholds for optimal patient management.

Ten quick tips for bioinformatics analyses using an Apache Spark distributed computing environment.

Davide Chicco, Umberto Ferraro Petrillo, Giuseppe Cattaneo

PLoS Comput Biol

July 2023

Some scientific studies involve huge amounts of bioinformatics data that cannot be analyzed on personal computers usually employed by researchers for day-to-day activities but rather necessitate effective computational infrastructures that can work in a distributed way. For this purpose, distributed computing systems have become useful tools to analyze large amounts of bioinformatics data and to generate relevant results on virtual environments, where software can be executed for hours or even days without affecting the personal computer or laptop of a researcher. Even if distributed computing resources have become pivotal in multiple bioinformatics laboratories, often researchers and students use them in the wrong ways, making mistakes that can cause the distributed computers to underperform or that can even generate wrong outcomes.

DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks.

Lorenzo Di Rocco, Umberto Ferraro Petrillo, Simona E Rombo

BMC Bioinformatics

November 2022

Background: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed in the literature for their analysis, based on their modeling through different types of biological networks, several problems still remain unsolved when the problem turns on a large scale.

Results: We propose DIAMIN, that is, a high-level software library to facilitate the development of applications for the efficient analysis of large-scale molecular interaction networks.

Correction to: FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy.

Umberto Ferraro Petrillo, Francesco Palini, Giuseppe Cattaneo, Raffaele Giancarlo

BMC Bioinformatics

February 2022

The power of word-frequency-based alignment-free functions: a comprehensive large-scale experimental analysis.

Giuseppe Cattaneo, Umberto Ferraro Petrillo, Raffaele Giancarlo, Francesco Palini, Chiara Romualdi

Bioinformatics

January 2022

Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.

Minimal Extrathyroidal Extension in Predicting 1-Year Outcomes: A Longitudinal Multicenter Study of Low-to-Intermediate-Risk Papillary Thyroid Carcinoma (ITCO#4).

Raffaella Forleo, Giorgio Grani, Marco Alfò, Valentina Zilioli, Raffaele Giubbini, Umberto Ferraro Petrillo

Thyroid

December 2021

The role of minimal extrathyroidal extension (mETE) as a risk factor for persistent papillary thyroid carcinoma (PTC) is still debated. The aims of this study were to assess the clinical impact of mETE as a predictor of worse initial treatment response in PTC patients and to verify the impact of radioiodine therapy after surgery in patients with mETE. We reviewed all records in the Italian Thyroid Cancer Observatory database and selected 2237 consecutive patients with PTC who satisfied the inclusion criteria (PTC with no lymph node metastases and at least 1 year of follow-up).

FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy.

Umberto Ferraro Petrillo, Francesco Palini, Giuseppe Cattaneo, Raffaele Giancarlo

BMC Bioinformatics

March 2021

Background: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop.

Alignment-free Genomic Analysis via a Big Data Spark Platform.

Umberto Ferraro Petrillo, Francesco Palini, Giuseppe Cattaneo, Raffaele Giancarlo

Bioinformatics

July 2021

Motivation: Alignment-free distance and similarity functions (AF functions, for short) are a well-established alternative to pairwise and multiple sequence alignments for many genomic, metagenomic and epigenomic tasks. Due to data-intensive applications, the computation of AF functions is a Big Data problem, with the recent literature indicating that the development of fast and scalable algorithms computing AF functions is a high-priority task. Somewhat surprisingly, despite the increasing popularity of Big Data technologies in computational biology, the development of a Big Data platform for those tasks has not been pursued, possibly due to its complexity.

Rank-Similarity Measures for Comparing Gene Prioritizations: A Case Study in Autism.

Concettina Guerra, Sarang Joshi, Yinquan Lu, Francesco Palini, Umberto Ferraro Petrillo

J Comput Biol

March 2021

We discuss the challenge of comparing three gene prioritization methods: network propagation, integer linear programming rank aggregation (RA), and statistical RA. These methods are based on different biological categories and estimate disease-gene association. Previously proposed comparison schemes are based on three measures of performance: receiver operating curve, area under the curve, and median rank ratio.

Real-World Performance of the American Thyroid Association Risk Estimates in Predicting 1-Year Differentiated Thyroid Cancer Outcomes: A Prospective Multicenter Study of 2000 Patients.

Giorgio Grani, Maria Chiara Zatelli, Marco Alfò, Teresa Montesano, Massimo Torlontano, Umberto Ferraro Petrillo

Thyroid

February 2021

Article Synopsis

The study assesses the effectiveness of the 2015 American Thyroid Association (ATA) risk stratification system in predicting outcomes for patients with differentiated thyroid cancer (DTC) one year post-treatment.
It involved a review of data from 2,071 patients across 40 treatment centers, classifying risk levels as low, intermediate, or high based on the ATA guidelines.
Findings indicated that the initial ATA risk classification was a strong predictor of persistent disease, with the center where treatment occurred having little impact on these predictions.

Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics.

Umberto Ferraro Petrillo, Mara Sorella, Giuseppe Cattaneo, Raffaele Giancarlo, Simona E Rombo

BMC Bioinformatics

April 2019

Background: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g.

Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms.

Umberto Ferraro Petrillo, Gianluca Roscigno, Giuseppe Cattaneo, Raffaele Giancarlo

Bioinformatics

June 2018

Motivation: Information theoretic and compositional/linguistic analysis of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e.

FASTdoop: a versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications.

Umberto Ferraro Petrillo, Gianluca Roscigno, Giuseppe Cattaneo, Raffaele Giancarlo

Bioinformatics

May 2017

Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need for managing efficiently sequence datasets that may count up to billions of characters.
