Correcting Illumina data.

Brief Bioinform

Published: July 2015

Next-generation sequencing technologies have revolutionized the ways in which genetic information is obtained and have opened the door to many essential applications in the biomedical sciences. Hundreds of gigabytes of data are being produced, and all applications are affected by the errors in the data. Many programs have been designed to correct these errors, most of them targeting the data produced by the dominant technology, Illumina. We present a thorough comparison of these programs. Both HiSeq and MiSeq types of Illumina data are analyzed, and correction performance is evaluated as the gain in depth and breadth of coverage, as given by correct reads and k-mers. Time and memory requirements, scalability and parallelism are considered as well. Practical guidelines are provided for the effective use of these tools. We also evaluate the efficiency of the current state-of-the-art programs for correcting Illumina data and provide research directions for further improvement.
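The abstract does not spell out how depth and breadth of coverage are computed, so the following is only a minimal sketch under stated assumptions: a k-mer counts as "correct" if it occurs in a known reference sequence, depth is the mean number of correct read k-mers per distinct reference k-mer, breadth is the fraction of distinct reference k-mers seen at least once, and gain is the plain difference between the values after and before correction. The function names, the toy sequences and these definitions are hypothetical illustrations, not the paper's exact formulas.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, for simplicity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_coverage(reads, reference, k):
    """Return (depth, breadth) of the reference as covered by correct k-mers.

    Assumption: a read k-mer is "correct" if it occurs in the reference.
    """
    ref_kmers = set(kmers(reference, k))
    counts = Counter(
        km for read in reads for km in kmers(read, k) if km in ref_kmers
    )
    depth = sum(counts.values()) / len(ref_kmers)   # mean copies per reference k-mer
    breadth = len(counts) / len(ref_kmers)          # fraction of reference k-mers seen
    return depth, breadth


def gain(before, after):
    """Assumed definition: absolute improvement after correction."""
    return after - before


# Toy data: the second raw read carries one substitution error (T -> A).
reference = "ACGTACGGTCA"
raw_reads = ["ACGTACG", "CGGACA"]
corrected_reads = ["ACGTACG", "CGGTCA"]

depth_before, breadth_before = kmer_coverage(raw_reads, reference, 4)
depth_after, breadth_after = kmer_coverage(corrected_reads, reference, 4)

print("depth gain:", gain(depth_before, depth_after))
print("breadth gain:", gain(breadth_before, breadth_after))
```

In this toy case the single corrected base restores three reference k-mers, so both measures rise. A real evaluation would also need to handle reverse complements and reads mapped to their true genomic origin, which this sketch ignores.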

DOI: http://dx.doi.org/10.1093/bib/bbu029
