Background: Next-generation sequencing (NGS) approaches are commonly used to identify key regulatory networks that drive transcriptional programs. Although these technologies are frequently used in biological studies, NGS data analysis remains a challenging, time-consuming, and often irreproducible process. Therefore, there is a need for a comprehensive and flexible workflow platform that can accelerate data processing and analysis so more time can be spent on functional studies.

Results: We have developed an integrative, stand-alone workflow platform, named CIPHER, for the systematic analysis of several commonly used NGS datasets including ChIP-seq, RNA-seq, MNase-seq, DNase-seq, GRO-seq, and ATAC-seq data. CIPHER combines various open-source software packages, in-house scripts, and Docker containers to process and analyze single-end and paired-end datasets. CIPHER's pipelines conduct extensive quality and contamination control checks, as well as comprehensive downstream analysis. A typical CIPHER workflow includes: (1) raw sequence evaluation, (2) read trimming and adapter removal, (3) read mapping and quality filtering, (4) visualization track generation, and (5) extensive quality control assessment. Furthermore, CIPHER conducts downstream analyses such as: narrow and broad peak calling, peak annotation, and motif identification for ChIP-seq; differential gene expression analysis for RNA-seq; nucleosome positioning for MNase-seq; DNase hypersensitive site mapping, site annotation, and motif identification for DNase-seq; analysis of nascent transcription from global run-on sequencing (GRO-seq) data; and characterization of chromatin accessibility from ATAC-seq datasets. In addition, CIPHER contains an "analysis" mode that completes complex bioinformatics tasks such as enhancer discovery and provides functions to integrate multiple datasets.
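
The structure described above, five shared stages followed by assay-specific downstream analyses, can be sketched as a simple lookup. This is only an illustration of that ordering and is not CIPHER's actual code or interface: the names `CORE_STAGES`, `DOWNSTREAM`, and `plan_workflow` are hypothetical, and the step labels are taken from the abstract's wording.

```python
from typing import Dict, List

# The five core stages, in the order the abstract lists them.
CORE_STAGES: List[str] = [
    "raw sequence evaluation",
    "read trimming and adapter removal",
    "read mapping and quality filtering",
    "visualization track generation",
    "quality control assessment",
]

# Assay-specific downstream analyses named in the abstract.
# The grouping into a dictionary is an assumption made for illustration.
DOWNSTREAM: Dict[str, List[str]] = {
    "ChIP-seq": [
        "narrow and broad peak calling",
        "peak annotation",
        "motif identification",
    ],
    "RNA-seq": ["differential gene expression analysis"],
    "MNase-seq": ["nucleosome positioning"],
    "DNase-seq": [
        "DNase hypersensitive site mapping",
        "site annotation",
        "motif identification",
    ],
    "GRO-seq": ["nascent transcription analysis"],
    "ATAC-seq": ["chromatin accessibility characterization"],
}


def plan_workflow(assay: str) -> List[str]:
    """Return the ordered step names for one assay type (hypothetical helper)."""
    if assay not in DOWNSTREAM:
        raise ValueError(f"unsupported assay: {assay}")
    # Core stages always run first; downstream analyses depend on the assay.
    return CORE_STAGES + DOWNSTREAM[assay]


if __name__ == "__main__":
    for step_number, step in enumerate(plan_workflow("RNA-seq"), start=1):
        print(step_number, step)
```

Keeping the shared stages separate from the per-assay analyses mirrors how the abstract presents the workflow: one common preprocessing path, then a branch chosen by data type.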

Conclusions: Using public and simulated data, we demonstrate that CIPHER is an efficient and comprehensive workflow platform that can analyze several NGS datasets commonly used in genome biology studies. Additionally, CIPHER's integrative "analysis" mode allows researchers to extract important biological information from the combined analysis of multiple datasets.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5549294
DOI: http://dx.doi.org/10.1186/s12859-017-1770-1
