Automating data analysis pipelines is key to ensuring reproducible results, especially when dealing with large volumes of data. Here we assembled automated pipelines for the analysis of high-throughput sequencing (HTS) data originating from RNA-Seq, ChIP-Seq and germline variant calling experiments. We implemented these workflows in the Common Workflow Language (CWL) and evaluated their performance by: i) reproducing the results of two previously published studies on chronic lymphocytic leukemia (CLL), and ii) analyzing whole-genome sequencing data from four Genome in a Bottle Consortium (GIAB) samples and comparing the detected variants against their respective gold-standard truth sets. We demonstrated that the CWL-implemented workflows achieved high accuracy in reproducing previously published results, discovering significant biomarkers and detecting germline SNP and small INDEL variants. CWL pipelines are reproducible and reusable; combined with containerization, they can overcome issues of software incompatibility and laborious configuration requirements. In addition, they are flexible and can be used immediately or adapted to the specific needs of an experiment or study. The CWL-based workflows developed in this study, along with version information for all software tools, are publicly available on GitHub (https://github.com/BiodataAnalysisGroup/CWL_HTS_pipelines) under the MIT License. They are suitable for the analysis of short-read (such as Illumina-based) data and constitute an open resource that can facilitate automation, reproducibility and cross-platform compatibility for standard bioinformatic analyses.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10662043 | PMC |
http://dx.doi.org/10.3389/fbinf.2023.1275593 | DOI Listing |
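The abstract describes comparing detected germline variants against GIAB truth sets. As a rough illustration of what such a comparison measures, the following minimal Python sketch computes precision, recall and F1 from two sets of variants. It is not the authors' benchmarking procedure: the `Variant` type, the `benchmark` function and the example records are all hypothetical, and variants are matched by exact (chromosome, position, ref, alt) identity, which is a simplifying assumption. Dedicated benchmarking tools typically also normalize variant representation and restrict the comparison to high-confidence regions.

```python
from typing import NamedTuple, Set, Dict


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a full VCF entry)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def benchmark(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare called variants to a truth set by exact match.

    Assumption: two variants are the same only if chromosome, position,
    reference allele and alternate allele are all identical.
    """
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    called = {
        Variant("chr1", 100, "A", "G"),
        Variant("chr1", 200, "C", "T"),
        Variant("chr2", 50, "G", "GA"),
    }
    truth = {
        Variant("chr1", 100, "A", "G"),
        Variant("chr2", 50, "G", "GA"),
        Variant("chr2", 75, "T", "C"),
    }
    print(benchmark(called, truth))
```

In practice the two sets would be parsed from the pipeline's output VCF and the truth-set VCF; the set-based comparison is kept here only to make the metrics explicit.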
Phytomedicine
January 2025
Department of Clinical Pharmacy, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China.
Background: Although recent progress provides mechanistic insights into diabetic nephropathy (DN), effective treatments remain scarce. DN, characterized by proteinuria and a progressive decline in renal function, primarily arises from podocyte injury, which impairs the glomerular filtration barrier. Wogonoside, a bioactive compound from the traditional Chinese herb Scutellaria baicalensis, has not been explored for its role in DN.
Int Immunopharmacol
January 2025
Department of Urology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong 510515, PR China.
Background: Bladder cancer (BCa), particularly muscle-invasive bladder cancer (MIBC), is associated with poor prognosis, partly because of immune evasion driven by M2 tumor-associated macrophages (TAMs). Understanding the regulatory mechanisms of M2 macrophage polarization via PRKN-mediated mitophagy and histone lactylation (H3K18la) is crucial for improving treatment strategies.
Methods: A single-cell atlas from 46 human BCa samples was constructed to identify macrophage subpopulations.
Epilepsy Res
January 2025
Institute of Neurobiology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, 76 West Yanta Road, Xi'an City 710061, China; Institute of Neuroscience, Translational Medicine Institute, Xi'an Jiaotong University Health Science Center, 76 West Yanta Road, Xi'an City 710061, China.
Mutations in methyl CpG binding protein 2 (MeCP2) are linked to Rett syndrome, in which epilepsy is one of the most well-described disorders. However, little is known about the specific role of MeCP2 during epileptogenesis. Our previous study demonstrated that MeCP2 exerts unique control over the development of mossy fiber sprouting (MFS) in the epileptic hippocampus.
Nucleic Acids Res
January 2025
Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia.
edgeR is an R/Bioconductor software package for differential analyses of sequencing data in the form of read counts for genes or genomic features. Over the past 15 years, edgeR has been a popular choice for statistical analysis of data from sequencing technologies such as RNA-seq or ChIP-seq. edgeR pioneered the use of the negative binomial distribution to model read count data with replicates and the use of generalized linear models to analyze complex experimental designs.
Commun Biol
January 2025
Division of Biological Science, Graduate School of Science and Technology, Nara Institute of Science and Technology (NAIST), Ikoma, Japan.
Monocarpic plants flower only once and then produce seeds. Many monocarpic plants require a cold treatment known as vernalization before they flower. This requirement delays flowering until the plant senses warm temperatures in the spring.