Reproducibility in the statistical analysis of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Recurring challenges are the scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely available software package that addresses these challenges. We show the performance of the software in a high-throughput phenomic pipeline in the International Mouse Phenotyping Consortium (IMPC) and compare the agreement of the results with the most similar implementation in the literature. OpenStats offers significant improvements in speed and scalability over existing software packages, including a 13-fold improvement in computational time relative to the current production analysis pipeline in the IMPC. Reduced complexity also promotes FAIR data analysis by providing transparency and helping other groups reproduce and reuse the statistical methods and results. OpenStats is freely available under a Creative Commons license at www.bioconductor.org/packages/OpenStats.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773254 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242933 (PLOS)

Publication Analysis

Top Keywords

software package: 8
openstats freely: 8
software: 5
analysis: 5
openstats: 4
openstats robust: 4
robust scalable: 4
scalable software: 4
package reproducible: 4
reproducible analysis: 4

Similar Publications

Objective: Carbohydrate antigen 19-9 (CA19-9) and carcinoembryonic antigen (CEA) serve as pivotal tumor markers in colorectal cancer (CRC). However, uncertainty persists regarding the prognostic significance of the two tumor markers when falling within the normal range. We attempt to compare the prognostic differences of tumor markers at different levels within the reference range.

ModeHunter is a modular Python software package for the simulation of 3D biophysical motion across spatial resolution scales using modal analysis of elastic networks. It has been curated from our in-house Python scripts over the last 15 years, with a focus on detecting similarities of elastic motion between atomic structures, coarse-grained graphs, and volumetric data obtained from biophysical or biomedical imaging origins, such as electron microscopy or tomography. With ModeHunter, normal modes of biophysical motion can be analyzed with various static visualization techniques or brought to life by dynamics animation in terms of single or multimode trajectories or decoy ensembles.

Objectives: To identify cuproptosis- and ferroptosis-related genes involved in nonalcoholic fatty liver disease and to determine the diagnostic value of hub genes.

Methods: The gene expression dataset GSE89632 was retrieved from the Gene Expression Omnibus database to identify differentially expressed genes (DEGs) between the non-alcoholic steatohepatitis (NASH) group and the healthy group using the 'limma' package in R software and weighted gene co-expression network analysis. Gene Ontology, Kyoto Encyclopedia of Genes and Genomes pathway, and single-sample gene set enrichment analyses were performed to identify functional enrichment of DEGs.

Glycosylation is the most common and diverse modification of proteins. It can affect protein function and stability and is associated with many diseases. While proteomic methods to study most post-translational modifications are now quite mature, glycopeptide analysis is still a challenge, particularly from the aspect of data analysis.

Proteoform Identification and Quantification Based on Alignment Graphs.

Bioinformatics

January 2025

Department of Computer Science, City University of Hong Kong, Hong Kong, China.

Motivation: Proteoforms are the different forms of a protein generated from the genome with various sequence variations, splice isoforms, and post-translational modifications. Proteoforms regulate protein structures and functions. A single protein can have multiple proteoforms due to different modification sites.
