Constrained Standardization of Count Data from Massive Parallel Sequencing.

J Mol Biol

Universiteit Hasselt, Data Science Institute (DSI), Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Agoralaan, Diepenbeek BE 3590, Belgium; Universiteit Antwerpen, Centre for Proteomics, Groenenborgerlaan 171, Antwerpen BE 2020, Belgium.

Published: May 2021

In high-throughput omics disciplines like transcriptomics, researchers need to assess the quality of an experiment prior to an in-depth statistical analysis. To efficiently analyze such voluminous collections of data, they need triage methods that are both quick and easy to use. One such normalization method for relative quantitation, CONSTANd, was recently introduced for isobarically labeled mass spectra in proteomics. It transforms the data matrix of abundances through an iterative, convergent process enforcing three constraints: (I) all column sums are identical; (II) each row sum is fixed (across matrices); and (III) all row sums are identical. In this study, we investigate whether CONSTANd is suitable for count data from massively parallel sequencing, by qualitatively comparing its results to those of DESeq2. Further, we propose an adjustment of the method so that it may be applied to identically balanced but differently sized experiments for joint analysis. We find that CONSTANd can process large data sets at well over 1 million count records per second while mitigating unwanted systematic bias, and thus quickly uncovers the underlying biological structure when combined with a PCA plot or hierarchical clustering. Moreover, it allows joint analysis of data sets obtained from different batches, with different protocols and from different labs, without exploiting information from the experimental setup other than the delineation of samples into identically processed sets (IPSs). CONSTANd's simplicity and applicability to proteomics as well as transcriptomics data make it an interesting candidate for integration in multi-omics workflows.
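
The iterative constraint enforcement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain alternating row/column rescaling (iterative proportional fitting), and the function name `constand_sketch`, the target row sum of 1, the resulting column target of m/n, and the stopping rule are all assumptions chosen for illustration only.

```python
def constand_sketch(matrix, max_iter=200, tol=1e-9):
    """Alternately rescale rows and columns until both sets of sums agree.

    Assumptions (not taken from the paper): every row is scaled to sum to 1,
    so for an m x n matrix every column must sum to m / n.
    """
    m, n = len(matrix), len(matrix[0])
    row_target = 1.0
    col_target = m / n
    data = [[float(v) for v in row] for row in matrix]

    for _ in range(max_iter):
        # Row step: scale each non-empty row to the row target.
        for row in data:
            s = sum(row)
            if s > 0:
                factor = row_target / s
                for j in range(n):
                    row[j] *= factor

        # Column step: scale each non-empty column to the column target.
        col_sums = [sum(row[j] for row in data) for j in range(n)]
        for row in data:
            for j in range(n):
                if col_sums[j] > 0:
                    row[j] *= col_target / col_sums[j]

        # Columns are now exact; stop once non-empty rows are close enough too.
        row_sums = [sum(row) for row in data]
        err = max((abs(s - row_target) for s in row_sums if s > 0), default=0.0)
        if err < tol:
            break
    return data


if __name__ == "__main__":
    counts = [[10, 20], [30, 10], [5, 5]]  # toy features x samples matrix
    for row in constand_sketch(counts):
        print([round(v, 4) for v in row])
```

Fixing the row sum to the same constant regardless of matrix size is one way to keep separately normalized matrices on a comparable scale, which is the property that joint analysis of several data sets relies on. Rows consisting only of zeros are left unchanged in this sketch.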


Source
http://dx.doi.org/10.1016/j.jmb.2021.166966


