We develop an approach for microarray differential expression analysis, i.e. identifying genes whose expression levels differ between two or more groups. Current approaches to inference rely either on full parametric assumptions or on permutation-based techniques for sampling under the null distribution. In some situations, however, a full parametric model cannot be justified, or the sample size per group is too small for permutation methods to be valid. We propose a semi-parametric framework based on partial mixture estimation which only requires a parametric assumption for the null (equally expressed) distribution and can handle small sample sizes where permutation methods break down. We develop two novel improvements of Scott's minimum integrated square error criterion for partial mixture estimation [Scott, 2004a,b]. As a side benefit, we obtain interpretable and closed-form estimates for the proportion of EE genes. Pseudo-Bayesian and frequentist procedures for controlling the false discovery rate are given. Results from simulations and real datasets indicate that our approach can provide substantial advantages for small sample sizes over the SAM method of Tusher et al. [2001], the empirical Bayes procedure of Efron and Tibshirani [2002], the mixture of normals of Pan et al. [2003] and a t-test with p-value adjustment [Dudoit et al., 2003] to control the FDR [Benjamini and Hochberg, 1995].
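
As a rough illustration of why a partial mixture fit can yield a closed-form estimate of the EE proportion, the sketch below uses Scott's original integrated square error (L2E) criterion, not the two improved criteria developed in the paper. It assumes, purely for illustration, that each gene is summarized by one score that follows a standard normal distribution when the gene is equally expressed. Under that assumption the criterion w^2 * integral(phi^2) - (2w/n) * sum(phi(x_i)) is quadratic in the weight w, so its minimizer can be written down directly. The function names and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def l2e_criterion(w, xs, mu=0.0, sigma=1.0):
    """Scott's L2E criterion for the partial mixture w * N(mu, sigma^2).

    The first term uses the identity: integral of the squared normal
    density equals 1 / (2 * sigma * sqrt(pi)).
    """
    mean_density = sum(normal_pdf(x, mu, sigma) for x in xs) / len(xs)
    return w * w / (2.0 * sigma * math.sqrt(math.pi)) - 2.0 * w * mean_density


def l2e_null_weight(xs, mu=0.0, sigma=1.0):
    """Closed-form minimizer of l2e_criterion in w for a known null.

    Setting the derivative with respect to w to zero gives
    w = 2 * sigma * sqrt(pi) * mean(pdf(x_i)).
    The value is not constrained to [0, 1]; this sketch does not truncate it.
    """
    mean_density = sum(normal_pdf(x, mu, sigma) for x in xs) / len(xs)
    return 2.0 * sigma * math.sqrt(math.pi) * mean_density


def simulate_scores(n, ee_fraction, shift, seed=0):
    """Hypothetical data: EE scores are N(0, 1), the rest are N(shift, 1)."""
    rng = random.Random(seed)
    return [
        rng.gauss(0.0, 1.0) if rng.random() < ee_fraction else rng.gauss(shift, 1.0)
        for _ in range(n)
    ]


if __name__ == "__main__":
    scores = simulate_scores(n=20000, ee_fraction=0.8, shift=4.0)
    print("estimated EE proportion:", round(l2e_null_weight(scores), 3))
```

Only the null component is modeled here; the differentially expressed scores need no parametric form, which is the sense in which the fit is "partial". When the non-null scores overlap the null, they contribute to the average density and push the estimated weight upward.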

Source
http://dx.doi.org/10.2202/1544-6115.1333
