Principal components computed via principal component analysis (PCA) are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. In this paper, we explore the penalized eigenvalue problem (PEP), which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty constraint. The contribution of our article is threefold. First, we extend PEP by applying Nesterov smoothing to the original LASSO-type L1 penalty. This allows one to compute analytical gradients, which enable faster and more efficient minimization of the objective function associated with the optimization problem. Second, we demonstrate how higher-order eigenvectors can be calculated with PEP using established results from singular value decomposition (SVD). Third, using data from the 1000 Genomes Project, we empirically demonstrate that our proposed smoothed PEP increases numerical stability and yields meaningful eigenvectors. We further investigate the utility of the penalized eigenvector approach over traditional PCA.
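To illustrate the idea of a smoothed L1 penalty with an analytical gradient, the following is a minimal sketch and not the implementation used in the paper. It makes several assumptions: the PEP objective is taken as minimizing -v'Qv + lam * ||v||_1 over the unit ball, the absolute value is replaced by a Huber-type surrogate (one form that Nesterov smoothing of |x| can take), and the solver is plain projected gradient descent. The names `smoothed_l1`, `smoothed_l1_grad`, `smoothed_pep`, `lam`, and `mu` are hypothetical.

```python
import numpy as np


def smoothed_l1(v, mu):
    """Huber-type smooth surrogate of ||v||_1 (assumed form of the smoothing).

    Each coordinate differs from |v_i| by at most mu / 2.
    """
    a = np.abs(v)
    return np.sum(np.where(a <= mu, a**2 / (2 * mu), a - mu / 2))


def smoothed_l1_grad(v, mu):
    """Analytical gradient of smoothed_l1: v / mu clipped to [-1, 1]."""
    return np.clip(v / mu, -1.0, 1.0)


def smoothed_pep(Q, lam, mu=0.1, iters=2000, seed=0):
    """Projected gradient descent on -v'Qv + lam * smoothed_l1(v), ||v||_2 <= 1.

    Q is assumed symmetric positive semidefinite (e.g. a covariance matrix).
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(Q.shape[0])
    v /= np.linalg.norm(v)
    # Conservative step size based on a Lipschitz bound for the gradient.
    step = 1.0 / (2 * np.linalg.norm(Q, 2) + lam / mu)
    for _ in range(iters):
        grad = -2 * Q @ v + lam * smoothed_l1_grad(v, mu)
        v = v - step * grad
        norm = np.linalg.norm(v)
        if norm > 1.0:  # project back onto the unit ball
            v /= norm
    return v


if __name__ == "__main__":
    Q = np.diag([3.0, 1.0, 0.5])
    print(smoothed_pep(Q, lam=0.0))  # no penalty: leading eigenvector, up to sign
    print(smoothed_pep(Q, lam=0.5))  # penalized solution
```

With `lam=0` the iteration behaves like a shifted power method and should recover the leading eigenvector up to sign. Smaller values of `mu` track the L1 penalty more closely at the cost of a smaller step size.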

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557800PMC
