Background: Environmental health researchers often aim to identify sources or behaviors that give rise to potentially harmful environmental exposures.

Objective: We adapted principal component pursuit (PCP), a robust and well-established technique for dimensionality reduction in computer vision and signal processing, to identify patterns in environmental mixtures. PCP decomposes the exposure mixture into a low-rank matrix containing consistent patterns of exposure across pollutants and a sparse matrix isolating unique or extreme exposure events.
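As a rough illustration of the low-rank plus sparse decomposition described above, the following sketch implements standard PCP (not the PCP-LOD adaptation) with an augmented Lagrangian iteration in NumPy. The function names, the default values of `lam` and `mu` (common choices for standard PCP), and the stopping rule are assumptions made for illustration; none of them come from the abstract.

```python
import numpy as np

def soft_threshold(X, tau):
    """Shrink every entry of X toward zero by tau (elementwise)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink the singular values of X by tau and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Standard PCP sketch: minimize ||L||_* + lam * ||S||_1 subject to L + S = M.

    Assumed defaults: lam = 1 / sqrt(max(n, m)) and mu = n * m / (4 * ||M||_1).
    """
    M = np.asarray(M, dtype=float)
    n, m = M.shape
    lam = 1.0 / np.sqrt(max(n, m)) if lam is None else lam
    mu = n * m / (4.0 * np.abs(M).sum()) if mu is None else mu
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # dual variable for the constraint L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)  # low-rank update
        S = soft_threshold(M - L + Y / mu, lam / mu)            # sparse update
        R = M - L - S                                           # constraint residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S
```

Here `L` plays the role of the consistent exposure patterns and `S` the unique or extreme events. Handling nonnegativity, missing values, and values below the LOD would require changes that this sketch does not attempt.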

Methods: We adapted PCP to accommodate nonnegative data, missing data, and values below a given limit of detection (LOD); we refer to the adapted method as principal component pursuit with limit of detection (PCP-LOD). We simulated data to represent environmental mixtures of two sizes with increasing proportions of values below the LOD and three noise structures. We applied PCP-LOD to the simulated data to evaluate its performance in comparison with principal component analysis (PCA). We next applied PCP-LOD to an exposure mixture of 21 persistent organic pollutants (POPs) measured in 1,000 U.S. adults from the 2001-2002 National Health and Nutrition Examination Survey (NHANES). We applied singular value decomposition to the estimated low-rank matrix to characterize the patterns.
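The final step, applying singular value decomposition to an estimated low-rank matrix, might look like the sketch below. The function name `characterize_patterns`, the relative tolerance used to decide the numerical rank, and the split into "scores" and "loadings" are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def characterize_patterns(L, rel_tol=1e-6):
    """Summarize a low-rank matrix L (participants x pollutants) by its SVD.

    Returns the numerical rank, per-participant scores, and per-pollutant loadings.
    """
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    # Count singular values that are non-negligible relative to the largest one.
    rank = int(np.sum(s > rel_tol * s[0])) if s[0] > 0 else 0
    scores = U[:, :rank] * s[:rank]   # participants x patterns
    loadings = Vt[:rank].T            # pollutants x patterns
    return rank, scores, loadings
```

Each column of `loadings` indicates how strongly each pollutant contributes to one pattern, and `scores @ loadings.T` reproduces `L` up to the discarded singular values.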

Results: PCP-LOD recovered the true number of patterns through cross-validation for all simulations; based on an a priori specified criterion, PCA recovered the true number of patterns in 32% of simulations. PCP-LOD achieved lower relative predictive error than PCA for all simulated data sets with up to 50% of the data below the LOD. When 75% of values were below the LOD, PCP-LOD outperformed PCA only when noise was low. In the POP mixture, PCP-LOD identified a rank-three underlying structure and separated 6% of values as extreme events. One pattern represented comprehensive exposure to all POPs. The other patterns grouped chemicals based on known structure and toxicity.
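The abstract does not define relative predictive error. One common definition, assumed here, is the Frobenius norm of the difference between the true and estimated matrices divided by the Frobenius norm of the true matrix. The sketch below also includes an uncentered truncated SVD as a simple stand-in for a PCA reconstruction and a helper for the share of entries assigned to the sparse matrix; all three function names are hypothetical.

```python
import numpy as np

def relative_error(X_true, X_hat):
    """Assumed metric: ||X_true - X_hat||_F / ||X_true||_F."""
    return np.linalg.norm(X_true - X_hat) / np.linalg.norm(X_true)

def truncated_svd_reconstruction(X, rank):
    """Best rank-`rank` approximation of X (uncentered stand-in for PCA)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def sparse_fraction(S, tol=0.0):
    """Share of entries of S whose magnitude exceeds tol."""
    return float(np.mean(np.abs(S) > tol))
```

Under these assumptions, a statement such as "6% of values separated as extreme events" would correspond to `sparse_fraction(S)` being about 0.06 for the estimated sparse matrix `S`.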

Discussion: PCP-LOD serves as a useful tool to express multidimensional exposures as consistent patterns that, if found to be related to adverse health, are amenable to targeted public health messaging. https://doi.org/10.1289/EHP10479.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9683097
DOI: http://dx.doi.org/10.1289/EHP10479

Similar Publications

High-dimensional partially linear functional Cox models.

Biometrics

January 2025

School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China.

As a commonly employed method for analyzing time-to-event data involving functional predictors, the functional Cox model assumes a linear relationship between the functional principal component (FPC) scores of the functional predictors and the hazard rates. However, in practical scenarios, such as our study on the survival time of kidney transplant recipients, this assumption often fails to hold. To address this limitation, we introduce a class of high-dimensional partially linear functional Cox models, which accommodates the non-linear effects of functional predictors on the response and allows for diverging numbers of scalar predictors and FPCs as the sample size increases.

Genetic Mechanism Analysis Related to Cold Tolerance of Red Swamp Crayfish, Procambarus clarkii.

Mar Biotechnol (NY)

January 2025

Key Laboratory of Efficient Utilization of Non-grain Feed Resources (Co-construction by Ministry and Province) of Ministry of Agriculture and Rural Affairs, Shandong Agricultural University, Taian, Shandong, China.

In China, the red swamp crayfish (Procambarus clarkii), a notorious invasive species, has become an important economic freshwater species. To compare the genetic diversity and population structure of crayfish from northern and southern China, we collected 60 crayfish individuals from 4 crayfish populations in northern China and 2 populations in southern China for sequencing using the 2b-RAD technique. Additionally, the whole genome sequence information obtained by 2b-RAD of 90 individuals from 2 populations in northern China and 7 populations in southern China was downloaded from NCBI.

Identification of plant-based spilled oils using direct analysis in real-time-time-of-flight mass spectrometry with hydrophobic paper sampling.

Environ Monit Assess

January 2025

Science and Technology Branch, Pacific Environmental Science Centre, Environment and Climate Change Canada, Pacific and Yukon Laboratory for Environmental Testing, North Vancouver, BC, Canada.

Spilled plant-based oils behave very differently in comparison to petroleum oils and require different clean-up measures. They do not evaporate, disperse, dissolve, or emulsify to a significant degree but can polymerize and form an impermeable cap on sediment, smothering benthic media and resulting in an immediate impact on the wildlife community. The current study explored the application of rapid up-to-date direct analysis in real time (DART) with high-resolution mass spectrometry for plant-based oil typing.

This research assesses heavy metal contamination within the riparian zone of the Danro River, a tributary of the Ganges River basin in India, particularly impacted by sand mining activities. The study conducted analyses on major and trace elements in soil samples, focusing on those identified as ecologically hazardous by the Water Framework Directive of India. Utilizing a combination of indices (Enrichment Factor, Pollution Load Index, and Index of geo-accumulation) and statistical techniques such as Principal Component Analysis (PCA), the investigation aimed to evaluate contamination severity, ecological risks, and pollution sources.

Arbuscular mycorrhizal fungi (AMF) are essential in agriculture and are often linked with glomalin-related soil protein (GRSP) production, which supports aggregate binding and enhances soil organic carbon (SOC) and biological attributes. However, conservation agricultural practices in agroecosystems may have a significant impact on AMF diversity, GRSP, and soil quality-related parameters (SQRPs). The current experiment was implemented to gauge AMF colonization percentage (AMF-CP), GRSP, and significant changes in critical SQRPs, and to investigate the linkages between AMF-CP, GRSP, and SQRPs as influenced by synergistic tillage and weed management in conservation agriculture (CA).
