The ability to quantify individual components of complex mixtures is a challenge found throughout the life and physical sciences. An improved capacity to generate large data sets along with the uptake of machine-learning (ML)-based analysis tools has allowed for various "omics" disciplines to realize exceptional advances. Other areas of chemistry that deal with complex mixtures often do not leverage these advances. Environmental samples, for example, can be more difficult to access, and the resulting small data sets are less appropriate for unconstrained ML approaches. Herein, we present an approach to address this latter issue. Using a very small environmental data set (35 high-resolution mass spectra gathered from various solvent extractions of Canadian petroleum fractions), we show that the application of specific domain knowledge can lead to ML models with notable performance.

Source
http://dx.doi.org/10.1021/jacs.4c06595

Publication Analysis

Top Keywords

data sets: 12
domain knowledge: 8
small data: 8
complex mixtures: 8
machine learning: 4
learning complex: 4
complex organic: 4
organic mixtures: 4
mixtures applying: 4
applying domain: 4

Similar Publications

Diffraction imaging of cells allows rapid phenotyping by the response of intracellular molecules to coherent illumination. However, its ability to distinguish numerous types of human leukocytes remains to be investigated. Here, we show that accurate classification of three lymphocyte subtypes can be achieved with features extracted from cross-polarized diffraction image (p-DI) pairs.

ModeHunter is a modular Python software package for the simulation of 3D biophysical motion across spatial resolution scales using modal analysis of elastic networks. It has been curated from our in-house Python scripts over the last 15 years, with a focus on detecting similarities of elastic motion between atomic structures, coarse-grained graphs, and volumetric data obtained from biophysical or biomedical imaging origins, such as electron microscopy or tomography. With ModeHunter, normal modes of biophysical motion can be analyzed with various static visualization techniques or brought to life by dynamics animation in terms of single or multimode trajectories or decoy ensembles.

Naturopathic practitioners consult an estimated 6.2% of Australian adults, equating to 1,550,000 people receiving their care each year. Sleep is now recognized as a key pillar of health; however, nearly half of all Australian adults report inadequate sleep.

The reduced cost of next-generation sequencing (NGS) has allowed researchers to generate nuclear and mitochondrial genome data to gain deeper insights into the phylogeography, evolutionary history and biology of non-model species. While the Cape buffalo has been well-studied across its range with traditional genetic markers over the last 25 years, researchers are building on this knowledge by generating whole genome, population-level data sets to improve understanding of the genetic composition and evolutionary history of the species. Using publicly available NGS data, we assembled 40 Cape buffalo mitochondrial genomes (mitogenomes) from four protected areas in South Africa, expanding the geographical range and almost doubling the number of mitogenomes available for this species.

Multivariate patterns among multimodal neuroimaging and clinical, cognitive, and daily functioning characteristics in bipolar disorder.

Neuropsychopharmacology

January 2025

Neurocognition and Emotion in Affective Disorders (NEAD) Centre, Psychiatric Centre Copenhagen, Mental Health Services, Capital Region of Denmark, Frederiksberg, Denmark.

Individuals with bipolar disorder (BD) show heterogeneity in clinical, cognitive, and daily functioning characteristics, which challenges accurate diagnostics and optimal treatment. A key goal is to identify brain-based biomarkers that inform patient stratification and serve as treatment targets. The objective of the present study was to apply a data-driven, multivariate approach to quantify the relationship between multimodal imaging features and behavioral phenotypes in BD.
