Objective: Classification tasks remain an open challenge in biomedicine. Although many machine-learning techniques exist for classification, several peculiarities of biomedical data, especially omics measurements, either prevent their use or limit their performance. Omics approaches aim to understand a complex biological system through systematic analysis of its content at the molecular level. However, omics data are heterogeneous, sparse and affected by the classical "curse of dimensionality" problem, i.e. they contain far fewer observations (samples) than omics features. Furthermore, a major problem with multi-omics data is imbalance at either the class or the feature level. The objective of this work is to study whether feature extraction and/or feature selection techniques can improve the performance of machine-learning classification algorithms on omics measurements.
Methods: Among all omics, metabolomics has emerged as a powerful tool in cancer research, facilitating a deeper understanding of the complex metabolic landscape associated with tumorigenesis and tumor progression. We therefore selected three publicly available metabolomics datasets and applied several feature extraction techniques, both linear and non-linear, with and without feature selection methods. We then evaluated patient classification performance for each configuration on the three datasets.
Results: We provide a general workflow and guidelines on when to use these techniques, depending on the characteristics of the available data. To test whether our approach extends to other omics data, we also included a transcriptomics dataset and a proteomics dataset. Overall, for all datasets, we showed that applying supervised feature selection improves the performance of feature extraction methods for classification purposes. Scripts used to perform all analyses are available at: https://github.com/Plant-Net/Metabolomic_project/.
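As an illustration of the kind of comparison described above, the following minimal sketch contrasts feature extraction alone with supervised feature selection followed by feature extraction. It is not the authors' code (their scripts are in the repository linked above). Everything here is an assumption made for illustration: the synthetic data, the function names, the Welch t-statistic filter standing in for "supervised feature selection", PCA standing in for "feature extraction", and a nearest-centroid rule standing in for the classifier. Only NumPy is used.

```python
import numpy as np

def make_data(n=60, p=500, informative=10, shift=1.5, seed=0):
    """Synthetic data with far fewer samples than features (hypothetical).
    Only the first `informative` features differ between the two classes."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, p))
    X[y == 1, :informative] += shift
    return X, y

def select_top_k(X, y, k):
    """Supervised filter: keep the k features with the largest absolute
    Welch t-statistic between the two classes."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)
    return np.argsort(t)[::-1][:k]

def fit_pca(X, n_components):
    """Linear feature extraction: PCA via SVD of the centered matrix."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each test point to the class with the closest training centroid."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    d = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def cv_accuracy(X, y, k_features=None, n_components=2, n_folds=5, seed=0):
    """K-fold accuracy; selection and PCA are fitted on the training fold only,
    so the held-out labels never influence which features are kept."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        X_tr, y_tr, X_te = X[train], y[train], X[test_idx]
        if k_features is not None:
            cols = select_top_k(X_tr, y_tr, k_features)
            X_tr, X_te = X_tr[:, cols], X_te[:, cols]
        mean, comps = fit_pca(X_tr, n_components)
        pred = nearest_centroid_predict(
            (X_tr - mean) @ comps.T, y_tr, (X_te - mean) @ comps.T
        )
        correct += (pred == y[test_idx]).sum()
    return correct / len(y)

X, y = make_data()
acc_pca_only = cv_accuracy(X, y)
acc_select_then_pca = cv_accuracy(X, y, k_features=20)
print(f"PCA only:        {acc_pca_only:.2f}")
print(f"selection + PCA: {acc_select_then_pca:.2f}")
```

The selection step is deliberately placed inside the cross-validation loop: choosing features on the full dataset before splitting would leak label information into the held-out folds and inflate the reported accuracy. The numbers printed depend on the synthetic settings and say nothing about the datasets used in the study.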
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10979063 | PMC |
http://dx.doi.org/10.1016/j.csbj.2024.03.016 | DOI Listing |
J Med Internet Res
January 2025
Section of Psychology, Health & Technology, Centre for eHealth and Wellbeing, University of Twente, Enschede, Netherlands.
To ensure that an eHealth technology fits with its intended users, other stakeholders, and the context within which it will be used, thorough development, implementation, and evaluation processes are necessary. The CeHRes (Centre for eHealth and Wellbeing Research) Roadmap is a framework that can help shape these processes. While it has been successfully used in research and practice, new developments and insights have arisen since the Roadmap's first publication in 2011, not only within the domain of eHealth but also within the different disciplines in which the Roadmap is grounded.
J Occup Environ Hyg
January 2025
Center for Environmental Solutions and Emergency Response, United States Environmental Protection Agency, Cincinnati, Ohio.
Chemical release data are essential for performing chemical risk assessments to understand the potential exposures arising from industrial processes. Often, these data are unknown or unavailable and must be estimated. A case study of volatile organic compound releases during extrusion-based additive manufacturing is used here to explore the viability of various regression methods for predicting chemical releases to inform chemical assessments.
J Phys Chem Lett
January 2025
Key Laboratory of Advanced Light Conversion Materials and Biophotonics, School of Chemistry and Life Resources, Renmin University of China, Beijing 100872, P. R. China.
Chlorophyll (Chl) is the most abundant light-harvesting pigment of oxygenic photosynthetic organisms; however, the Q-band energetics and relaxation dynamics remain unclear. In this work, we have applied femtosecond time-resolved (fs-TA) absorption spectroscopy in 430-1,700 nm to Chls and in diluted pyridine solutions under selective optical excitation within their Q-bands. The results revealed distinct near-infrared absorption features of the B ← Q and B ← Q transitions in 930-1,700 nm, which together with the steady-state absorption in 400-700 nm unveiled the Q-state energy that lies 1,000 ± 400 and 600 ± 400 cm⁻¹ above the Q-state for Chls and , respectively.
Ann N Y Acad Sci
January 2025
Hainan Institute, Zhejiang University, Sanya, China.
In this paper, we introduce FUSION-ANN, a novel artificial neural network (ANN) designed for acoustic emission (AE) signal classification. FUSION-ANN comprises four distinct ANN branches, each housing an independent multilayer perceptron. To represent AE signals, we extract denoised features borrowed from speech recognition, such as linear predictive coding, Mel-frequency cepstral coefficients, and gammatone cepstral coefficients.
PLoS Pathog
January 2025
Institute of Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) is a diverse family of variant surface antigens, encoded by var genes, that mediates binding of infected erythrocytes to human cells and plays a key role in parasite immune evasion and malaria pathology. The increased availability of parasite genome sequence data has revolutionised the study of PfEMP1 diversity across multiple P. falciparum isolates.