Metagenomic Geolocation Prediction Using an Adaptive Ensemble Classifier.

Samuel Anyaso-Samuel, Archie Sachdeva, Subharup Guha, Somnath Datta

Front Genet

Department of Biostatistics, University of Florida, Gainesville, FL, United States.

Published: April 2021

Microbiome samples harvested from urban environments can be informative in predicting the geographic location of unknown samples. The idea that different cities may have geographically disparate microbial signatures can be used to predict geographical location from city-specific microbiome samples. We implemented this idea by first using standard bioinformatics procedures to pre-process the raw metagenomics samples provided by the CAMDA organizers. We then trained several component classifiers and a robust ensemble classifier with data generated from taxonomy-dependent and taxonomy-free approaches. We also implemented class weighting and an optimal oversampling technique to overcome the class imbalance in the primary data. In each instance, we observed that the component classifiers performed differently, whereas the ensemble classifier consistently yielded optimal performance. Finally, we predicted the source cities of mystery samples provided by the organizers. Our results highlight the unreliability of relying on a single classification algorithm to assign metagenomic samples to their source origins. By combining several component classifiers via the ensemble approach, we obtained classification results that were as good as those of the best-performing component classifier.
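The imbalance handling and classifier combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' adaptive ensemble: it substitutes a plain majority vote over three toy component classifiers, and the city labels, feature values, and all function names are hypothetical. It assumes inverse-frequency class weights and simple random oversampling as stand-ins for the weighting and oversampling steps the abstract mentions.

```python
import random
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for c, m in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == c]
        for _ in range(target - m):
            X_out.append(rng.choice(rows))
            y_out.append(c)
    return X_out, y_out

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_nearest_centroid(X, y):
    """Component 1: predict the class whose mean feature vector is closest."""
    sums, counts = {}, Counter(y)
    for x, lab in zip(X, y):
        acc = sums.setdefault(lab, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))

def fit_one_nn(X, y):
    """Component 2: predict the label of the single nearest training sample."""
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda d: sq_dist(x, d[0]))[1]

def fit_weighted_knn(X, y, weights, k=3):
    """Component 3: k-NN on the original data, each vote scaled by its class weight."""
    data = list(zip(X, y))
    def predict(x):
        nearest = sorted(data, key=lambda d: sq_dist(x, d[0]))[:k]
        votes = Counter()
        for _, lab in nearest:
            votes[lab] += weights[lab]
        return votes.most_common(1)[0][0]
    return predict

def majority_vote(models, x):
    """Ensemble: the label predicted by the most component classifiers."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Hypothetical two-feature "abundance" profiles; city_B is the minority class.
X = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.7, 0.3], [0.1, 0.9]]
y = ["city_A"] * 4 + ["city_B"]

weights = class_weights(y)
X_bal, y_bal = random_oversample(X, y)
models = [
    fit_nearest_centroid(X_bal, y_bal),
    fit_one_nn(X_bal, y_bal),
    fit_weighted_knn(X, y, weights),
]
prediction = majority_vote(models, [0.2, 0.8])
print(weights, Counter(y_bal), prediction)
```

In this toy setting the two imbalance strategies are applied to different components only to show both side by side; a real pipeline would typically pick one per classifier and tune it by cross-validation.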

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8093763
DOI: http://dx.doi.org/10.3389/fgene.2021.642282

Similar Publications

Clinical impact of an explainable machine learning with amino acid PET imaging: application to the diagnosis of aggressive glioma.

Eur J Nucl Med Mol Imaging

January 2025

IADI, U1254, Inserm, Université de Lorraine, Nancy, France.

Shamimeh Ahrari, Timothée Zaragori, Adeline Zinsz, Gabriela Hossu, Julien Oster

Purpose: Radiomics-based machine learning (ML) models of amino acid positron emission tomography (PET) images have shown efficiency in glioma prediction tasks. However, their clinical impact on physician interpretation remains limited. This study investigated whether an explainable radiomics model modifies nuclear physicians' assessment of glioma aggressiveness at diagnosis.

Automated Classification of Cardiac Arrhythmia using Short-Duration ECG Signals and Machine Learning.

Biomed Phys Eng Express

January 2025

Electronics and Communication Engineering, Rajiv Gandhi University, Rono Hills, Doimukh, Itanagar, Arunachal Pradesh 791112, India.

Amar Bahadur Biswakarma, Jagdeep Rahul, Kurmendra Kurmendra

Accurate detection of cardiac arrhythmias is crucial for preventing premature deaths. The current study employs a dual-stage Discrete Wavelet Transform (DWT) and a median filter to eliminate noise from ECG signals. Subsequently, ECG signals are segmented, and QRS regions are extracted for further preprocessing.

Multiclass Synthetic Accessibility Prediction.

J Chem Inf Model

January 2025

X-Chem Global HQ, 100 Beaver Street, Waltham, Massachusetts 02453, United States.

Xinqi Li, Ryan Walsh, Waseem Abbas, Sergio Pascual-Diaz, Calum Hand

Evaluating synthetic accessibility of molecules is an integral component of the drug discovery process. While the application of machine learning models to predict whether small molecules are easy or hard to synthesize has gained attention recently, predetermined thresholds and data set imbalances present challenges for these binary classification approaches. In this study, we introduce a novel multiclass fold-ensembled classification approach to predict the minimum number of steps needed to synthesize a small molecule.

Exploring emotional patterns in social media through NLP models to unravel mental health insights.

Healthc Technol Lett

January 2025

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India.

Nisha P Shetty, Yashraj Singh, Veeraj Hegde, D Cenitta, Dhruthi K

This study aimed to develop an advanced ensemble approach for automated classification of mental health disorders in social media posts. The research question was: can an ensemble of fine-tuned transformer models (XLNet, RoBERTa, and ELECTRA) with Bayesian hyperparameter optimization improve the accuracy of mental health disorder classification in social media text? The three transformer models were fine-tuned on a dataset of social media posts labelled with 15 distinct mental health disorders.

Deep learning methods for improving the accuracy and efficiency of pathological image analysis.

Sci Prog

January 2025

School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, China.

Tangsen Huang, Xingru Huang, Haibing Yin

This study presents a novel integration of two advanced deep learning models, U-Net and EfficientNetV2, to achieve high-precision segmentation and rapid classification of pathological images. A key innovation is the development of a new heatmap generation algorithm, which leverages meticulous image preprocessing, data enhancement strategies, ensemble learning, attention mechanisms, and deep feature fusion techniques. This algorithm not only produces highly accurate and interpretatively rich heatmaps but also significantly improves the accuracy and efficiency of pathological image analysis.
