Severity: Warning
Message: file_get_contents(https://...@pubfacts.com&api_key=b8daa3ad693db53b1410957c26c9a51b4908&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests
Filename: helpers/my_audit_helper.php
Line Number: 176
Backtrace:
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 176
Function: file_get_contents
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 250
Function: simplexml_load_file_from_url
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3122
Function: getPubMedXML
File: /var/www/html/application/controllers/Detail.php
Line: 575
Function: pubMedSearch_Global
File: /var/www/html/application/controllers/Detail.php
Line: 489
Function: pubMedGetRelatedKeyword
File: /var/www/html/index.php
Line: 316
Function: require_once
Background: Machine learning algorithms for bacterial strain identification using Raman spectroscopy have been widely used in microbiology. During the training phase, existing datasets are augmented and used to optimize model architecture and hyperparameters. After training, it is presumed that the models have reached their peak performance and are used for inference without being further enhanced. Our methodology combines Monte Carlo Dropout (MCD) with convolutional neural networks (CNNs) by utilizing dropout during the inference phase, which enables to measure the model uncertainty, a critical but often ignored aspect in deep learning models.
Results: We categorize unseen input data into two subsets based on the uncertainty of their prediction by employing MCD and defining the threshold using the Gaussian Mixture Model (GMM). The final prediction is obtained on the subset of testing data that exhibits lower model uncertainty, thereby enhancing the reliability of the results. To validate our method, we applied it to two Raman spectra datasets. As a result, we have observed an increase in accuracy of 9 % for Dataset 1 (from 83.10 % to 92.10 %) and 12.82 % for Dataset 2 (from 83.86 % to 96.68 %). These improvements were observed within specific subsets of the data: 826 out of 1206 spectra in Dataset 1 and 1700 out of 3000 spectra in Dataset 2. This demonstrates the effectiveness of our approach in improving prediction accuracy by focusing on data with lower uncertainty.
Significance: Different from routine prediction based on mere probabilities, we believe this uncertainty-guided prediction is more effective to ensure a high prediction rate rather than the prediction on the entire dataset. By guiding the decision-making of a model on higher-confidence subsets, our methodology can enhance the accuracy of classification in critical areas like disease diagnosis and safety monitoring. This targeted approach is to advance microbial identification and produces more trustworthy predictions.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1016/j.aca.2024.343346 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!