Comparison of an Ensemble of Machine Learning Models and the BERT Language Model for Analysis of Text Descriptions of Brain CT Reports to Determine the Presence of Intracranial Hemorrhage.

A N Khoruzhaya, D V Kozlov, K M Arzamasov, E I Kremneva

Sovrem Tekhnologii Med

Leading Researcher, Department of Innovative Technologies; Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Department of Health, Bldg 1, 24 Petrovka St., Moscow, 127051, Russia; Senior Researcher; Research Center for Neurology, 80 Volokolamskoye Shosse, Moscow, 125367, Russia.

Published: October 2024

The study aimed to train an ensemble of machine learning models and compare it with a pre-trained BERT model for binary classification of brain CT reports by the presence or absence of signs of intracranial hemorrhage.
The ensemble was first trained on 3980 brain CT reports from 56 inpatient medical facilities in Moscow, using algorithms such as CatBoost and logistic regression and text vectorization methods including bag of words and TF-IDF.
The ensemble achieved a sensitivity of 0.93 and a specificity of 0.90 on the validation set; on an independent dataset of 9393 reports, the BERT model reached a higher sensitivity than the ensemble (0.97 versus 0.92) at the same specificity (0.90).

The aim of the study is to train and test an ensemble of machine learning models and to compare its performance with that of a BERT language model pre-trained on medical data on a simple binary classification task, i.e., determining the presence or absence of signs of intracranial hemorrhage (ICH) in brain CT reports.

Materials and Methods: Seven machine learning algorithms and three text vectorization techniques were selected to solve the binary classification problem. The models were trained on textual data consisting of 3980 brain CT reports from 56 inpatient medical facilities in Moscow. The three text vectorization techniques were bag of words, TF-IDF, and word2vec. The vectorized data were then processed by the following machine learning algorithms: decision tree, random forest, logistic regression, nearest neighbors, support vector machines, CatBoost, and XGBoost. Data analysis and pre-processing were performed using NLTK (Natural Language Toolkit, version 3.6.5), a set of libraries for character-based and statistical processing of natural language, and Scikit-learn (version 0.24.2), a machine learning library containing tools for classification tasks. MedRuBertTiny2 was used as the BERT transformer model pre-trained on medical data.
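To make the first two vectorization techniques concrete, the following is a minimal pure-Python sketch of bag of words and TF-IDF. It is an illustration only and not the authors' pipeline, which used NLTK and Scikit-learn: the whitespace tokenizer, the unsmoothed formula idf = ln(N / df), the function names, and the three sample report strings are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real pre-processing)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors): one term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight raw term counts by idf = ln(N / df); an assumed, unsmoothed variant."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each vocabulary term occurs.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = [[tf * w for tf, w in zip(vec, idf)] for vec in counts]
    return vocab, weighted


# Hypothetical report snippets, invented for illustration.
reports = [
    "no signs of hemorrhage",
    "signs of subdural hemorrhage",
    "no acute changes",
]
vocab, bow = bag_of_words(reports)
_, weights = tf_idf(reports)
```

In this sketch, a term that occurs in a single report (such as "subdural") receives a larger weight than one shared by several reports. Library implementations typically add smoothing and vector normalization, so their numeric values would differ from these.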

Results: Based on the training and testing outcomes from seven machine learning algorithms, the authors selected the three algorithms that yielded the highest metrics (i.e., sensitivity and specificity): CatBoost, logistic regression, and nearest neighbors. Among the vectorization techniques, bag of words yielded the highest metrics. These algorithms were combined into an ensemble using the stacking technique. The sensitivity and specificity on the validation dataset separated from the original sample were 0.93 and 0.90, respectively. Next, the ensemble and the BERT model were trained on an independent dataset containing 9393 textual radiology reports, also divided into training and test sets. When the ensemble was tested on this dataset, the resulting sensitivity and specificity were 0.92 and 0.90, respectively. The BERT model tested on the same data demonstrated a sensitivity of 0.97 and a specificity of 0.90.
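The two metrics reported above follow directly from the confusion-matrix counts: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below shows that arithmetic in plain Python. The function name and the ten toy labels are hypothetical and are not data from the study; label 1 is assumed to mean "signs of ICH present".

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels, where 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true ICH reports that were flagged
    specificity = tn / (tn + fp)  # share of ICH-free reports that were cleared
    return sensitivity, specificity


# Toy labels, invented for illustration: 4 positive and 6 negative reports.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here one positive report is missed and one negative report is flagged, so the toy classifier has a sensitivity of 3/4 and a specificity of 5/6.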

Conclusion: When analyzing textual reports of brain CT scans with signs of intracranial hemorrhage, the trained ensemble demonstrated high accuracy metrics. Still, manual quality control of the results is required during its application. The pre-trained BERT transformer model, additionally trained on diagnostic textual reports, demonstrated higher accuracy metrics (p<0.05). The results show promise for finding specific values both in the binary classification task and in the in-depth analysis of unstructured medical information.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11482096
DOI: http://dx.doi.org/10.17691/stm2024.16.1.03

Similar Publications

Generative Modeling of Nucleon-Nucleon Interactions.

Phys Rev Lett

December 2024

Cornell University, Ithaca, New York 14853, USA.

Pengsheng Wen Jeremy W Holt Maggie Li

Developing high-precision models of the nuclear force and propagating the associated uncertainties in quantum many-body calculations of nuclei and nuclear matter remain key challenges for ab initio nuclear theory. In this Letter, we demonstrate that generative machine learning models can construct novel instances of the nucleon-nucleon interaction when trained on existing potentials from the literature. In particular, we train the generative model on nucleon-nucleon potentials derived at second and third order in chiral effective field theory and at three different choices of the resolution scale.

Major Adverse Kidney Events in Hospitalized Older Patients With Acute Kidney Injury: Machine Learning-Based Model Development and Validation Study.

J Med Internet Res

January 2025

Department of Nephrology, Hunan Key Laboratory of Kidney Disease and Blood Purification, The Second Xiangya Hospital of Central South University, Changsha, China.

Xiao-Qin Luo Ning-Ya Zhang Ying-Hao Deng Hong-Shen Wang Yi-Xin Kang

Background: Acute kidney injury (AKI) is a common complication in hospitalized older patients, associated with increased morbidity, mortality, and health care costs. Major adverse kidney events within 30 days (MAKE30), a composite of death, new renal replacement therapy, or persistent renal dysfunction, has been recommended as a patient-centered endpoint for clinical trials involving AKI.

Objective: This study aimed to develop and validate a machine learning-based model to predict MAKE30 in hospitalized older patients with AKI.

Explainable Machine Learning to Predict Treatment Response in Advanced Non-Small Cell Lung Cancer.

JCO Clin Cancer Inform

January 2025

Emory University School of Medicine, Atlanta, GA.

Vinayak S Ahluwalia Ravi B Parikh

Purpose: Immune checkpoint inhibitors (ICIs) have demonstrated promise in the treatment of various cancers. Single-drug ICI therapy (immuno-oncology [IO] monotherapy) that targets PD-L1 is the standard of care in patients with advanced non-small cell lung cancer (NSCLC) with PD-L1 expression ≥50%. We sought to find out if a machine learning (ML) algorithm can perform better as a predictive biomarker than PD-L1 alone.

Prediction of Retention Indices in LC-HRMS for Enhanced Structural Identification of Organic Micropollutants in Water: Selectivity-Based Filtration.

Anal Chem

January 2025

Separation Science Group, Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4bis, B-9000 Ghent, Belgium.

Ardiana Kajtazi Marin Kajtazi Maike Felipe Santos Barbetta Elena Bandini Hamed Eghbali

Addressing the global challenge of ensuring access to safe drinking water, especially in developing countries, demands cost-effective, eco-friendly, and readily available technologies. The persistence, toxicity, and bioaccumulation potential of organic pollutants arising from various human activities pose substantial hurdles. While high-performance liquid chromatography coupled with high-resolution mass spectrometry (HPLC-HRMS) is a widely utilized technique for identifying pollutants in water, the multitude of structures for a single elemental composition complicates structural identification.

Object detection in motion management scenarios based on deep learning.

PLoS One

January 2025

School of Physical Education, Jinjiang College, Sichuan University, Chengdu, Sichuan Province, People's Republic of China.

Baocheng Pei Yanan Sun Yebiao Fu Ting Ren

In competition and daily training, improving athletes' performance usually requires analyzing their movements at specific moments, which makes it especially important to quickly and accurately identify the categories and positions of athletes, sports equipment, field boundaries, and other targets in the scene. However, existing detection methods have not achieved satisfactory results; the analysis attributes this mainly to the loss of temporal information, multiple and overlapping targets, and the coupling of regression and classification tasks, all of which make it harder for these network models to adapt to the detection task in this scenario. To address this, we propose for the first time a supervised object detection method for motion management scenarios.
