A novel approach for standardizing clinical laboratory categorical test results using machine learning and string distance similarity.

Syed Ahmmed, M Rubaiyat Hossain Mondal, Md Raihan Mia, Mohammad Adibuzzaman, Abu Sayed Md Latiful Hoque, Sheikh Iqbal Ahamed

Heliyon

Department of Computer Science, Marquette University, Milwaukee, WI, USA.

Published: November 2023

Standardizing clinical laboratory test results is critical for clinical data science research and analysis, yet standardized data processing tools and guidelines remain inadequate. This paper proposes a novel approach for standardizing categorical test results that combines supervised machine learning with the Jaro-Winkler similarity algorithm. A supervised machine learning model categorizes the test results into predefined groups (clusters) in a scalable way, and Jaro-Winkler similarity then maps each text term to a standard clinical term within its group. The method is applied to 75,062 test results from two private hospitals in Bangladesh. Support Vector Classification with a linear kernel achieves a classification accuracy of 98% when categorizing test results, outperforming the Random Forest algorithm. With manual validation, Jaro-Winkler similarity achieves a 99.93% success rate in test result standardization for the majority of groups. The proposed method outperforms previous studies, which standardized test results using rule-based classifiers on a smaller number of groups and distance measures such as Cosine similarity or Levenshtein distance. The approach also performs well when applied to the publicly available MIMIC-III dataset. These findings suggest that the proposed standardization technique can benefit clinical big data research, particularly national clinical research data hubs in low- and middle-income countries.
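
The term-mapping step described in the abstract can be illustrated with a minimal sketch. It assumes the classifier has already assigned a raw result to a group, and it uses a textbook Jaro-Winkler implementation rather than the authors' code. The standard term list, the `threshold` value, and the function names are hypothetical choices made for illustration only.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix chars."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def standardize(raw: str, standard_terms: list[str], threshold: float = 0.85):
    """Map a raw result to the closest standard term of its group.

    Returns None when no term reaches the (hypothetical) threshold, so the
    value can be routed to manual review instead of being mapped wrongly.
    """
    cleaned = raw.strip().lower()
    best = max(standard_terms, key=lambda t: jaro_winkler(cleaned, t.lower()))
    score = jaro_winkler(cleaned, best.lower())
    return best if score >= threshold else None


# Hypothetical standard terms for one group of categorical results.
GROUP_TERMS = ["Positive", "Negative"]

for raw_value in ["Postive", "NEGATVE ", "xyz"]:
    print(repr(raw_value), "->", standardize(raw_value, GROUP_TERMS))
```

Restricting the comparison to the terms of one group keeps the candidate set small, which is why the classification step comes before the similarity step in the described approach.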

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10685145
DOI: http://dx.doi.org/10.1016/j.heliyon.2023.e21523
