Standardizing clinical laboratory test results is critical for clinical data science research and analysis, yet standardized data processing tools and guidelines remain inadequate. This paper proposes a novel approach for standardizing categorical test results that combines supervised machine learning with the Jaro-Winkler similarity algorithm. A supervised machine learning model categorizes test results into predefined groups at scale, and Jaro-Winkler similarity then maps each text term to a standard clinical term within its assigned group. The proposed method is applied to 75,062 test results from two private hospitals in Bangladesh. For categorizing test results, the Support Vector Classification algorithm with a linear kernel achieves a classification accuracy of 98%, outperforming the Random Forest algorithm. With manual validation, Jaro-Winkler similarity achieves a 99.93% success rate in test result standardization for the majority of groups. The proposed method outperforms previous studies, which standardized test results using rule-based classifiers on a smaller number of groups together with similarity measures such as cosine similarity or Levenshtein distance. The approach also performs well when applied to the publicly available MIMIC-III dataset. These findings suggest that the proposed standardization technique can benefit clinical big data research, particularly national clinical research data hubs in low- and middle-income countries.
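The second stage described above, mapping a free-text result to a standard term inside its group, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`jaro`, `jaro_winkler`, `standardize`), the example group of standard terms, the lowercase normalization, and the 0.85 acceptance threshold are all assumptions made for the example. The supervised classification stage that selects the group is omitted.

```python
from typing import Optional, Sequence


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def standardize(
    raw: str, standard_terms: Sequence[str], threshold: float = 0.85
) -> Optional[str]:
    """Return the closest standard term, or None if nothing is close enough.

    The threshold is an assumed value for illustration only.
    """
    cleaned = raw.strip().lower()
    best = max(standard_terms, key=lambda term: jaro_winkler(cleaned, term))
    return best if jaro_winkler(cleaned, best) >= threshold else None


# Hypothetical standard terms for one predefined group.
QUALITATIVE_GROUP = ["positive", "negative", "trace"]

if __name__ == "__main__":
    for value in [" Postive ", "Negetive", "xyz"]:
        print(repr(value), "->", standardize(value, QUALITATIVE_GROUP))
```

Restricting the comparison to the terms of a single group keeps the candidate list short, which is why the classification step comes first in the approach described above. Values that fall below the threshold return `None` so that they can be routed to manual review rather than forced onto a standard term.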

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10685145
DOI: http://dx.doi.org/10.1016/j.heliyon.2023.e21523
