A Systematic Review of Natural Language Processing Methods and Applications in Thyroidology.

Mayo Clin Proc Digit Health

Knowledge and Evaluation Research Unit (R.L.-T., M.D., M.M.C., C.S.., D.S.T., S.A.P., V.M., E.G., M.B.O., J.P.B.), Division of Endocrinology, Diabetes, Metabolism, and Nutrition (D.T.-T., J.P.B.), Department of Medicine, and Department of Artificial Intelligence and Informatics (N.S.O.), Mayo Clinic, Rochester, MN; University of Edinburgh, Edinburgh, Scotland, United Kingdom (D.S.T.); Montefiore Health Center, Albert Einstein College of Medicine, New York, NY (J.W.F.); Division of Endocrinology, Department of Medicine (N.S.O.), and Department of Health Outcomes and Biomedical Informatics (Y.W.), University of Florida, Gainesville, FL; and Respiratory, Cardiovascular, and Renal Pathobiology and Bioengineering, Universitat de Barcelona, Spain (D.S.T.).

Published: June 2024

This study aimed to review the application of natural language processing (NLP) in thyroid-related conditions and to summarize current challenges and potential future directions. We performed a systematic search of databases for studies describing NLP applications in thyroid conditions published in English between January 1, 2012, and November 4, 2022. In addition, we used a snowballing technique to identify studies missed in the initial search or published after our search timeline until April 1, 2023. For included studies, we extracted the NLP method (eg, rule-based, machine learning, deep learning, or hybrid), NLP application (eg, identification, classification, and automation), thyroid condition (eg, thyroid cancer, thyroid nodule, and functional or autoimmune disease), data source (eg, electronic health records, health forums, medical literature databases, or genomic databases), performance metrics, and stages of development. We identified 24 eligible NLP studies focusing on thyroid-related conditions. Deep learning-based methods were the most common (38%), followed by rule-based (21%) and traditional machine learning (21%) methods. Thyroid nodules (54%) and thyroid cancer (29%) were the primary conditions under investigation. Electronic health records were the dominant data source (17/24, 71%), with imaging reports being the most frequently used (15/17, 88%). There is increasing interest in NLP applications for thyroid-related studies, mostly addressing thyroid nodules and using deep learning-based methodologies with limited external validation. However, none of the reviewed NLP applications have reached clinical practice. Several limitations, including inconsistent clinical documentation and model portability, need to be addressed to promote the evaluation and implementation of NLP applications to support patient care in thyroidology.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11210322
DOI: http://dx.doi.org/10.1016/j.mcpdig.2024.03.007

Similar Publications

Semantic search helper: A tool based on the use of embeddings in multi-item questionnaires as a harmonization opportunity for merging large datasets - A feasibility study.

Eur Psychiatry

January 2025

Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Baden-Württemberg, Germany.

Background: Recent advances in natural language processing (NLP), particularly in language processing methods, have opened new avenues in semantic data analysis. A promising application of NLP is data harmonization in questionnaire-based cohort studies, where it can be used as an additional method, specifically when only different instruments are available for one construct as well as for the evaluation of potentially new construct-constellations. The present article therefore explores embedding models' potential to detect opportunities for semantic harmonization.

Applications of mathematical programming to genetic biocontrol.

SIAM J Appl Math

January 2024

Division of Epidemiology and Biostatistics, School of Public Health, University of California Berkeley, Berkeley, CA 94704 USA.

We review existing approaches to optimizing the deployment of genetic biocontrol technologies (tools used to prevent vector-borne diseases such as malaria and dengue) and formulate a mathematical program that enables the incorporation of crucial ecological and logistical details. The model is comprised of equality constraints grounded in discretized dynamic population equations, inequality constraints representative of operational limitations including resource restrictions, and an objective function that jointly minimizes the count of competent mosquito vectors and the number of transgenic organisms released to mitigate them over a specified time period. We explore how nonlinear programming (NLP) and mixed integer nonlinear programming (MINLP) can advance the state of the art in designing the operational implementation of three distinct transgenic public health interventions, two of which are presently in active use around the world.

This paper introduces the Morphologically-Analyzed and Syntactically-Annotated Quran (MASAQ) dataset, a comprehensive resource designed to address the scarcity of annotated Quranic Arabic corpora and facilitate the development of advanced Natural Language Processing (NLP) models. The Quran, being a cornerstone of classical Arabic, presents unique challenges for NLP due to its sacred nature and complex linguistic features. MASAQ provides a detailed syntactic and morphological annotation of the entire Quranic text, utilizing a rigorously verified text from Tanzil.

Background: The use of natural language processing (NLP) in mental health research is increasing, with a wide range of applications and datasets being investigated.

Objective: This review aims to summarize the use of NLP in mental health research, with a special focus on the types of text datasets and the use of social determinants of health (SDOH) in NLP projects related to mental health.

Methods: The search was conducted in September 2024 using a broad search strategy in PubMed, Scopus, and CINAHL Complete.

In the vibrant linguistic landscape of Bengali, spoken by millions in Bangladesh and India, the gap between saintly and common terms is culturally and computationally significant. Recognising this, we introduce BanglaBlend, a pioneering dataset created to capture these stylistic distinctions. BanglaBlend comes with 7350 annotated sentences, 3675 in saintly form and 3675 in common form, covering a crucial need in natural language processing (NLP) resources for Bangla.
