Background: Care home residents are a highly vulnerable group, but identifying them in routine data is challenging. This study aimed to develop and validate Natural Language Processing (NLP) methods to identify care home residents from primary care address records.
Methods: The proposed system applies sequential NLP filtering and text preprocessing, then calculates similarity scores between general practice (GP) addresses and registered care home addresses. Performance was evaluated in a diagnostic test study comparing NLP predictions with independent, gold-standard manual identification of care home addresses. The analysis used population data comprising 771,588 uniquely written addresses for 819,911 people in two NHS Scotland health board regions. The source code is publicly available at https://github.com/vsuarezpaniagua/NLPcarehome.
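As a rough illustration of the preprocess-then-score idea described in the Methods, the following minimal sketch normalises two address strings and compares them with cosine similarity over character trigram counts. It is not the authors' pipeline (see the linked repository for that): the trigram representation, the function names `normalise`, `char_ngrams`, `cosine_similarity` and `best_match`, and the sample addresses are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def normalise(address: str) -> str:
    """Lower-case, replace punctuation with spaces, collapse whitespace.

    Illustrative preprocessing only; the study's filtering steps may differ.
    """
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", address.lower())
    return " ".join(cleaned.split())


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams, padding with one space each side."""
    padded = f" {text} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(gp_address: str, care_home_addresses: list[str]) -> tuple[float, str]:
    """Return (score, address) for the care home address most similar to gp_address."""
    gp_vec = char_ngrams(normalise(gp_address))
    scored = [
        (cosine_similarity(gp_vec, char_ngrams(normalise(home))), home)
        for home in care_home_addresses
    ]
    return max(scored)


# Hypothetical addresses, invented for this example.
care_homes = [
    "Rose Lodge Care Home, 12 High Street, Cupar",
    "Elm Court Nursing Home, 3 Mill Road, Dundee",
]
score, match = best_match("ROSE LODGE CARE HOME 12 HIGH ST CUPAR", care_homes)
print(f"{match!r} scored {score:.3f}")
```

In a setting like the one described, a GP address would presumably be flagged as a care home when its best score exceeds a chosen threshold; the threshold and the choice of distance (Cosine, Correlation, Jensen-Shannon, City Block) are tuning decisions this sketch does not attempt to reproduce.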
Results: Overall, NLP methods identified care home residents better in Fife than in Tayside, and better in people aged ≥65 years than in the whole population. The best-performing methods were Correlation (sensitivity 90.2%, PPV 92.0%) for Fife data and Cosine (sensitivity 90.4%, PPV 93.7%) for Tayside. For people aged ≥65 years, the best methods were Jensen-Shannon (sensitivity 91.5%, PPV 98.7%) for Fife and City Block (sensitivity 94.4%, PPV 98.3%) for Tayside. These results show the feasibility of applying NLP methods to real data and indicate that computing address similarities outperforms previous approaches.
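For readers unfamiliar with the two metrics reported above, the sketch below computes sensitivity (true positives divided by all actual care home residents) and positive predictive value, PPV (true positives divided by all predicted care home residents), from paired labels. The function name `sensitivity_and_ppv` and the toy labels are assumptions for illustration; they are not data from the study.

```python
import math


def sensitivity_and_ppv(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Return (sensitivity, PPV) for paired predicted and gold-standard labels.

    A metric is NaN when its denominator is zero.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # correctly flagged
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # flagged in error
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # missed residents
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    return sensitivity, ppv


# Toy labels, invented for this example: True means "care home resident".
predicted = [True, True, False, True, False]
actual = [True, False, True, True, False]
sens, ppv = sensitivity_and_ppv(predicted, actual)
print(f"sensitivity={sens:.3f}, PPV={ppv:.3f}")
```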
Conclusions: Address-matching techniques using NLP methods can determine with reasonable accuracy whether individuals live in a care home based on their GP-registered addresses. The performance of the system exceeds previously reported results for methods such as Postcode matching, Markov score, and Phonics score.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11620595 | PMC |
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0309341 | PLOS |
Int J Mol Sci
December 2024
School of Computer Science, University College Dublin (UCD), D04 V1W8 Dublin, Ireland.
Accurate protein secondary structure prediction (PSSP) is crucial for understanding protein function, which is foundational to advancements in drug development, disease treatment, and biotechnology. Researchers gain critical insights into protein folding and function within cells by predicting protein secondary structures. The advent of deep learning models, capable of processing complex sequence data and identifying meaningful patterns, offers substantial potential to enhance the accuracy and efficiency of protein structure predictions.
Comput Methods Programs Biomed
January 2025
Laberit, Avda. de Catalunya, 9, València, 46020, Spain.
Background And Objective: Despite significant investments in the normalization and the standardization of Electronic Health Records (EHRs), free text is still the rule rather than the exception in clinical notes. The use of free text has implications in data reuse methods used for supporting clinical research since the query mechanisms used in cohort definition and patient matching are mainly based on structured data and clinical terminologies. This study aims to develop a method for the secondary use of clinical text by: (a) using Natural Language Processing (NLP) for tagging clinical notes with biomedical terminology; and (b) designing an ontology that maps and classifies all the identified tags to various terminologies and allows for running phenotyping queries.
Lung Cancer
January 2025
Dept. of Medical Oncology, Princess Margaret Cancer Center, Toronto, ON, Canada.
Background: Manual extraction of real-world clinical data for research can be time-consuming and prone to error. We assessed the feasibility of using natural language processing (NLP), an AI technique, to automate data extraction for patients with advanced lung cancer (aLC). We assessed the external validity of our NLP-extracted data by comparing our findings to those reported in the literature.
J Med Internet Res
January 2025
Division of Sleep Medicine, Harvard Medical School, Boston, MA, United States.
Background: People share health-related experiences and treatments, such as for insomnia, in digital communities. Natural language processing tools can be leveraged to understand the terms used in digital spaces to discuss insomnia and insomnia treatments.
Objective: The aim of this study is to summarize and chart trends of insomnia treatment terms on a digital insomnia message board.
Prehosp Emerg Care
January 2025
Institute for Pharmaceutical Outcomes & Policy, Department of Pharmacy Practice and Science, College of Pharmacy, University of Kentucky, Lexington KY 40508, USA.
Objectives: Structured data fields, including medication fields involving naloxone, are routinely used to identify opioid overdoses in emergency medical services (EMS) data; between January 2021 and March 2024, there were approximately 1.2 million instances of naloxone administration in the United States.