Background: Few studies describe how production electronic medical record (EMR) systems can be systematically queried to identify clinically-defined populations, and few use free text in this process. This study aims to provide a generalisable methodology for constructing clinically-defined, EMR-derived patient cohorts from both structured and unstructured EMR data.

Methods: Patients with possible acute coronary syndrome (ACS) were used as an exemplar. Cardiologists defined clinical criteria for patients presenting with possible ACS. These criteria were mapped to data tables within the production EMR system, creating seven inclusion criteria comprising structured data fields (orders and investigations, procedures, scanned electrocardiogram (ECG) images, and diagnostic codes) and unstructured clinical documentation. Data were extracted from two local health districts (LHDs) in Sydney, Australia. Outcome measures included the relative contribution of individual inclusion criteria to the identification of eligible encounters, comparisons between inclusion criteria, and the consistency of data extracts across years and LHDs.
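As a rough illustration of the approach described in the Methods, the sketch below flags an encounter as eligible when it meets any inclusion criterion, whether from a structured field or from a keyword match in free-text notes. It is a simplified sketch, not the study's implementation: the `Encounter` fields, the criterion names, and the keyword list are hypothetical, and only five of the seven criteria are represented.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword list; the study's actual symptom and keyword terms
# are not given in this abstract.
ACS_KEYWORDS = ["chest pain", "chest tightness", "acs", "stemi", "nstemi"]
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in ACS_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


@dataclass
class Encounter:
    """One patient encounter with illustrative (assumed) field names."""
    encounter_id: str
    has_ecg_image: bool = False                # scanned ECG image present
    has_relevant_order: bool = False           # order or investigation
    has_relevant_procedure: bool = False
    has_relevant_diagnosis_code: bool = False  # e.g. ICD-10 / SNOMED CT
    notes: list[str] = field(default_factory=list)  # free-text documentation


def matched_criteria(enc: Encounter) -> set[str]:
    """Return the names of all inclusion criteria this encounter meets."""
    matched = set()
    if enc.has_ecg_image:
        matched.add("ecg_image")
    if enc.has_relevant_order:
        matched.add("orders_investigations")
    if enc.has_relevant_procedure:
        matched.add("procedures")
    if enc.has_relevant_diagnosis_code:
        matched.add("diagnostic_codes")
    if any(KEYWORD_RE.search(note) for note in enc.notes):
        matched.add("documentation_keywords")
    return matched


def is_eligible(enc: Encounter) -> bool:
    """An encounter is eligible if it meets at least one criterion."""
    return bool(matched_criteria(enc))
```

Recording which criteria matched, rather than a single yes/no flag, is what makes it possible to compare criteria against one another afterwards.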

Results: Among 802,742 encounters in a 5-year dataset (1/1/13-30/12/17), the presence of an ECG image (54.8% of encounters) and symptoms and keywords in clinical documentation (41.4-64.0%) were used most often to identify presentations of possible ACS. Orders and investigations (27.3%) and procedures (1.4%) were less often present for identified presentations. Relevant ICD-10/SNOMED CT codes were present for 3.7% of identified encounters. Similar trends were seen when the two LHDs were examined separately, and across years.
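The percentages above describe how often each criterion was present among identified encounters. A minimal sketch of that calculation follows, assuming a hypothetical mapping from encounter ID to the set of criteria it matched; the function name and the toy data are illustrative and are not taken from the study.

```python
from collections import Counter


def criterion_contribution(matches_by_encounter: dict[str, set[str]]) -> dict[str, float]:
    """Percentage of eligible encounters that met each criterion.

    An encounter counts as eligible if it matched at least one criterion.
    """
    eligible = [crit for crit in matches_by_encounter.values() if crit]
    if not eligible:
        return {}
    counts = Counter(name for crit in eligible for name in crit)
    return {name: round(100 * n / len(eligible), 1) for name, n in counts.items()}


# Toy data (hypothetical): four encounters, one of which matched nothing.
toy = {
    "e1": {"ecg_image", "documentation_keywords"},
    "e2": {"documentation_keywords"},
    "e3": {"ecg_image", "orders_investigations"},
    "e4": set(),
}
contribution = criterion_contribution(toy)
```

Because a single encounter can meet several criteria, the per-criterion percentages need not sum to 100%, which is consistent with the figures reported above.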

Conclusions: Constructing clinically-defined, EMR-derived cohorts by combining structured and unstructured data during cohort identification is a necessary prerequisite for the critical validation work required to develop real-time clinical decision support and learning health systems.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938556
DOI: http://dx.doi.org/10.1186/s12911-021-01441-w

Similar Publications

Article Synopsis
  • Delayed surgeries significantly increase the risk of disease progression and negative outcomes in lung cancer patients, particularly those with Non-Small Cell Lung Cancer (NSCLC).
  • The study analyzed data from 3,088 NSCLC patients, revealing that over 40% experienced delays in surgical treatment due to geographic variability and various risk factors.
  • Key factors contributing to these delays included advanced cancer stages, treatment at specific regional hospitals, existing health conditions, and diagnoses made during the COVID-19 pandemic.

Amid an aging global population, heart failure has become a leading cause of hospitalization among older people. Its high prevalence and mortality rates underscore the importance of accurate mortality prediction for swift disease progression assessment and better patient outcomes. The evolution of artificial intelligence (AI) presents new avenues for predicting heart failure mortality.

Identification of Naloxone in Emergency Medical Services Data Substantially Improves by Processing Unstructured Patient Care Narratives.

Prehosp Emerg Care

January 2025

Institute for Pharmaceutical Outcomes & Policy, Department of Pharmacy Practice and Science, College of Pharmacy, University of Kentucky, Lexington KY 40508, USA.

Objectives: Structured data fields, including medication fields involving naloxone, are routinely used to identify opioid overdoses in emergency medical services (EMS) data; between January 2021 and March 2024, there were approximately 1.2 million instances of naloxone administration in the United States.

Background: Natural language processing (NLP) and machine learning (ML) techniques may help harness unstructured free-text electronic health record (EHR) data to detect adverse drug events (ADEs) and thus improve pharmacovigilance. However, evidence of their real-world effectiveness remains unclear.

Objective: To summarise the evidence on the effectiveness of NLP/ML in detecting ADEs from unstructured EHR data, and ultimately in improving pharmacovigilance, in comparison to other data sources.

Background: Natural language processing (NLP) enables the extraction of information embedded within unstructured texts, such as clinical case reports and trial eligibility criteria. By identifying relevant medical concepts, NLP facilitates the generation of structured and actionable data, supporting complex tasks like cohort identification and the analysis of clinical records. To accomplish those tasks, we introduce a deep learning-based and lexicon-based named entity recognition (NER) tool for texts in Spanish.
