Pharmacoepidemiol Drug Saf
October 2024
Introduction: Rapid identification of individuals developing a psychotic spectrum disorder (PSD) is crucial because untreated psychosis is associated with poor outcomes and decreased treatment response. Lack of recognition of early psychotic symptoms often delays diagnosis, further worsening these outcomes.
Methods: The proposed study is a cross-sectional, retrospective analysis of electronic health record data including clinician documentation and patient-clinician secure messages for patients aged 15-29 years with ≥ 1 primary care encounter between 2017 and 2019 within 2 Kaiser Permanente regions.
J Am Med Inform Assoc
August 2024
Objective: To present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data.
Materials And Methods: Drawing on extensive prior phenotyping experiences and insights derived from 3 algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process.
Results: We propose 5 stages of algorithm development and corresponding principles, strategies, and guidelines: (1) assessing fitness-for-purpose, (2) creating gold standard data, (3) feature engineering, (4) model development, and (5) model evaluation.
Objectives: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions.
Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision.
PLoS One
May 2023
Objective: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique.
Materials And Methods: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs.
The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use.
Objective: The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used.
Materials And Methods: A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries.
Results: Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts.
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site.
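The adjudication percentages quoted above can be checked with a few lines of arithmetic. This is only a minimal sketch: the function name `confirmed_fraction` is a hypothetical helper introduced here for illustration, and the counts are the ones stated in the abstract.

```python
def confirmed_fraction(confirmed: int, total: int) -> float:
    """Share of potential events that met adjudication criteria."""
    if total <= 0:
        raise ValueError("total must be positive")
    return confirmed / total


# Counts as reported in the abstract (hypothetical variable names).
development = confirmed_fraction(154, 239)
validation = confirmed_fraction(180, 277)

print(f"development site: {development:.1%}")
print(f"validation site:  {validation:.1%}")
```

Rounded to whole percentages, these values agree with the 64% and 65% reported in the text.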
Background: Currently available medications for chronic osteoarthritis pain are only moderately effective, and their use is limited in many patients because of serious adverse effects and contraindications. The primary surgical option for osteoarthritis is total joint replacement (TJR). The objectives of this study were to describe the treatment history of patients with osteoarthritis receiving prescription pain medications and/or intra-articular corticosteroid injections, and to estimate the incidence of TJR in these patients.
Background: Acute pancreatitis is a serious gastrointestinal disease that is an important target for drug safety surveillance. Little is known about the accuracy of ICD-10 codes for acute pancreatitis in the United States, or their performance in specific clinical settings. We conducted a validation study to assess the accuracy of acute pancreatitis ICD-10 diagnosis codes in inpatient, emergency department (ED), and outpatient settings.
Importance: Patients who use cannabis for medical reasons may benefit from discussions with clinicians about health risks of cannabis and evidence-based treatment alternatives. However, little is known about the prevalence of medical cannabis use in primary care and how often it is documented in patient electronic health records (EHR).
Objective: To estimate the primary care prevalence of medical cannabis use according to confidential patient survey and to compare the prevalence of medical cannabis use documented in the EHR with patient report.
Background: Patients and their loved ones often report symptoms or complaints of cognitive decline that clinicians note in free clinical text, but no structured screening or diagnostic data are recorded. These symptoms/complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's Disease or related dementias. Our objective was to develop a natural language processing system and prediction model for identification of MCI from clinical text in the absence of screening or other structured diagnostic information.
Most states have legalized medical cannabis, yet little is known about how medical cannabis use is documented in patients' electronic health records (EHRs). We used natural language processing (NLP) to calculate the prevalence of clinician-documented medical cannabis use among adults in an integrated health system in Washington State where medical and recreational use are legal. We analyzed EHRs of patients ≥18 years old screened for past-year cannabis use (November 1, 2017-October 31, 2018), to identify clinician-documented medical cannabis use.
Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness.
Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Records and Genomics (eMERGE) Network.
Importance: Many people use cannabis for medical reasons despite limited evidence of therapeutic benefit and potential risks. Little is known about medical practitioners' documentation of medical cannabis use or clinical characteristics of patients with documented medical cannabis use.
Objectives: To estimate the prevalence of past-year medical cannabis use documented in electronic health records (EHRs) and to describe patients with EHR-documented medical cannabis use, EHR-documented cannabis use without evidence of medical use (other cannabis use), and no EHR-documented cannabis use.
Chronic Kidney Disease (CKD) represents a slowly progressive disorder that is typically silent until late stages, but early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate ("A-by-G" grid).
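To make the "A-by-G" grid concrete, the sketch below places a single patient on it from an eGFR value and a urine albumin-to-creatinine ratio. It is not the authors' algorithm: the cut points are the widely used KDIGO category boundaries, and the function names and example values are assumptions introduced here for illustration. The published phenotype also combines rule-based and machine-learning steps that this sketch does not attempt.

```python
def gfr_category(egfr: float) -> str:
    """Map eGFR (mL/min/1.73 m^2) to a KDIGO G category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g) to a KDIGO A category."""
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def a_by_g_cell(egfr: float, uacr: float) -> tuple[str, str]:
    """Return the (G, A) cell of the staging grid for one patient."""
    return gfr_category(egfr), albuminuria_category(uacr)


# Hypothetical patient: moderately reduced eGFR, moderately increased albuminuria.
print(a_by_g_cell(egfr=52.0, uacr=120.0))
```

A real implementation would also need to handle missing laboratory values, unit conversions, and the chronicity requirement (abnormalities persisting over time), none of which is covered here.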
Identifying genetic risk factors for lumbar spine disorders may lead to knowledge regarding underlying mechanisms and the development of new treatments. We conducted a genome-wide association study involving 100,811 participants with genotypes and longitudinal electronic health record data from the Electronic Medical Records and Genomics Network and Geisinger Health. Cases and controls were defined using validated algorithms and clinical diagnostic codes.