Publications by authors named "David Carrell"

Article Synopsis
  • The FDA's Sentinel Innovation Center created a quality-checked network using electronic health records (EHRs) and insurance claims data from over 10 million individuals to enhance regulatory decision-making with real-world data.
  • The resulting network, called the Real-World Evidence Data Enterprise (RWE-DE), includes data from two commercial sources covering 21 million lives and four academic partners covering 4.5 million lives.
  • The report details data completeness, patient populations, and a process for managing free-text notes, while also highlighting potential use cases for RWE-DE to address broader questions in healthcare regulation.
Article Synopsis
  • Post-marketing safety surveillance can be improved by detecting clinical events through spontaneous reporting, but it requires healthcare professionals to be well-informed and aware of the reporting process.
  • The study introduces a new method for identifying incidents using unstructured clinical data and natural language processing, validated against traditional methods for two specific health concerns: suicide attempts and sleep-related behaviors.
  • Results showed that while the new approach effectively identified suicide attempts with decent precision, it struggled more with sleep-related behaviors; additionally, performance varied by race, highlighting the need for careful monitoring and bias reduction in healthcare AI applications.

Introduction: Rapid identification of individuals developing a psychotic spectrum disorder (PSD) is crucial because untreated psychosis is associated with poor outcomes and decreased treatment response. Lack of recognition of early psychotic symptoms often delays diagnosis, further worsening these outcomes.

Methods: The proposed study is a cross-sectional, retrospective analysis of electronic health record data including clinician documentation and patient-clinician secure messages for patients aged 15-29 years with ≥ 1 primary care encounter between 2017 and 2019 within 2 Kaiser Permanente regions.


Objective: To present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data.

Materials And Methods: Drawing on extensive prior phenotyping experiences and insights derived from 3 algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process.

Results: We propose 5 stages of algorithm development and corresponding principles, strategies, and guidelines: (1) assessing fitness-for-purpose, (2) creating gold standard data, (3) feature engineering, (4) model development, and (5) model evaluation.


Objectives: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions.


Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision.

Article Synopsis
  • Clostridioides difficile infection (CDI) is a major cause of diarrhea in hospitals across North America and Europe, leading to significant health risks.
  • Previous risk factors don't fully explain why some people get CDI while others don't, suggesting a genetic component to susceptibility.
  • A study involving nearly 20,000 participants found that variations in the DRB locus of the MHC (HLA) II region may increase the likelihood of developing CDI, indicating that genetic factors could influence how the body responds to this infection.

Objective: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique.

Materials And Methods: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs.


The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable rule-based phenotype algorithms, with natural language processing (NLP) components added to improve the performance of existing algorithms that use electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use.


Objective: The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used.

Materials And Methods: A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries.

Results: Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts.


We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site, 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis, compared with 180 (65%) of 277 in the validation site.
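
The counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the study's code: the function names are hypothetical, and the PPV and sensitivity helpers simply state the standard definitions, since the abstract excerpt does not give the confusion-matrix counts needed to evaluate them.

```python
def fraction(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: TP / (TP + FP). Standard definition."""
    return fraction(true_pos, true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity (recall): TP / (TP + FN). Standard definition."""
    return fraction(true_pos, true_pos + false_neg)


# Counts taken directly from the abstract: (adjudicated events, potential events).
SITES = {"development": (154, 239), "validation": (180, 277)}

# The two sites together should account for all 516 patients in the abstract.
TOTAL_POTENTIAL = sum(potential for _, potential in SITES.values())

if __name__ == "__main__":
    for name, (confirmed, potential) in SITES.items():
        share = fraction(confirmed, potential)
        print(f"{name}: {confirmed}/{potential} = {share:.0%}")
    print(f"total potential events: {TOTAL_POTENTIAL}")
```

The per-site shares round to the 64% and 65% stated in the abstract, and the two denominators sum to the 516 patients it reports.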


Background: Currently available medications for chronic osteoarthritis pain are only moderately effective, and their use is limited in many patients because of serious adverse effects and contraindications. The primary surgical option for osteoarthritis is total joint replacement (TJR). The objectives of this study were to describe the treatment history of patients with osteoarthritis receiving prescription pain medications and/or intra-articular corticosteroid injections, and to estimate the incidence of TJR in these patients.


Background: Acute pancreatitis is a serious gastrointestinal disease that is an important target for drug safety surveillance. Little is known about the accuracy of ICD-10 codes for acute pancreatitis in the United States, or their performance in specific clinical settings. We conducted a validation study to assess the accuracy of acute pancreatitis ICD-10 diagnosis codes in inpatient, emergency department (ED), and outpatient settings.


Importance: Patients who use cannabis for medical reasons may benefit from discussions with clinicians about health risks of cannabis and evidence-based treatment alternatives. However, little is known about the prevalence of medical cannabis use in primary care and how often it is documented in patient electronic health records (EHR).

Objective: To estimate the primary care prevalence of medical cannabis use according to confidential patient survey and to compare the prevalence of medical cannabis use documented in the EHR with patient report.


Background: Patients and their loved ones often report symptoms or complaints of cognitive decline that clinicians note in free clinical text, but no structured screening or diagnostic data are recorded. These symptoms/complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's Disease or related dementias. Our objective was to develop a natural language processing system and prediction model for identification of MCI from clinical text in the absence of screening or other structured diagnostic information.


Most states have legalized medical cannabis, yet little is known about how medical cannabis use is documented in patients' electronic health records (EHRs). We used natural language processing (NLP) to calculate the prevalence of clinician-documented medical cannabis use among adults in an integrated health system in Washington State, where medical and recreational use are legal. We analyzed EHRs of patients ≥18 years old screened for past-year cannabis use (November 1, 2017, to October 31, 2018) to identify clinician-documented medical cannabis use.


Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness.

Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Records and Genomics (eMERGE) Network.

Article Synopsis
  • Genetic testing in individuals without symptoms can reveal carriers of harmful arrhythmia gene variants, but the clinical implications of these findings are still not fully understood.
  • In a study of nearly 22,000 participants, 0.6% were found to carry pathogenic or likely pathogenic variants linked to arrhythmias, with many displaying significant arrhythmia-related health records.
  • Follow-up investigations showed that variant results led to new diagnoses in some individuals, highlighting the potential for genome sequencing to uncover important health information.

Importance: Many people use cannabis for medical reasons despite limited evidence of therapeutic benefit and potential risks. Little is known about medical practitioners' documentation of medical cannabis use or clinical characteristics of patients with documented medical cannabis use.

Objectives: To estimate the prevalence of past-year medical cannabis use documented in electronic health records (EHRs) and to describe patients with EHR-documented medical cannabis use, EHR-documented cannabis use without evidence of medical use (other cannabis use), and no EHR-documented cannabis use.


Chronic Kidney Disease (CKD) represents a slowly progressive disorder that is typically silent until late stages, but early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate ("A-by-G" grid).
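
To make the "A-by-G" grid concrete, the sketch below shows only the final lookup step: mapping one estimated GFR value and one urine albumin-to-creatinine ratio to a grid cell. It is not the published algorithm, which the abstract describes as combining rule-based and machine-learning methods. The cutoffs are the commonly cited KDIGO category boundaries, and the function names and input units (eGFR in mL/min/1.73 m², UACR in mg/g) are assumptions made for illustration.

```python
def g_category(egfr: float) -> str:
    """Map eGFR (mL/min/1.73 m^2, assumed unit) to a KDIGO G category."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def a_category(uacr: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g, assumed unit) to A1-A3."""
    if uacr < 0:
        raise ValueError("UACR cannot be negative")
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def a_by_g(egfr: float, uacr: float) -> tuple[str, str]:
    """Return the (G, A) cell for one pair of lab values."""
    return g_category(egfr), a_category(uacr)


if __name__ == "__main__":
    # Hypothetical lab values, chosen only to exercise the lookup.
    print(a_by_g(52.0, 120.0))
```

A real phenotype would also need to handle repeated measurements over time, missing albuminuria tests, and chronicity criteria, none of which this lookup addresses.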


Identifying genetic risk factors for lumbar spine disorders may lead to knowledge regarding underlying mechanisms and the development of new treatments. We conducted a genome-wide association study involving 100,811 participants with genotypes and longitudinal electronic health record data from the Electronic Medical Records and Genomics Network and Geisinger Health. Cases and controls were defined using validated algorithms and clinical diagnostic codes.
