Publications by authors named "Matthew Spotnitz"

Article Synopsis
  • Large language models (LLMs) show potential in summarizing medical evidence, but using proprietary models can lead to issues like lack of transparency and reliance on specific vendors.
  • This study focused on enhancing the performance of open-source LLMs by fine-tuning three models—PRIMERA, LongT5, and Llama-2—using a dataset of 8,161 systematic reviews and summaries.
  • Fine-tuning resulted in significant performance improvements, with LongT5 performing similarly to GPT-3.5 in certain settings, indicating that smaller models can outperform larger models in specific tasks, like summarizing medical evidence.
Article Synopsis
  • Large language models (LLMs) show potential in summarizing medical evidence but are often limited by issues such as lack of transparency when using proprietary models.
  • This study examines the effects of fine-tuning open-source LLMs like PRIMERA, LongT5, and Llama-2 to enhance their performance, using a dataset of systematic reviews and summaries.
  • Results indicate that fine-tuning improves the performance of open-source models, with LongT5 performing nearly as well as GPT-3.5, and smaller fine-tuned models sometimes outperforming larger models in evaluations.

Purpose: The specific aims of this paper are to (1) develop and operationalize an electronic health record (EHR) data quality framework, (2) apply the dimensions of the framework to the phenotype and treatment pathways of ductal carcinoma in situ (DCIS) using All of Us Research Program data, and (3) propose and apply a checklist to evaluate the application of the framework.

Methods: We developed a framework of five data quality dimensions (DQD; completeness, concordance, conformance, plausibility, and temporality). Participants signed a consent and Health Insurance Portability and Accountability Act authorization to share EHR data and responded to demographic questions in the Basics questionnaire.


Introduction: Electronic Health Records (EHR) are a useful data source for research, but their usability is hindered by measurement errors. This study investigated an automatic error detection algorithm for adult height and weight measurements in EHR for the All of Us Research Program (All of Us).

Methods: We developed reference charts for adult heights and weights that were stratified on participant sex.

Article Synopsis
  • C2Q 3.0 is a new system that uses GPT-4 technology to automate the process of identifying eligible patients for clinical trials by turning trial eligibility texts into database queries.
  • The system's performance was tested through concept extraction from clinical trials, where it scored 0.891 for accuracy, and it found multiple errors in the SQL queries generated, with logic errors being the most frequent.
  • Overall, while C2Q 3.0 showed strong coherence in reasoning, there’s still room for improvement in readability, highlighting the need for further research to enhance the reliability of AI in clinical settings.

Background: Current hemovigilance methods generally rely on survey data or administrative claims data utilizing billing and revenue codes, each of which has limitations. We used electronic health records (EHR) linked to blood bank data to comprehensively characterize red blood cell (RBC) utilization patterns and trends in three healthcare systems participating in the U.S.

Article Synopsis
  • Patients with drug-resistant epilepsy (DRE) need thorough neurodiagnostic evaluations, but there are significant delays in referrals and underutilization of surgery, particularly in diverse US settings.
  • This study seeks to analyze the rates and factors influencing neurodiagnostic evaluations for DRE patients across three different US cohorts using extensive medical data.
  • The findings reveal low rates of comprehensive evaluations among DRE patients, with only about 4.5% in the Medicaid cohort, 8.0% in the commercial insurance cohort, and 14.3% at Columbia University Medical Center.
Article Synopsis
  • Large language models (LLMs) like ChatGPT show potential for various clinical applications, but few healthcare providers have shared their views on their suitability for use in practice.
  • A survey of 30 practicing clinicians explored their comfort levels with LLMs across 23 tasks; 16 of these tasks received positive feedback from over 50% of respondents, who highlighted the models' strong synthesis skills and efficiency.
  • While clinicians are supportive of using LLMs, especially in assistive roles, they expressed concerns regarding the accuracy and biases associated with the information generated by these models.
Article Synopsis
  • Researchers wanted to find out if using data from electronic health records is better than looking at patient charts for studying health conditions.
  • They created a tool called KEEPER that organizes important health information to help understand these conditions faster and more clearly.
  • The results showed that KEEPER helps doctors agree on patient diagnoses more often and does it in half the time compared to traditional chart reviews.
Article Synopsis
  • The study addresses challenges in selecting computational phenotypes for research by proposing a novel metadata framework to improve retrieval and reuse of these phenotypes.
  • Twenty active researchers contributed to identifying 39 relevant metadata elements, which were then evaluated through surveys and annotation tasks involving type-2 diabetes mellitus phenotypes.
  • Results showed over 90% satisfaction with the framework's utility, highlighting effectiveness in phenotype description and validation, while noting challenges in data collection complexity and associated costs.
Article Synopsis
  • The study investigates COVID-19 patients with chronic obstructive pulmonary disease (COPD) using data from 13 databases across North America, Europe, and Asia between January and June 2020.
  • It examines two groups of COVID-19 patients: those diagnosed with COVID-19 and those hospitalized, highlighting the prevalence of COPD among these groups and noting higher comorbidities and mortality rates in hospitalized patients.
  • Key findings reveal significant variations in COPD prevalence by region, increased risk of severe outcomes like ARDS and sepsis in hospitalized patients, and the need for further research to identify high-risk COPD patients.
Article Synopsis
  • Methods: Researchers analyzed data from 600 patients categorized into different epilepsy types based on a manual review of electronic health records, testing various demographic factors and treatment codes for their relationship with DRE.
  • Results: Out of 412 epilepsy patients, 15% were identified as having DRE, with the most effective identification method yielding a high specificity but moderate sensitivity, highlighting trade-offs in different classification algorithms.

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations.


Easy access to large quantities of accurate health data is required to understand medical and scientific information in real-time; evaluate public health measures before, during, and after times of crisis; and prevent medical errors. Introducing a system in the USA that allows for efficient access to such health data and ensures auditability of data facts, while avoiding data silos, will require fundamental changes in current practices. Here, we recommend the implementation of standardized data collection and transmission systems, universal identifiers for individual patients and end users, a reference standard infrastructure to support calibration and integration of laboratory results from equivalent tests, and modernized working practices.


Measurement concepts are essential to observational healthcare research; however, a lack of concept harmonization limits the quality of research that can be done on multisite research networks. We developed five methods that used a combination of automated, semi-automated and manual approaches for generating measurement concept sets. We validated our concept sets by calculating their frequencies in cohorts from the Columbia University Irving Medical Center (CUIMC) database.

Article Synopsis
  • The study emphasizes the importance of real-world data (RWD) for understanding and responding to the COVID-19 pandemic, using a standardized approach through the CHARYBDIS framework.
  • Researchers conducted a retrospective database study across multiple countries, including the US and parts of Europe and Asia, involving over 4.5 million individuals and focusing on their clinical characteristics and outcomes.
  • Findings reveal higher diagnoses among women but more hospitalizations among men, common comorbidities like diabetes and heart disease, and key symptoms such as cough and fever; this data helps to identify trends in COVID-19 across different populations and time periods.

Introduction: Efforts to characterize variability in epilepsy treatment pathways are limited by the large number of possible antiseizure medication (ASM) regimens and sequences, heterogeneity of patients, and challenges of measuring confounding variables and outcomes across institutions. The Observational Health Data Science and Informatics (OHDSI) collaborative is an international data network representing over 1 billion patient records using common data standards. However, few studies have applied OHDSI's Common Data Model (CDM) to the population with epilepsy and none have validated relevant concepts.

Article Synopsis
  • The study aimed to develop COVID-19 prediction models using influenza data to quickly and accurately assess risks of hospital admission and death in patients diagnosed with COVID-19.
  • The researchers created three COVID-19 Estimated Risk (COVER) scores that quantify risks related to pneumonia and mortality based on historical data and validated them using a large dataset of COVID-19 patients across multiple countries.
  • They found that seven key health predictors, along with age and sex, effectively distinguished which patients were likely to face severe outcomes, achieving strong performance in model validation.

Many patients with bipolar disorder (BD) are initially misdiagnosed with major depressive disorder (MDD) and are treated with antidepressants, whose potential iatrogenic effects are widely discussed. It is unknown whether MDD is a comorbidity of BD or its earlier stage, and no consensus exists on individual conversion predictors, delaying BD's timely recognition and treatment. We aimed to build a predictive model of MDD to BD conversion and to validate it across a multi-national network of patient databases using the standardization afforded by the Observational Medical Outcomes Partnership (OMOP) common data model.


Multi-center observational studies require recognition and reconciliation of differences in patient representations arising from underlying populations, disparate coding practices and specifics of data capture. This leads to different granularity or detail of concepts representing the clinical facts. For researchers studying certain populations of interest, it is important to ensure that concepts at the right level are used for the definition of these populations.

Article Synopsis
  • The study aimed to assess the 30-day outcomes and mortality of patients with autoimmune diseases hospitalized due to COVID-19, comparing them to similar hospital patients with seasonal influenza.
  • Researchers analyzed data from multiple health institutions and found that most patients were older females with significant comorbidities.
  • Results indicated that COVID-19 led to more respiratory complications and higher mortality rates (up to 24.6%) compared to influenza (up to 4.3%).
Article Synopsis
  • Routinely collected real-world data (RWD) is essential for understanding and responding to the COVID-19 pandemic, as demonstrated by the CHARYBDIS framework for standardizing and analyzing this data.
  • A descriptive cohort study involving over 4.5 million individuals was conducted across the U.S., Europe, and Asia to examine COVID-19-related health risks and outcomes, with detailed information available on an interactive website.
  • The findings from the CHARYBDIS study serve as benchmarks to enhance our knowledge of COVID-19's progression and management, facilitating timely evaluations of new preventative and therapeutic strategies.
Article Synopsis
  • The COVID-19 vulnerability (C-19) index was developed to predict which patients might need hospitalization for pneumonia related to COVID-19 but is at risk of bias and lacks external validation.
  • The study aimed to externally validate the C-19 index using data from various healthcare settings and target populations to determine its predictive capabilities for hospitalization due to pneumonia.
  • Results showed that while the C-19 index performed moderately well in internal validation, its external validation yielded low predictive accuracy across different countries, suggesting that it may underestimate the actual risk of hospitalization.
Article Synopsis
  • This study aimed to determine how many COVID-19 patients hospitalized in the U.S. needed procedures like dialysis, tracheostomy, and ECMO.
  • It analyzed data from 842,928 hospitalized COVID-19 patients, revealing that about 4.17% received dialysis, while less than 1% had tracheostomy or ECMO interventions.
  • Findings showed that ECMO was more frequently used in younger males with fewer health issues, while tracheostomy rates were similar across demographics, and dialysis was more common in males and those with chronic kidney disease.