Publications by Matthew Spotnitz

Closing the gap between open-source and commercial large language models for medical evidence summarization.

Gongbo Zhang Qiao Jin Yiliang Zhou Song Wang Betina R Idnay Matthew E Spotnitz

ArXiv

July 2024

Article Synopsis

Large language models (LLMs) show potential in summarizing medical evidence, but using proprietary models can lead to issues like lack of transparency and reliance on specific vendors.
This study focused on enhancing the performance of open-source LLMs by fine-tuning three models—PRIMERA, LongT5, and Llama-2—using a dataset of 8,161 systematic reviews and summaries.
Fine-tuning resulted in significant performance improvements, with LongT5 performing similarly to GPT-3.5 in certain settings, indicating that smaller models can outperform larger models in specific tasks, like summarizing medical evidence.

Closing the gap between open source and commercial large language models for medical evidence summarization.

Gongbo Zhang Qiao Jin Yiliang Zhou Song Wang Betina Idnay Matthew E Spotnitz

NPJ Digit Med

September 2024

Article Synopsis

Large language models (LLMs) show potential in summarizing medical evidence but are often limited by issues such as lack of transparency when using proprietary models.
This study examines the effects of fine-tuning open-source LLMs like PRIMERA, LongT5, and Llama-2 to enhance their performance, using a dataset of systematic reviews and summaries.
Results indicate that fine-tuning improves the performance of open-source models, with LongT5 performing nearly as well as GPT-3.5, and smaller fine-tuned models sometimes outperforming larger models in evaluations.

Application of a Data Quality Framework to Ductal Carcinoma In Situ Using Electronic Health Record Data From the All of Us Research Program.

Lew Berman Yechiam Ostchega John Giannini Lakshmi Priya Anandan Emily Clark Matthew Spotnitz

JCO Clin Cancer Inform

August 2024

Purpose: The specific aims of this paper are to (1) develop and operationalize an electronic health record (EHR) data quality framework, (2) apply the dimensions of the framework to the phenotype and treatment pathways of ductal carcinoma in situ (DCIS) using All of Us Research Program data, and (3) propose and apply a checklist to evaluate the application of the framework.

Methods: We developed a framework of five data quality dimensions (DQD; completeness, concordance, conformance, plausibility, and temporality). Participants signed a consent and Health Insurance Portability and Accountability Act authorization to share EHR data and responded to demographic questions in the Basics questionnaire.

Identifying erroneous height and weight values from adult electronic health records in the All of Us research program.

Andrew Guide Lina Sulieman Shawn Garbett Robert M Cronin Matthew Spotnitz

J Biomed Inform

July 2024

Introduction: Electronic Health Records (EHR) are a useful data source for research, but their usability is hindered by measurement errors. This study investigated an automatic error detection algorithm for adult height and weight measurements in EHR for the All of Us Research Program (All of Us).

Methods: We developed reference charts for adult heights and weights that were stratified on participant sex.

Criteria2Query 3.0: Leveraging generative large language models for clinical trial eligibility query generation.

Jimyung Park Yilu Fang Casey Ta Gongbo Zhang Betina Idnay Matthew Spotnitz

J Biomed Inform

June 2024

Article Synopsis

C2Q 3.0 is a new system that uses GPT-4 technology to automate the process of identifying eligible patients for clinical trials by turning trial eligibility texts into database queries.
The system was evaluated on concept extraction from clinical trials, where it scored 0.891 for accuracy; the evaluation also found multiple errors in the generated SQL queries, with logic errors being the most frequent.
Overall, while C2Q 3.0 showed strong coherence in reasoning, there’s still room for improvement in readability, highlighting the need for further research to enhance the reliability of AI in clinical settings.

Patterns of red blood cell utilization: Harnessing electronic health records data from the Information Standard for Blood and Transplant (ISBT) 128 system within the Biologics Effectiveness and Safety (BEST) initiative.

Joyce Obidi Gayathri Sridhar Graça M Dores Barbee Whitaker Carlos H Villa Matthew Spotnitz

Transfusion

June 2024

Background: Current hemovigilance methods generally rely on survey data or administrative claims data utilizing billing and revenue codes, each of which has limitations. We used electronic health records (EHR) linked to blood bank data to comprehensively characterize red blood cell (RBC) utilization patterns and trends in three healthcare systems participating in the U.S.

Use of Recommended Neurodiagnostic Evaluation Among Patients With Drug-Resistant Epilepsy.

Matthew Spotnitz Cameron D Ekanayake Anna Ostropolets Guy M McKhann Hyunmi Choi

JAMA Neurol

May 2024

Article Synopsis

Patients with drug-resistant epilepsy (DRE) need thorough neurodiagnostic evaluations, but there are significant delays in referrals and underutilization of surgery, particularly in diverse US settings.
This study seeks to analyze the rates and factors influencing neurodiagnostic evaluations for DRE patients across three different US cohorts using extensive medical data.
The findings reveal low rates of comprehensive evaluations among DRE patients, with only about 4.5% in the Medicaid cohort, 8.0% in the commercial insurance cohort, and 14.3% at Columbia University Medical Center.

A Survey of Clinicians' Views of the Utility of Large Language Models.

Matthew Spotnitz Betina Idnay Emily R Gordon Rebecca Shyu Gongbo Zhang

Appl Clin Inform

March 2024

Article Synopsis

Large language models (LLMs) like ChatGPT show potential for various clinical applications, but few healthcare providers have shared their views on their suitability for use in practice.
A survey of 30 practicing clinicians explored their comfort levels with LLMs across 23 tasks; 16 of these tasks received positive feedback from over 50% of respondents, who highlighted the models' strong synthesis skills and efficiency.
While clinicians are supportive of using LLMs, especially in assistive roles, they expressed concerns regarding the accuracy and biases associated with the information generated by these models.

Scalable and interpretable alternative to chart review for phenotype evaluation using standardized structured data from electronic health records.

Anna Ostropolets George Hripcsak Syed A Husain Lauren R Richter Matthew Spotnitz

J Am Med Inform Assoc

December 2023

Article Synopsis

Researchers wanted to find out if using data from electronic health records is better than looking at patient charts for studying health conditions.
They created a tool called KEEPER that organizes important health information to help understand these conditions faster and more clearly.
The results showed that KEEPER helps doctors agree on patient diagnoses more often and does it in half the time compared to traditional chart reviews.

A metadata framework for computational phenotypes.

Matthew Spotnitz Nripendra Acharya James J Cimino Shawn Murphy Bahram Namjou

JAMIA Open

July 2023

Article Synopsis

The study addresses challenges in selecting computational phenotypes for research by proposing a novel metadata framework to improve retrieval and reuse of these phenotypes.
Twenty active researchers contributed to identifying 39 relevant metadata elements, which were then evaluated through surveys and annotation tasks involving type-2 diabetes mellitus phenotypes.
Results showed over 90% satisfaction with the framework's utility, highlighting effectiveness in phenotype description and validation, while noting challenges in data collection complexity and associated costs.

Characteristics and outcomes of COVID-19 patients with COPD from the United States, South Korea, and Europe.

David Moreno-Martos Katia Verhamme Anna Ostropolets Kristin Kostka Talita Duarte-Salles Matthew Spotnitz

Wellcome Open Res

March 2022

Article Synopsis

The study investigates COVID-19 patients with chronic obstructive pulmonary disease (COPD) using data from 13 databases across North America, Europe, and Asia between January and June 2020.
It examines two groups of COVID-19 patients: those diagnosed with COVID-19 and those hospitalized, highlighting the prevalence of COPD among these groups and noting higher comorbidities and mortality rates in hospitalized patients.
Key findings reveal significant variations in COPD prevalence by region, increased risk of severe outcomes like ARDS and sepsis in hospitalized patients, and the need for further research to identify high-risk COPD patients.

Identification of patients with drug-resistant epilepsy in electronic medical record data using the Observational Medical Outcomes Partnership Common Data Model.

Victor G Castano Matthew Spotnitz Genna J Waldman Evan F Joiner Hyunmi Choi

Epilepsia

November 2022

Article Synopsis

Methods: Researchers analyzed data from 600 patients categorized into different epilepsy types based on a manual review of electronic health records, testing various demographic factors and treatment codes for their relationship with DRE.
Results: Out of 412 epilepsy patients, 15% were identified as having DRE, with the most effective identification method yielding a high specificity but moderate sensitivity, highlighting trade-offs in different classification algorithms.

OARD: Open annotations for rare diseases and their phenotypes based on real-world data.

Cong Liu Casey N Ta Jim M Havrilla Jordan G Nestor Matthew E Spotnitz

Am J Hum Genet

September 2022

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations.

Recommendations for achieving interoperable and shareable medical data in the USA.

Ana Szarfman Jonathan G Levine Joseph M Tonning Frank Weichold John C Bloom Matthew Spotnitz

Commun Med (Lond)

July 2022

Easy access to large quantities of accurate health data is required to understand medical and scientific information in real-time; evaluate public health measures before, during, and after times of crisis; and prevent medical errors. Introducing a system in the USA that allows for efficient access to such health data and ensures auditability of data facts, while avoiding data silos, will require fundamental changes in current practices. Here, we recommend the implementation of standardized data collection and transmission systems, universal identifiers for individual patients and end users, a reference standard infrastructure to support calibration and integration of laboratory results from equivalent tests, and modernized working practices.

Harmonization of Measurement Codes for Concept-Oriented Lab Data Retrieval.

Matthew Spotnitz Jason Patterson Vojtech Huser Chunhua Weng Karthik Natarajan

Stud Health Technol Inform

June 2022

Measurement concepts are essential to observational healthcare research; however, a lack of concept harmonization limits the quality of research that can be done on multisite research networks. We developed five methods that used a combination of automated, semi-automated and manual approaches for generating measurement concept sets. We validated our concept sets by calculating their frequencies in cohorts from the Columbia University Irving Medical Center (CUIMC) database.

Unraveling COVID-19: A Large-Scale Characterization of 4.5 Million COVID-19 Cases Using CHARYBDIS.

Kristin Kostka Talita Duarte-Salles Albert Prats-Uribe Anthony G Sena Andrea Pistillo Matthew Spotnitz

Clin Epidemiol

March 2022

Article Synopsis

The study emphasizes the importance of real-world data (RWD) for understanding and responding to the COVID-19 pandemic using a standardized approach through the CHARYBDIS framework.
Researchers conducted a retrospective database study across multiple countries, including the US and parts of Europe and Asia, involving over 4.5 million individuals and focusing on their clinical characteristics and outcomes.
Findings reveal higher diagnoses among women but more hospitalizations among men, common comorbidities like diabetes and heart disease, and key symptoms such as cough and fever; this data helps to identify trends in COVID-19 across different populations and time periods.

Patient characteristics and antiseizure medication pathways in newly diagnosed epilepsy: Feasibility and pilot results using the common data model in a single-center electronic medical record database.

Matthew Spotnitz Anna Ostropolets Victor G Castano Karthik Natarajan Genna J Waldman

Epilepsy Behav

April 2022

Introduction: Efforts to characterize variability in epilepsy treatment pathways are limited by the large number of possible antiseizure medication (ASM) regimens and sequences, heterogeneity of patients, and challenges of measuring confounding variables and outcomes across institutions. The Observational Health Data Science and Informatics (OHDSI) collaborative is an international data network representing over 1 billion patient records using common data standards. However, few studies have applied OHDSI's Common Data Model (CDM) to the population with epilepsy and none have validated relevant concepts.

Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network.

Ross D Williams Aniek F Markus Cynthia Yang Talita Duarte-Salles Scott L DuVall Matthew E Spotnitz

BMC Med Res Methodol

January 2022

Article Synopsis

The study aimed to develop COVID-19 prediction models using influenza data to quickly and accurately assess risks of hospital admission and death in patients diagnosed with COVID-19.
The researchers created three COVID-19 Estimated Risk (COVER) scores that quantify risks related to pneumonia and mortality based on historical data and validated them using a large dataset of COVID-19 patients across multiple countries.
They found that seven key health predictors, along with age and sex, effectively distinguished which patients were likely to face severe outcomes, achieving strong performance in model validation.

Predictors of diagnostic transition from major depressive disorder to bipolar disorder: a retrospective observational network study.

Anastasiya Nestsiarovich Jenna M Reps Michael E Matheny Scott L DuVall Kristine E Lynch Matthew Spotnitz

Transl Psychiatry

December 2021

Many patients with bipolar disorder (BD) are initially misdiagnosed with major depressive disorder (MDD) and are treated with antidepressants, whose potential iatrogenic effects are widely discussed. It is unknown whether MDD is a comorbidity of BD or its earlier stage, and no consensus exists on individual conversion predictors, delaying BD's timely recognition and treatment. We aimed to build a predictive model of MDD to BD conversion and to validate it across a multi-national network of patient databases using the standardization afforded by the Observational Medical Outcomes Partnership (OMOP) common data model.

Characterizing database granularity using SNOMED-CT hierarchy.

Anna Ostropolets Christian Reich Patrick Ryan Chunhua Weng Anthony Molinaro Matthew E Spotnitz

AMIA Annu Symp Proc

June 2021

Multi-center observational studies require recognition and reconciliation of differences in patient representations arising from underlying populations, disparate coding practices and specifics of data capture. This leads to different granularity or detail of concepts representing the clinical facts. For researchers studying certain populations of interest, it is important to ensure that concepts at the right level are used for the definition of these populations.

COVID-19 in patients with autoimmune diseases: characteristics and outcomes in a multinational network of cohorts across three countries.

Eng Hooi Tan Anthony G Sena Albert Prats-Uribe Seng Chan You Waheed-Ul-Rahman Ahmed Matthew Spotnitz

Rheumatology (Oxford)

October 2021

Article Synopsis

The study aimed to assess the 30-day outcomes and mortality of patients with autoimmune diseases hospitalized due to COVID-19, comparing them to similar hospital patients with seasonal influenza.
Researchers analyzed data from multiple health institutions and found that most patients were older females with significant comorbidities.
Results indicated that COVID-19 led to more respiratory complications and higher mortality rates (up to 24.6%) compared to influenza (up to 4.3%).

Unraveling COVID-19: a large-scale characterization of 4.5 million COVID-19 cases using CHARYBDIS.

Daniel Prieto-Alhambra Kristin Kostka Talita Duarte-Salles Albert Prats-Uribe Anthony Sena Matthew Spotnitz

Res Sq

March 2021

Article Synopsis

Routinely collected real-world data (RWD) is essential for understanding and responding to the COVID-19 pandemic, as demonstrated by the CHARYBDIS framework for standardizing and analyzing this data.
A descriptive cohort study involving over 4.5 million individuals was conducted across the U.S., Europe, and Asia to examine COVID-19-related health risks and outcomes, with detailed information available on an interactive website.
The findings from the CHARYBDIS study serve as benchmarks to enhance our knowledge of COVID-19's progression and management, facilitating timely evaluations of new preventative and therapeutic strategies.

Implementation of the COVID-19 Vulnerability Index Across an International Network of Health Care Data Sets: Collaborative External Validation Study.

Jenna M Reps Chungsoo Kim Ross D Williams Aniek F Markus Cynthia Yang Matthew E Spotnitz

JMIR Med Inform

April 2021

Article Synopsis

The COVID-19 vulnerability (C-19) index was developed to predict which patients might need hospitalization for pneumonia related to COVID-19 but is at risk of bias and lacks external validation.
The study aimed to externally validate the C-19 index using data from various healthcare settings and target populations to determine its predictive capabilities for hospitalization due to pneumonia.
Results showed that while the C-19 index performed moderately well in internal validation, its external validation yielded low predictive accuracy across different countries, suggesting that it may underestimate the actual risk of hospitalization.

Use of dialysis, tracheostomy, and extracorporeal membrane oxygenation among 842,928 patients hospitalized with COVID-19 in the United States.

Edward Burn Anthony G Sena Albert Prats-Uribe Matthew Spotnitz Scott DuVall

medRxiv

February 2021

Article Synopsis

This study aimed to determine how many COVID-19 patients hospitalized in the U.S. needed procedures like dialysis, tracheostomy, and ECMO.
It analyzed data from 842,928 hospitalized COVID-19 patients, revealing that about 4.17% received dialysis, while less than 1% had tracheostomy or ECMO interventions.
Findings showed that ECMO was more frequently used in younger males with fewer health issues, while tracheostomy rates were similar across demographics, and dialysis was more common in males and those with chronic kidney disease.
