Publications by Matvey Palchuk | LitMetric

Publications by authors named "Matvey Palchuk"

National COVID Cohort Collaborative Data Enhancements: A Path for Expanding Common Data Models.

Kellie M Walters, Marshall Clark, Sofia Dard, Stephanie S Hong, Elizabeth Kelly, Matvey B Palchuk

J Am Med Inform Assoc

November 2024

Introduction: To support long COVID research in National COVID Cohort Collaborative (N3C), the N3C Phenotype and Data Acquisition team created data designs to aid contributing sites in enhancing their data. Enhancements include: long COVID specialty clinic indicator; Admission, Discharge, and Transfer (ADT) transactions; patient-level social determinants of health; and in-hospital use of oxygen supplementation.

Methods: For each enhancement, we defined the scope and wrote guidance on how to prepare and populate the data in a standardized way.

Enhancing real world data interoperability in healthcare: A methodological approach to laboratory unit harmonization.

Aída Muñoz Monjas, David Rubio Ruiz, David Pérez Del Rey, Matvey B Palchuk

Int J Med Inform

January 2025

Objective: The primary aim of this study is to address the critical issue of non-standardized units in clinical laboratory data, which poses significant challenges to data interoperability and secondary usage. Despite UCUM (Unified Code for Units of Measure) offering a unique representation for laboratory test units, nearly 60% of laboratory codes in healthcare organizations use non-standard units. We sought to design, implement and test a methodology for the harmonization of units to the UCUM standards across a large research network.

A Case Demonstration of the Open Health Natural Language Processing Toolkit From the National COVID-19 Cohort Collaborative and the Researching COVID to Enhance Recovery Programs for a Natural Language Processing System for COVID-19 or Postacute Sequelae of SARS CoV-2 Infection: Algorithm Development and Validation.

Andrew Wen, Liwei Wang, Huan He, Sunyang Fu, Sijia Liu, Matvey B Palchuk

JMIR Med Inform

September 2024

Background: A wealth of clinically relevant information is only obtainable within unstructured clinical narratives, leading to great interest in clinical natural language processing (NLP). While a multitude of approaches to NLP exist, current algorithm development approaches have limitations that can slow the development process. These limitations are exacerbated when the task is emergent, as is the case currently for NLP extraction of signs and symptoms of COVID-19 and postacute sequelae of SARS-CoV-2 infection (PASC).

Leveraging a global, federated, real-world data network to optimize investigator-initiated pediatric clinical trials: the TriNetX Pediatric Collaboratory Network.

Jurran L Wilson, Marisol Betensky, Sharda Udassi, Pavithra R Ellison, Richard Lilienthal, Matvey B Palchuk

JAMIA Open

October 2024

Objective: Clinical research networks facilitate collaborative research, but data sharing remains a common barrier.

Materials and Methods: The TriNetX platform provides real-time access to electronic health record (EHR)-derived, anonymized data from 173 healthcare organizations (HCOs) and tools for queries and analysis. In 2022, 4 pediatric HCOs worked with TriNetX leadership to found the Pediatric Collaboratory Network (PCN), facilitated via a multi-institutional data-use agreement (DUA).

Increased Incidence of Vestibular Disorders in Patients With SARS-CoV-2.

Lawrance Lee, Evan French, Daniel H Coelho, Nauman F Manzoor, Matvey B Palchuk

Otol Neurotol Open

June 2024

Article Synopsis

- A study investigated the prevalence of vestibular disorders in patients with COVID-19 compared to those without the virus using data from the National COVID Cohort Collaborative database.
- Results showed that individuals with COVID-19 were significantly more likely to experience vestibular disorders, with the highest risk associated with the omicron 23A variant (OR of 8.80).
- The findings underscore the need for further research on the long-term effects of vestibular disorders in COVID-19 patients and implications for patient counseling.

A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data.

Kai Jia, Steven Kundrot, Matvey B Palchuk, Jeff Warnick, Kathryn Haapala

EBioMedicine

December 2023

Background: Pancreatic Duct Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection.

Exploring Breast Cancer Systemic Drug Therapy Patterns in Real-World Data.

Julia O'Rourke, Jeff Warnick, John Doole, Luc De Keyser, Zuzanna Drebert, Matvey B Palchuk

JCO Clin Cancer Inform

September 2023

Purpose: To explore medications and their administration patterns in real-world patients with breast cancer.

Methods: A retrospective study was performed using TriNetX, a federated network of deidentified, Health Insurance Portability and Accountability Act-compliant data from 21 health care organizations across North America. Patients diagnosed with breast cancer between January 1, 2013, and May 31, 2022, were included.

The Detection of Date Shifting in Real-World Data.

Laura Evans, Jack W London, Matvey B Palchuk

Appl Clin Inform

August 2023

Objectives: Analysis of health care real-world data (RWD) provides an opportunity to observe the actual patient diagnostic, treatment, and outcome events. However, researchers should understand the possible limitations of RWD. In particular, the dates in these data may be shifted from their actual values, which might affect the validity of study conclusions.

Automatic Outlier Detection in Laboratory Result Distributions Within a Real World Data Network.

Aída Muñoz Monjas, David Rubio Ruiz, David Pérez-Rey, Matvey Palchuk

Stud Health Technol Inform

May 2023

Laboratory data must be interoperable to be able to accurately compare the results of a lab test between healthcare organizations. To achieve this, terminologies like LOINC (Logical Observation Identifiers, Names and Codes) provide unique identification codes for laboratory tests. Once standardized, the numeric results of laboratory tests can be aggregated and represented in histograms.

A global federated real-world data and analytics platform for research.

Matvey B Palchuk, Jack W London, David Perez-Rey, Zuzanna J Drebert, Jessamine P Winer-Jones

JAMIA Open

July 2023

Objective: This article describes a scalable, performant, sustainable global network of electronic health record data for biomedical and clinical research.

Materials and Methods: TriNetX has created a technology platform characterized by a conservative security and governance model that facilitates collaboration and cooperation between industry participants, such as pharmaceutical companies and contract research organizations, and academic and community-based healthcare organizations (HCOs). HCOs participate on the network in return for access to a suite of analytics capabilities, large networks of de-identified data, and more sponsored trial opportunities.

Applying FHIR Genomics for Research - From Sequencing to Database.

Sean Hernandez, Karen Fairchild, Mark Pemberton, Jonathan Dahmer, Wei Zhang, Matvey B Palchuk

AMIA Jt Summits Transl Sci Proc

June 2023

The availability of next-generation sequencing (NGS) technologies and their continually declining costs have resulted in the accumulation of large genomic data sets. NGS results have traditionally been delivered in PDF format, and in some cases, structured data, e.g.

Advancing Interoperability of Patient-level Social Determinants of Health Data to Support COVID-19 Research.

Jimmy Phuong, Stephanie Hong, Matvey B Palchuk, Juan Espinoza, Daniella Meeker

AMIA Jt Summits Transl Sci Proc

June 2023

Including social determinants of health (SDoH) data in health outcomes research is essential for studying the sources of healthcare disparities and developing strategies to mitigate stressors. In this report, we describe a pragmatic design and approach to explore the encoding needs for transmitting SDoH screening tool responses from a large safety-net hospital into the National COVID Cohort Collaborative (N3C) OMOP dataset. We provide a stepwise account of designing data mapping and ingestion for patient-level SDoH and summarize the results of screening.

Building an i2b2-Based Population Repository for COVID-19 Research.

Miguel Pedrera-Jimenez, Noelia Garcia-Barrio, Gema Hernandez-Ibarburu, Blanca Baselga, Alvar Blanco, Matvey Palchuk

Stud Health Technol Inform

May 2022

Reuse of Electronic Health Records (EHRs) for specific diseases such as COVID-19 requires data to be recorded and persisted according to international standards. Since the beginning of the COVID-19 pandemic, Hospital Universitario 12 de Octubre (H12O) evolved its EHRs: it identified, modeled and standardized the concepts related to this new disease in an agile, flexible and staged way. Thus, data from more than 200,000 COVID-19 cases were extracted, transformed, and loaded into an i2b2 repository.

Evolving Effect of the COVID-19 Pandemic on Cancer-Related Encounters.

Jack W London, Elnara Fazio-Eynullayeva, Matvey B Palchuk, Christopher McNair

JCO Clin Cancer Inform

February 2022

Purpose: This is an update to a previously published report characterizing the impact that efforts to control the COVID-19 pandemic have had on the normal course of cancer-related encounters.

Methods: Data were analyzed from 22 US health care organizations (members of the TriNetX global network) having relevant, up-to-date encounter data. Although the original study compared encounter data pre-COVID-19 (January-April 2019) with the corresponding months in 2020, this update considers data through April 2021.

Ensuring a safe(r) harbor: Excising personally identifiable information from structured electronic health record data.

Emily R Pfaff, Melissa A Haendel, Kristin Kostka, Adam Lee, Emily Niehaus, Matvey B Palchuk

J Clin Transl Sci

December 2021

Recent findings have shown that the continued expansion of the scope and scale of data collected in electronic health records is making the protection of personally identifiable information (PII) more challenging and may inadvertently put our institutions and patients at risk if not addressed. As clinical terminologies expand to include new terms that may capture PII (e.g.

Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative.

Emily R Pfaff, Andrew T Girvin, Davera L Gabriel, Kristin Kostka, Michele Morris, Matvey B Palchuk

J Am Med Inform Assoc

March 2022

Objective: In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations.

Clinical Characterization and Prediction of Clinical Severity of SARS-CoV-2 Infection Among US Adults Using Data From the US National COVID Cohort Collaborative.

Tellen D Bennett, Richard A Moffitt, Janos G Hajagos, Benjamin Amor, Adit Anand, Matvey B Palchuk

JAMA Netw Open

July 2021

Article Synopsis

- The National COVID Cohort Collaborative (N3C) is a massive electronic health record database that provides valuable insights into COVID-19, supporting the development of better diagnostic tools and clinical practices.
- This study analyzed data from nearly 2 million adults across 34 medical centers to evaluate the severity of COVID-19 and its risk factors over time, using advanced machine learning techniques to predict severe outcomes.
- Among the 174,568 adults infected with SARS-CoV-2, a significant portion experienced severe illness, highlighting the need for continuous monitoring and adjustment of treatment approaches based on demographic characteristics and disease severity.

Assessing real-world medication data completeness.

Laura Evans, Jack W London, Matvey B Palchuk

J Biomed Inform

July 2021

Objective: Analysis of healthcare Real-World Data (RWD) provides an opportunity to observe actual patient diagnostic, treatment and outcomes events. However, researchers should understand the possible limitations of RWD. In particular, these data may be incomplete, which would affect the validity of study conclusions.

The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction.

Tellen D Bennett, Richard A Moffitt, Janos G Hajagos, Benjamin Amor, Adit Anand, Matvey B Palchuk

medRxiv

January 2021

Article Synopsis

- The National COVID Cohort Collaborative (N3C) is the largest U.S. COVID-19 patient database, created to provide a comprehensive analysis of clinical characteristics, disease progression, and treatment outcomes across multiple health centers, enhancing predictive and diagnostic tools for COVID-19.
- A study involving over 1.9 million patients from 34 medical centers found significant clinical data, showing that certain factors like age, sex, and underlying conditions affect disease severity, with a notable decrease in mortality rates among hospitalized patients over time.
- The N3C dataset was utilized in machine learning models to successfully predict severe outcomes in COVID-19 patients, achieving high accuracy rates and demonstrating the potential of using electronic health

Recommendations for patient similarity classes: results of the AMIA 2019 workshop on defining patient similarity.

Nathan D Seligson, Jeremy L Warner, William S Dalton, David Martin, Robert S Miller, Matvey B Palchuk

J Am Med Inform Assoc

November 2020

Defining patient-to-patient similarity is essential for the development of precision medicine in clinical care and research. Conceptually, the identification of similar patient cohorts appears straightforward; however, universally accepted definitions remain elusive. Simultaneously, an explosion of vendors and published algorithms has emerged, all providing varied levels of functionality in identifying patient similarity categories.

The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment.

Melissa A Haendel, Christopher G Chute, Tellen D Bennett, David A Eichmann, Justin Guinney, Matvey B Palchuk

J Am Med Inform Assoc

March 2021

Objective: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization.

Effects of the COVID-19 Pandemic on Cancer-Related Patient Encounters.

Jack W London, Elnara Fazio-Eynullayeva, Matvey B Palchuk, Peter Sankey, Christopher McNair

JCO Clin Cancer Inform

July 2020

Purpose: While there are studies under way to characterize the direct effects of the COVID-19 pandemic on the care of patients with cancer, there have been few quantitative reports of the impact that efforts to control the pandemic have had on the normal course of cancer diagnosis and treatment encounters.

Methods: We used the TriNetX platform to analyze 20 health care institutions that have relevant, up-to-date encounter data. Using this COVID and Cancer Research Network (CCRN), we compared cancer cohorts identified by querying encounter data pre-COVID (January 2019-April 2019) and current (January 2020-April 2020).

Using a Federated Network of Real-World Data to Optimize Clinical Trials Operations.

Umit Topaloglu, Matvey B Palchuk

JCO Clin Cancer Inform

December 2018

Clinical trials, whether industry, cooperative group sponsored, or investigator initiated, have an unacceptable rate of failure as a result of the inability to recruit sufficient numbers of patients. Even those trials that are completed often require time-consuming protocol amendments to achieve accrual goals. These inefficiencies in clinical trial research result in increasing costs and prolong the time needed to bring improved treatments to cancer clinical practice.

tranSMART: An Open Source Knowledge Management and High Content Data Analytics Platform.

Elisabeth Scheufele, Dina Aronzon, Robert Coopersmith, Michael T McDuffie, Manish Kapoor, Matvey B Palchuk

AMIA Jt Summits Transl Sci Proc

February 2015

The tranSMART knowledge management and high-content analysis platform is a flexible software framework featuring novel research capabilities. It enables analysis of integrated data for the purposes of hypothesis generation, hypothesis validation, and cohort discovery in translational research. tranSMART bridges the prolific world of basic science and clinical practice data at the point of care by merging multiple types of data from disparate sources into a common environment.
