Publications by authors named "Casey Ta"

Patients with rare diseases often experience prolonged diagnostic delays. Ordering appropriate genetic tests is crucial yet challenging, especially for general pediatricians without genetic expertise. Recent American College of Medical Genetics (ACMG) guidelines embrace early use of exome sequencing (ES) or genome sequencing (GS) for conditions like congenital anomalies or developmental delays, while still recommending gene panels for patients exhibiting strong manifestations of a specific disease.

View Article and Find Full Text PDF

Objective: This study aims to automate the prediction of Mini-Mental State Examination (MMSE) scores, a widely adopted standard for cognitive assessment in patients with Alzheimer's disease, using natural language processing (NLP) and machine learning (ML) on structured and unstructured EHR data.

Materials And Methods: We extracted demographic data, diagnoses, medications, and unstructured clinical visit notes from the EHRs. We used Latent Dirichlet Allocation (LDA) for topic modeling and Term-Frequency Inverse Document Frequency (TF-IDF) for n-grams.
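The TF-IDF weighting of n-grams mentioned above can be illustrated with a minimal sketch. This is not the study's pipeline: the note snippets, the whitespace tokenizer, the function names `ngrams` and `tfidf`, and the unsmoothed `log(N / df)` form of IDF are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=2):
    """Compute TF-IDF weights for the n-grams of each document.

    tf  = count of the n-gram in the document / n-grams in the document
    idf = log(N / df), df = number of documents containing the n-gram
    (an unsmoothed variant, assumed here for simplicity)
    """
    grams_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]

    # Document frequency: in how many documents each n-gram appears.
    df = Counter()
    for counts in grams_per_doc:
        df.update(counts.keys())

    n_docs = len(docs)
    weights = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / df[gram])
            for gram, count in counts.items()
        })
    return weights


# Hypothetical note snippets, invented for illustration only.
notes = [
    "patient reports memory loss",
    "patient reports mild headache",
    "memory loss has progressed",
]
scores = tfidf(notes, n=2)
```

In this toy corpus, a bigram that occurs in only one note receives a higher weight than one shared by two notes, which is the property that makes TF-IDF useful for picking out note-specific phrases.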

View Article and Find Full Text PDF
Article Synopsis
  • This study focuses on enhancing data interoperability in precision medicine by aligning the OMOP and Phenopackets data models, which helps in storing and exchanging multimodal patient data for better research outcomes.
  • The researchers developed a transformation process to map real-world patient data from OMOP to Phenopackets, analyzing the effectiveness of this mapping and addressing gaps between the two models.
  • Results showed successful mapping for Alzheimer's disease patients with a 10.2% loss due to missing OMOP values, and using semantic filtering greatly improved clinical relevance from 68% to 96% in the mappings.
View Article and Find Full Text PDF
Article Synopsis
  • C2Q 3.0 is a new system that uses GPT-4 to automate the identification of eligible patients for clinical trials by turning trial eligibility text into database queries.
  • The system was evaluated on concept extraction from clinical trials, where it scored 0.891 for accuracy; the evaluation also found multiple errors in the generated SQL queries, with logic errors being the most frequent.
  • Overall, while C2Q 3.0 showed strong coherence in reasoning, there’s still room for improvement in readability, highlighting the need for further research to enhance the reliability of AI in clinical settings.
View Article and Find Full Text PDF

Background: Autologous breast reconstruction is composed of diverse techniques and results in a variety of outcome trajectories. We propose employing an unsupervised machine learning method to characterize such heterogeneous patterns in large-scale datasets.

Methods: A retrospective cohort study of autologous breast reconstruction patients was conducted through the National Surgical Quality Improvement Program database.

View Article and Find Full Text PDF

Rare disease patients often endure prolonged diagnostic odysseys and may remain undiagnosed for years. Selecting the appropriate genetic tests is crucial for timely diagnosis. Phenotypic features offer great potential for aiding genomic diagnosis in rare disease cases.

View Article and Find Full Text PDF
Article Synopsis
  • Knowledge graphs are increasingly used for representing knowledge, but their complexity and the diversity of sources make integration difficult due to semantic incompatibilities.
  • The Biomedical Translator Consortium has created a question-answering system called the Translator, which aids human reasoning by addressing biomedical queries related to various diseases.
  • A monthly "Question-of-the-Month Challenge" has been established to explore specific issues like drug-related liver injury and coronavirus, providing insights and identifying technical challenges to enhance the Translator system, while also comparing it to Large Language Models like ChatGPT.
View Article and Find Full Text PDF
Article Synopsis
  • The study investigates factors that contribute to successful participant recruitment in randomized clinical trials (RCTs) by analyzing data from 393 completed treatment studies, focusing on accrual percentages as a measure of success.
  • A comparison of different predictive models revealed that the CatBoost regressor provided the best performance, identifying government funding and participant compensation as positive recruitment factors, while cancer-focused studies and unconventional recruiting methods were less successful.
  • The research concludes by emphasizing the importance of specific recruitment strategies, such as flexible infrastructure and adequate compensation, to enhance participant recruitment in future clinical trials.
View Article and Find Full Text PDF
Article Synopsis
  • Factors like trial design and sponsor influence recruitment success, but principal investigators have mixed views on how site involvement and funding affect the process.
  • Inefficient recruitment strategies are commonly used, and the study suggests improvements like better staff support and using informatics to boost recruitment efforts.
View Article and Find Full Text PDF

This reproducibility study presents an algorithm that incorporates the race distribution data of clinical research study samples as weights when training biomedical embeddings. We extracted 12,864 PubMed abstracts published between January 1, 2000 and January 1, 2022 and weighted them based on the race distribution data extracted from their corresponding clinical trials registered on ClinicalTrials.gov.

View Article and Find Full Text PDF
Article Synopsis
  • The paper introduces OPTEC (OPTimal Eligibility Criteria), a new model for selecting clinical research eligibility criteria that aims to be feasible, safe, and inclusive, moving beyond traditional expert-centered approaches.
  • OPTEC utilizes a Multiple Attribute Decision Making method combined with a greedy algorithm to identify optimal criteria for medical conditions, balancing feasibility, patient safety, and diversity.
  • Evaluation of OPTEC in Alzheimer’s disease and pancreatic neoplasm showed its effectiveness in recommending top eligibility criteria combinations, designed as an interactive system to assist clinical researchers in improving study designs.
View Article and Find Full Text PDF

Objective: To develop a computable representation for medical evidence and to contribute a gold standard dataset of annotated randomized controlled trial (RCT) abstracts, along with a natural language processing (NLP) pipeline for transforming free-text RCT evidence in PubMed into the structured representation.

Materials And Methods: Our representation, EvidenceMap, consists of 3 levels of abstraction: Medical Evidence Entity, Proposition and Map, to represent the hierarchical structure of medical evidence composition. Randomly selected RCT abstracts were annotated following EvidenceMap based on the consensus of 2 independent annotators to train an NLP pipeline.

View Article and Find Full Text PDF

Objective: To identify and characterize clinical subgroups of hospitalized Coronavirus Disease 2019 (COVID-19) patients.

Materials And Methods: Electronic health records of hospitalized COVID-19 patients at NewYork-Presbyterian/Columbia University Irving Medical Center were temporally sequenced and transformed into patient vector representations using Paragraph Vector models. K-means clustering was performed to identify subgroups.
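The clustering step described above can be sketched with a small, dependency-free K-means (Lloyd's algorithm). This is an illustration under stated assumptions, not the study's code: the 2-D points stand in for the Paragraph Vector patient representations, and the function name `kmeans` and the deterministic initialization (first `k` points as centroids) are choices made here for reproducibility.

```python
import math


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D "patient vectors", invented for illustration only.
vectors = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
labels, centroids = kmeans(vectors, k=2)
```

On this toy input the points alternate between two well-separated groups, so the returned labels alternate as well. Real patient embeddings are high-dimensional, and the number of clusters would have to be chosen with a separate criterion.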

View Article and Find Full Text PDF

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations.

View Article and Find Full Text PDF

Bidirectional recurrent neural networks (RNNs) have improved the performance of various natural language processing tasks and have recently been used for diagnosis prediction. The advantages of a general bidirectional RNN, however, do not readily carry over to the diagnosis prediction task. In this study, we present a simple way to efficiently apply a bidirectional RNN to diagnosis prediction without using any additional networks or parameters.

View Article and Find Full Text PDF

Electronic healthcare records data promises to improve the efficiency of patient eligibility screening, which is an important factor in the success of clinical trials and observational studies. To bridge the sociotechnical gap in cohort identification by end-users, who are clinicians or researchers unfamiliar with underlying EHR databases, we previously developed a natural language query interface named Criteria2Query (C2Q) that automatically transforms free-text eligibility criteria to executable database queries. In this study, we present a comprehensive evaluation of C2Q to generate more actionable insights to inform the design and evaluation of future natural language user interfaces for clinical databases, towards the realization of Augmented Intelligence (AI) for clinical cohort definition via e-screening.

View Article and Find Full Text PDF
Article Synopsis
  • Identifying delirium in electronic health records (EHRs) is challenging because of poor assessment and documentation, prompting research to develop a classification model for recognition.
  • The model uses various patient features, like age, sex, and drug exposures, and was validated using data from two datasets, including the Columbia University Irving Medical Center EHR.
  • The logistic regression model demonstrated strong performance with an average AUC of 0.874 and a positive predictive value of 0.80, enabling better identification of delirium cases and aiding in the creation of a retrospective cohort.
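The two metrics reported above, AUC and positive predictive value, can be computed as in the following minimal sketch. The labels, scores, function names, and the 0.5 decision threshold are hypothetical and chosen for illustration; they are not taken from the study.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv(y_true, y_score, threshold=0.5):
    """Positive predictive value: true positives / all predicted positives."""
    flagged = [y for y, s in zip(y_true, y_score) if s >= threshold]
    if not flagged:
        raise ValueError("no cases were predicted positive at this threshold")
    return sum(flagged) / len(flagged)


# Hypothetical labels (1 = delirium) and model scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.6, 0.7, 0.4, 0.3, 0.2]

model_auc = auc(y_true, y_score)   # 8 of 9 positive/negative pairs ranked correctly
model_ppv = ppv(y_true, y_score)   # 2 of 3 flagged cases are true positives
```

AUC is threshold-free, while PPV depends on both the chosen threshold and the prevalence of the condition in the cohort, so the two numbers answer different questions about a classifier.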
View Article and Find Full Text PDF

Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science.

View Article and Find Full Text PDF
Article Synopsis
  • COVID-19 mRNA vaccines, like Pfizer and Moderna, are effective in preventing symptomatic infections, but cases of breakthrough infections have been observed in fully vaccinated individuals.
  • This study focuses on identifying risk factors for these breakthrough infections using electronic health records from a New York healthcare system, examining variables such as vaccine brand, demographics, and underlying health conditions.
  • The results indicated an overall breakthrough infection rate of 0.16, with higher risks associated with males, those vaccinated with Pfizer, and individuals with compromised immune systems or certain health issues like organ transplants and active tumors.
View Article and Find Full Text PDF
Article Synopsis
  • This study focuses on how varying eligibility criteria for clinical trials can affect the number of eligible patients and their safety, specifically looking at hospitalization risks, using electronic health record (EHR) data.
  • It examines three disease areas: relapsed/refractory lymphoma/leukemia, hepatitis C virus, and chronic kidney disease, analyzing how different combinations of criteria impact patient numbers and hospitalization risks.
  • The results show that specific combinations of criteria can reduce hospitalization risks without significantly limiting the number of eligible patients, indicating that careful selection of criteria is crucial for trial design.
View Article and Find Full Text PDF
Article Synopsis
  • Little is known about breakthrough COVID-19 infections in vaccinated individuals, prompting a study to identify associated risk factors and reassess vaccine effectiveness against severe outcomes using real-world data from a health center in New York.
  • The study utilized electronic health records to analyze the relationship between breakthrough infections and factors like vaccine brand, demographics, and health conditions, employing various statistical methods for accurate assessment.
  • Results showed that those vaccinated with Pfizer and males, as well as individuals with compromised immune systems, faced a higher risk of breakthrough infections, although overall vaccinated individuals had a significantly lower infection rate compared to unvaccinated individuals.
View Article and Find Full Text PDF

Background: Cardiovascular outcome trials (CVOTs) include patients with high risks for cardiovascular events based on specific inclusion criteria. Little is known about the impact of such inclusion criteria on patient accrual and the incidence rate of cardiovascular events.

Materials And Methods: We evaluated the impact of criteria on the accrual and the number of cardiovascular events in a cohort of 1544 diabetes patients identified from the clinical data warehouse of New York Presbyterian Hospital / Columbia University Irving Medical Center.

View Article and Find Full Text PDF

Background: COVID-19 has threatened the health of tens of millions of people all over the world. Massive research efforts have been made in response to the COVID-19 pandemic. Utilization of clinical data can accelerate these research efforts to combat the pandemic since important characteristics of the patients are often found by examining the clinical data.

View Article and Find Full Text PDF

Background: Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to a lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of a clinical trial's study population.

Objectives: This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage.

View Article and Find Full Text PDF

The novel coronavirus disease-2019 (COVID-19) pandemic has threatened the health of tens of millions of people worldwide and imposed a heavy burden on global healthcare systems. In this paper, we propose a model that uses recurrent neural networks to predict whether a patient infected with COVID-19 will develop severe outcomes, based only on the patient's historical electronic health records (EHR) prior to hospital admission. The model predicts a risk score that represents the probability that a patient will progress to severe status (mechanical ventilation, tracheostomy, or death) after being infected with COVID-19.

View Article and Find Full Text PDF
