Publications by Casey Ta | LitMetric

Publications by authors named "Casey Ta"


Phenotype driven molecular genetic test recommendation for diagnosing pediatric rare disorders.

Fangyi Chen Priyanka Ahimaz Quan M Nguyen Rachel Lewis Wendy K Chung Casey N Ta

NPJ Digit Med

November 2024

Patients with rare diseases often experience prolonged diagnostic delays. Ordering appropriate genetic tests is crucial yet challenging, especially for general pediatricians without genetic expertise. Recent American College of Medical Genetics (ACMG) guidelines embrace early use of exome sequencing (ES) or genome sequencing (GS) for conditions like congenital anomalies or developmental delays, while still recommending gene panels for patients exhibiting strong manifestations of a specific disease.


Mini-mental status examination phenotyping for Alzheimer's disease patients using both structured and narrative electronic health record features.

Betina Idnay Gongbo Zhang Fangyi Chen Casey N Ta Matthew W Schelke

J Am Med Inform Assoc

January 2025

Objective: This study aims to automate the prediction of Mini-Mental State Examination (MMSE) scores, a widely adopted standard for cognitive assessment in patients with Alzheimer's disease, using natural language processing (NLP) and machine learning (ML) on structured and unstructured EHR data.

Materials And Methods: We extracted demographic data, diagnoses, medications, and unstructured clinical visit notes from the EHRs. We used Latent Dirichlet Allocation (LDA) for topic modeling and Term Frequency-Inverse Document Frequency (TF-IDF) for n-grams.


Converting OMOP CDM to phenopackets: A model alignment and patient data representation evaluation.

Kayla Schiffer-Kane Cong Liu Tiffany J Callahan Casey Ta Jordan G Nestor

J Biomed Inform

July 2024

Article Synopsis

This study focuses on enhancing data interoperability in precision medicine by aligning the OMOP and Phenopackets data models, which helps in storing and exchanging multimodal patient data for better research outcomes.
The researchers developed a transformation process to map real-world patient data from OMOP to Phenopackets, analyzing the effectiveness of this mapping and addressing gaps between the two models.
Results showed successful mapping for Alzheimer's disease patients with a 10.2% loss due to missing OMOP values, and using semantic filtering greatly improved clinical relevance from 68% to 96% in the mappings.


Criteria2Query 3.0: Leveraging generative large language models for clinical trial eligibility query generation.

Jimyung Park Yilu Fang Casey Ta Gongbo Zhang Betina Idnay

J Biomed Inform

June 2024

Article Synopsis

C2Q 3.0 is a new system that uses GPT-4 technology to automate the process of identifying eligible patients for clinical trials by turning trial eligibility texts into database queries.
The system's performance was tested through concept extraction from clinical trials, where it scored 0.891 for accuracy, and it found multiple errors in the SQL queries generated, with logic errors being the most frequent.
Overall, while C2Q 3.0 showed strong coherence in reasoning, there’s still room for improvement in readability, highlighting the need for further research to enhance the reliability of AI in clinical settings.


Applying unsupervised machine learning approaches to characterize autologous breast reconstruction patient subgroups: an NSQIP analysis of 14,274 patients.

Dylan K Kim George S Corpuz Casey N Ta Chunhua Weng Christine H Rohde

J Plast Reconstr Aesthet Surg

January 2024

Article Synopsis

The study explores autologous breast reconstruction techniques and their varying outcomes using unsupervised machine learning to analyze large patient datasets.
A total of 14,274 female patients were examined, resulting in seven distinct clusters based on their likelihood of postoperative complications, with differing characteristics in each group.
The findings highlight the potential of machine learning to identify patterns in complex medical data, linking patient factors and reconstruction methods to their surgical outcomes.


Phenotype-Driven Molecular Genetic Test Recommendation for Diagnosing Pediatric Rare Disorders.

Fangyi Chen Priyanka Ahimaz Kai Wang Wendy K Chung Casey Ta

Res Sq

November 2023

Article Synopsis

Rare disease patients face long diagnostic journeys, often going undiagnosed for years, making the choice of genetic tests critical for timely diagnosis.
A new tool called Phen2Test integrates phenotypic information into the genetic test selection process, helping clinicians with minimal genetic training select appropriate tests like whole-exome or whole-genome sequencing.
The model demonstrated strong performance in evaluating genetic test recommendations, showcasing its potential to improve diagnostic speed and accuracy in pediatric rare diseases within clinical settings.


An approach for collaborative development of a federated biomedical knowledge graph-based question-answering system: Question-of-the-Month challenges.

Karamarie Fecho Chris Bizon Tursynay Issabekova Sierra Moxon Anne E Thessen Casey Ta

J Clin Transl Sci

September 2023

Article Synopsis

Knowledge graphs are increasingly used for representing knowledge, but their complexity and the diversity of sources make integration difficult due to semantic incompatibilities.
The Biomedical Translator Consortium has created a question-answering system called the Translator, which aids human reasoning by addressing biomedical queries related to various diseases.
A monthly "Question-of-the-Month Challenge" has been established to explore specific issues like drug-related liver injury and coronavirus, providing insights and identifying technical challenges to enhance the Translator system, while also comparing it to Large Language Models like ChatGPT.


Uncovering key clinical trial features influencing recruitment.

Betina Idnay Yilu Fang Alex Butler Joyce Moran Ziran Li Casey Ta Edward Stanley

J Clin Transl Sci

September 2023

Article Synopsis

- The study investigates factors that contribute to successful participant recruitment in randomized clinical trials (RCTs) by analyzing data from 393 completed treatment studies, focusing on accrual percentages as a measure of success.
- A comparison of different predictive models revealed that the CatBoost regressor provided the best performance, identifying government funding and participant compensation as positive recruitment factors, while cancer-focused studies and unconventional recruiting methods were less successful.
- The research concludes by emphasizing the importance of specific recruitment strategies, such as flexible infrastructure and adequate compensation, to enhance participant recruitment in future clinical trials.


Principal Investigators' Perceptions on Factors Associated with Successful Recruitment in Clinical Trials.

Betina Idnay Alex Butler Yilu Fang Ziran Li Junghwan Lee Casey Ta

AMIA Jt Summits Transl Sci Proc

June 2023

Article Synopsis

* Factors like trial design and sponsor influence recruitment success, but principal investigators have mixed views on how site involvement and funding affect the process.
* Inefficient recruitment strategies are commonly used, and the study suggests improvements like better staff support and using informatics to boost recruitment efforts.


Can Race-sensitive Biomedical Embeddings Improve Healthcare Predictive Models?

Hao Liu Nour Moustafa-Fahmy Casey Ta Chunhua Weng

AMIA Jt Summits Transl Sci Proc

June 2023

This reproducibility study presents an algorithm to weigh in race distribution data of clinical research study samples when training biomedical embeddings. We extracted 12,864 PubMed abstracts published between January 1, 2000 and January 1, 2022 and weighted them based on the race distribution data extracted from their corresponding clinical trials registered on ClinicalTrials.gov.


A data-driven approach to optimizing clinical study eligibility criteria.

Yilu Fang Hao Liu Betina Idnay Casey Ta Karen Marder

J Biomed Inform

June 2023

Article Synopsis

- The paper introduces OPTEC (OPTimal Eligibility Criteria), a new model for selecting clinical research eligibility criteria that aims to be feasible, safe, and inclusive, moving beyond traditional expert-centered approaches.
- OPTEC utilizes a Multiple Attribute Decision Making method combined with a greedy algorithm to identify optimal criteria for medical conditions, balancing feasibility, patient safety, and diversity.
- Evaluation of OPTEC in Alzheimer’s disease and pancreatic neoplasm showed its effectiveness in recommending top eligibility criteria combinations, designed as an interactive system to assist clinical researchers in improving study designs.


EvidenceMap: a three-level knowledge representation for medical evidence computation and comprehension.

Tian Kang Yingcheng Sun Jae Hyun Kim Casey Ta Adler Perotte Nour Moustafa-Fahmy

J Am Med Inform Assoc

May 2023

Article Synopsis

The study aims to create a structured way to represent medical evidence by developing EvidenceMap, which organizes data from randomized controlled trial (RCT) abstracts.
Two independent annotators reviewed RCT abstracts to train a natural language processing (NLP) system, improving evidence comprehension significantly.
EvidenceMap proved efficient, saving users nearly 52% time in understanding the information, and received high user ratings for its representation of key sections in clinical research.


Clinical and temporal characterization of COVID-19 subgroups using patient vector embeddings of electronic health records.

Casey N Ta Jason E Zucker Po-Hsiang Chiu Yilu Fang Karthik Natarajan

J Am Med Inform Assoc

January 2023

Article Synopsis

The study analyzed the electronic health records of 11,313 hospitalized COVID-19 patients to identify subgroups based on demographics and severity of conditions.
Researchers utilized a K-means clustering method that resulted in 20 distinct subgroups, reflecting various levels of disease severity and underlying health issues such as pregnancy and cardiovascular disease.
Notable findings included high mortality rates in older patients and critical disease subgroups, with some patients suffering from severe complications and long hospital stays.


OARD: Open annotations for rare diseases and their phenotypes based on real-world data.

Cong Liu Casey N Ta Jim M Havrilla Jordan G Nestor Matthew E Spotnitz

Am J Hum Genet

September 2022

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations.


Towards Better Diagnosis Prediction Using Bidirectional Recurrent Neural Networks.

Junghwan Lee Cong Liu Casey Ta Chunhua Weng

Stud Health Technol Inform

June 2022

Bidirectional recurrent neural networks (RNNs) have improved performance on various natural language processing tasks and have recently been used for diagnosis prediction. The advantages of a general bidirectional RNN, however, are not readily applied to the diagnosis prediction task. In this study, we present a simple way to efficiently apply a bidirectional RNN for diagnosis prediction without using any additional networks or parameters.


Evaluation of Criteria2Query: Towards Augmented Intelligence for Cohort Identification.

Cong Liu Hao Liu Casey Ta James Roger Alex Butler

Stud Health Technol Inform

June 2022

Electronic healthcare records data promises to improve the efficiency of patient eligibility screening, which is an important factor in the success of clinical trials and observational studies. To bridge the sociotechnical gap in cohort identification by end-users, who are clinicians or researchers unfamiliar with underlying EHR databases, we previously developed a natural language query interface named Criteria2Query (C2Q) that automatically transforms free-text eligibility criteria to executable database queries. In this study, we present a comprehensive evaluation of C2Q to generate more actionable insights to inform the design and evaluation of future natural language user interfaces for clinical databases, towards the realization of Augmented Intelligence (AI) for clinical cohort definition via e-screening.


A machine learning approach to identifying delirium from electronic health records.

Jae Hyun Kim May Hua Robert A Whittington Junghwan Lee Cong Liu Casey N Ta

JAMIA Open

July 2022

Article Synopsis

Identifying delirium in electronic health records (EHRs) is challenging because of poor assessment and documentation, prompting research to develop a classification model for recognition.
The model uses various patient features, like age, sex, and drug exposures, and was validated using data from two datasets, including the Columbia University Irving Medical Center EHR.
The logistic regression model demonstrated strong performance with an average AUC of 0.874 and a positive predictive value of 0.80, enabling better identification of delirium cases and aiding in the creation of a retrospective cohort.


Progress toward a universal biomedical data translator.

Karamarie Fecho Anne E Thessen Sergio E Baranzini Chris Bizon Jennifer J Hadlock Casey Ta Gustavo Glusman

Clin Transl Sci

August 2022

Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science.


Risk Factors Associated With SARS-CoV-2 Breakthrough Infections in Fully mRNA-Vaccinated Individuals: Retrospective Analysis.

Cong Liu Junghwan Lee Casey Ta Ali Soroush James R Rogers Karthik Natarajan

JMIR Public Health Surveill

May 2022

Article Synopsis

COVID-19 mRNA vaccines, like Pfizer and Moderna, are effective in preventing symptomatic infections, but cases of breakthrough infections have been observed in fully vaccinated individuals.
This study focuses on identifying risk factors for these breakthrough infections using electronic health records from a New York healthcare system, examining variables such as vaccine brand, demographics, and underlying health conditions.
The results indicated an overall breakthrough infection rate of 0.16, with higher risks associated with males, those vaccinated with Pfizer, and individuals with compromised immune systems or certain health issues like organ transplants and active tumors.


Leveraging electronic health record data for clinical trial planning by assessing eligibility criteria's impact on patient count and safety.

James R Rogers Jovana Pavisic Casey N Ta Cong Liu Ali Soroush

J Biomed Inform

March 2022

Article Synopsis

- This study focuses on how varying eligibility criteria for clinical trials can affect the number of eligible patients and their safety, specifically looking at hospitalization risks, using electronic health record (EHR) data.
- It examines three disease areas: relapsed/refractory lymphoma/leukemia, hepatitis C virus, and chronic kidney disease, analyzing how different combinations of criteria impact patient numbers and hospitalization risks.
- The results show that specific combinations of criteria can reduce hospitalization risks without significantly limiting the number of eligible patients, indicating that careful selection of criteria is crucial for trial design.


A Retrospective Analysis of COVID-19 mRNA Vaccine Breakthrough Infections - Risk Factors and Vaccine Effectiveness.

Cong Liu Junghwan Lee Casey Ta Ali Soroush James R Rogers Karthik Natarajan

medRxiv

October 2021

Article Synopsis

Little is known about breakthrough COVID-19 infections in vaccinated individuals, prompting a study to identify associated risk factors and reassess vaccine effectiveness against severe outcomes using real-world data from a health center in New York.
The study utilized electronic health records to analyze the relationship between breakthrough infections and factors like vaccine brand, demographics, and health conditions, employing various statistical methods for accurate assessment.
Results showed that those vaccinated with Pfizer and males, as well as individuals with compromised immune systems, faced a higher risk of breakthrough infections, although overall vaccinated individuals had a significantly lower infection rate compared to unvaccinated individuals.


The potential role of EHR data in optimizing eligibility criteria definition for cardiovascular outcome trials.

Jae Hyun Kim Alex M Butler Casey N Ta Yingcheng Sun Mathew S Maurer

Int J Med Inform

December 2021

Background: Cardiovascular outcome trials (CVOTs) include patients with high risks for cardiovascular events based on specific inclusion criteria. Little is known about the impact of such inclusion criteria on patient accrual and the incidence rate of cardiovascular events.

Materials And Methods: We evaluated the impact of criteria on the accrual and the number of cardiovascular events in a cohort of 1544 diabetes patients identified from the clinical data warehouse of New York Presbyterian Hospital / Columbia University Irving Medical Center.


Columbia Open Health Data for COVID-19 Research: Database Analysis.

Junghwan Lee Jae Hyun Kim Cong Liu George Hripcsak Karthik Natarajan Casey Ta

J Med Internet Res

September 2021

Background: COVID-19 has threatened the health of tens of millions of people all over the world. Massive research efforts have been made in response to the COVID-19 pandemic. Utilization of clinical data can accelerate these research efforts to combat the pandemic since important characteristics of the patients are often found by examining the clinical data.


A Framework for Systematic Assessment of Clinical Trial Population Representativeness Using Electronic Health Records Data.

Yingcheng Sun Alex Butler Ibrahim Diallo Jae Hyun Kim Casey Ta

Appl Clin Inform

August 2021

Article Synopsis

Clinical trials are essential for medical evidence, but their results may not represent the general population due to restrictive eligibility criteria.
This study aims to evaluate how well clinical trials represent the general population by using electronic health records (EHR) data during the planning phase.
By analyzing COVID-19 and type 2 diabetes trials in the U.S., researchers found that a significant portion had poor representativeness, indicating a need for better-informed eligibility criteria using EHR data.


Severity Prediction for COVID-19 Patients via Recurrent Neural Networks.

Junghwan Lee Casey Ta Jae Hyun Kim Cong Liu Chunhua Weng

AMIA Jt Summits Transl Sci Proc

September 2021

Article Synopsis

The COVID-19 pandemic has significantly impacted global health and healthcare systems, leading to a need for effective patient outcome prediction.
The study introduces a predictive model that uses recurrent neural networks to assess the likelihood of severe outcomes for COVID-19 patients based on their historical electronic health records before hospital admission.
The model showed promising accuracy with a 0.846 area under the curve and aims to facilitate proactive risk management rather than relying on post-diagnosis data.
