Publications by authors named "Casey Ta"

Patients with rare diseases often experience prolonged diagnostic delays. Ordering appropriate genetic tests is crucial yet challenging, especially for general pediatricians without genetic expertise. Recent American College of Medical Genetics (ACMG) guidelines embrace early use of exome sequencing (ES) or genome sequencing (GS) for conditions like congenital anomalies or developmental delays while still recommending gene panels for patients exhibiting strong manifestations of a specific disease.

Objective: This study aims to automate the prediction of Mini-Mental State Examination (MMSE) scores, a widely adopted standard for cognitive assessment in patients with Alzheimer's disease, using natural language processing (NLP) and machine learning (ML) on structured and unstructured EHR data.

Materials and Methods: We extracted demographic data, diagnoses, medications, and unstructured clinical visit notes from the EHRs. We used Latent Dirichlet Allocation (LDA) for topic modeling and Term Frequency-Inverse Document Frequency (TF-IDF) for n-grams.
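
As a rough illustration of the TF-IDF n-gram features mentioned above, the following minimal sketch weights word bigrams in pure Python. The toy notes, the function names `ngrams` and `tfidf`, and the unsmoothed weighting `tf * log(N / df)` are assumptions made for illustration; they are not taken from the study's pipeline.

```python
import math
from collections import Counter

def ngrams(text, n=2):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs, n=2):
    """Compute tf * log(N / df) for every n-gram of every document."""
    grams_per_doc = [Counter(ngrams(d, n)) for d in docs]
    # Document frequency: in how many documents each n-gram appears.
    df = Counter()
    for counts in grams_per_doc:
        df.update(counts.keys())
    n_docs = len(docs)
    weights = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        weights.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return weights

# Hypothetical clinical-note snippets, invented for this example.
notes = [
    "patient reports memory loss",
    "patient reports mild headache",
    "memory loss is progressive",
]
weights = tfidf(notes)
```

A bigram that occurs in a single note (such as "reports memory") receives a higher weight than one shared across notes (such as "patient reports"), which is the property that makes TF-IDF useful for picking out distinctive phrases.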

Article Synopsis
  • This study focuses on enhancing data interoperability in precision medicine by aligning the OMOP and Phenopackets data models, which helps in storing and exchanging multimodal patient data for better research outcomes.
  • The researchers developed a transformation process to map real-world patient data from OMOP to Phenopackets, analyzing the effectiveness of this mapping and addressing gaps between the two models.
  • Results showed successful mapping for Alzheimer's disease patients with a 10.2% loss due to missing OMOP values, and using semantic filtering greatly improved clinical relevance from 68% to 96% in the mappings.
Article Synopsis
  • C2Q 3.0 is a new system that uses GPT-4 technology to automate the process of identifying eligible patients for clinical trials by turning trial eligibility texts into database queries.
  • The system was evaluated on concept extraction from clinical trials, where it scored 0.891 for accuracy; the evaluation also found multiple errors in the generated SQL queries, with logic errors being the most frequent.
  • Overall, while C2Q 3.0 showed strong coherence in reasoning, there is still room for improvement in readability, highlighting the need for further research to enhance the reliability of AI in clinical settings.
Article Synopsis
  • The study explores autologous breast reconstruction techniques and their varying outcomes using unsupervised machine learning to analyze large patient datasets.
  • A total of 14,274 female patients were examined, resulting in seven distinct clusters based on their likelihood of postoperative complications, with differing characteristics in each group.
  • The findings highlight the potential of machine learning to identify patterns in complex medical data, linking patient factors and reconstruction methods to their surgical outcomes.
Article Synopsis
  • Rare disease patients face long diagnostic journeys, often going undiagnosed for years, making the choice of genetic tests critical for timely diagnosis.
  • A new tool called Phen2Test integrates phenotypic information into the genetic test selection process, helping clinicians with minimal genetic training select appropriate tests like whole-exome or whole-genome sequencing.
  • The model demonstrated strong performance in evaluating genetic test recommendations, showcasing its potential to improve diagnostic speed and accuracy in pediatric rare diseases within clinical settings.
Article Synopsis
  • Knowledge graphs are increasingly used for representing knowledge, but their complexity and the diversity of sources make integration difficult due to semantic incompatibilities.
  • The Biomedical Translator Consortium has created a question-answering system called the Translator, which aids human reasoning by addressing biomedical queries related to various diseases.
  • A monthly "Question-of-the-Month Challenge" has been established to explore specific issues like drug-related liver injury and coronavirus, providing insights and identifying technical challenges to enhance the Translator system, while also comparing it to Large Language Models like ChatGPT.
Article Synopsis
  • The study investigates factors that contribute to successful participant recruitment in randomized clinical trials (RCTs) by analyzing data from 393 completed treatment studies, focusing on accrual percentages as a measure of success.
  • A comparison of different predictive models revealed that the CatBoost regressor provided the best performance, identifying government funding and participant compensation as positive recruitment factors, while cancer-focused studies and unconventional recruiting methods were less successful.
  • The research concludes by emphasizing the importance of specific recruitment strategies, such as flexible infrastructure and adequate compensation, to enhance participant recruitment in future clinical trials.
Article Synopsis
  • Factors like trial design and sponsor influence recruitment success, but principal investigators have mixed views on how site involvement and funding affect the process.
  • Inefficient recruitment strategies are commonly used, and the study suggests improvements like better staff support and using informatics to boost recruitment efforts.

This reproducibility study presents an algorithm that factors in the race distribution of clinical research study samples when training biomedical embeddings. We extracted 12,864 PubMed abstracts published between January 1, 2000 and January 1, 2022 and weighted them based on the race distribution data extracted from their corresponding clinical trials registered on ClinicalTrials.gov.

Article Synopsis
  • The paper introduces OPTEC (OPTimal Eligibility Criteria), a new model for selecting clinical research eligibility criteria that aims to be feasible, safe, and inclusive, moving beyond traditional expert-centered approaches.
  • OPTEC utilizes a Multiple Attribute Decision Making method combined with a greedy algorithm to identify optimal criteria for medical conditions, balancing feasibility, patient safety, and diversity.
  • Evaluation of OPTEC in Alzheimer’s disease and pancreatic neoplasm showed its effectiveness in recommending top eligibility criteria combinations; it is designed as an interactive system to assist clinical researchers in improving study designs.
Article Synopsis
  • The study aims to create a structured way to represent medical evidence by developing EvidenceMap, which organizes data from randomized controlled trial (RCT) abstracts.
  • Two independent annotators reviewed RCT abstracts to train a natural language processing (NLP) system, improving evidence comprehension significantly.
  • EvidenceMap proved efficient, saving users nearly 52% of the time needed to understand the information, and received high user ratings for its representation of key sections in clinical research.
Article Synopsis
  • The study analyzed the electronic health records of 11,313 hospitalized COVID-19 patients to identify subgroups based on demographics and severity of conditions.
  • Researchers utilized a K-means clustering method that resulted in 20 distinct subgroups, reflecting various levels of disease severity and underlying health issues such as pregnancy and cardiovascular disease.
  • Notable findings included high mortality rates in older patients and critical disease subgroups, with some patients suffering from severe complications and long hospital stays.
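
The K-means procedure named in the synopsis above alternates between assigning each record to its nearest centroid and recomputing the centroids. The following is a minimal pure-Python sketch of that loop; the two features (age and a comorbidity count), the six toy patients, and the deterministic "first k points" initialization are assumptions for illustration and do not reflect the study's features, data, or settings.

```python
def kmeans(points, k, iterations=20):
    """Plain K-means: returns (cluster label per point, final centroids)."""
    # Assumption: initialize centroids from the first k points for determinism.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids

# Hypothetical (age, comorbidity count) pairs, invented for this example.
patients = [(25, 1), (30, 2), (28, 1), (70, 8), (75, 9), (72, 7)]
labels, centers = kmeans(patients, k=2)
```

On this toy input the younger, healthier patients and the older, sicker patients end up in separate clusters. With real EHR features, scaling each feature before clustering would usually matter, since unscaled distances are dominated by the feature with the largest range.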

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations.

Bidirectional recurrent neural networks (RNNs) have improved the performance of various natural language processing tasks and have recently been used for diagnosis prediction. The advantages of general bidirectional RNNs, however, do not readily carry over to the diagnosis prediction task. In this study, we present a simple way to efficiently apply bidirectional RNNs for diagnosis prediction without using any additional networks or parameters.

Electronic health record (EHR) data promise to improve the efficiency of patient eligibility screening, which is an important factor in the success of clinical trials and observational studies. To bridge the sociotechnical gap in cohort identification by end-users, who are clinicians or researchers unfamiliar with underlying EHR databases, we previously developed a natural language query interface named Criteria2Query (C2Q) that automatically transforms free-text eligibility criteria to executable database queries. In this study, we present a comprehensive evaluation of C2Q to generate more actionable insights to inform the design and evaluation of future natural language user interfaces for clinical databases, towards the realization of Augmented Intelligence (AI) for clinical cohort definition via e-screening.

Article Synopsis
  • Identifying delirium in electronic health records (EHRs) is challenging because of poor assessment and documentation, prompting research to develop a classification model for recognition.
  • The model uses various patient features, like age, sex, and drug exposures, and was validated using data from two datasets, including the Columbia University Irving Medical Center EHR.
  • The logistic regression model demonstrated strong performance with an average AUC of 0.874 and a positive predictive value of 0.80, enabling better identification of delirium cases and aiding in the creation of a retrospective cohort.
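
The two metrics reported above, AUC and positive predictive value (PPV), can be computed directly from true labels and predicted scores. The sketch below is a minimal pure-Python version: AUC is taken as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and PPV as the fraction of predicted positives that are true positives. The labels, scores, and 0.5 threshold are invented for illustration and are unrelated to the study's data.

```python
def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ppv(labels, scores, threshold=0.5):
    """Positive predictive value: true positives / all predicted positives."""
    predicted_pos = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_pos:
        return 0.0  # Assumption: report 0.0 when nothing is predicted positive.
    return sum(predicted_pos) / len(predicted_pos)

# Hypothetical delirium labels (1 = delirium) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

model_auc = auc(y_true, y_score)
model_ppv = ppv(y_true, y_score)
```

Unlike AUC, PPV depends on the chosen threshold and on how common the condition is in the cohort, so the two numbers answer different questions about a classifier.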

Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science.

Article Synopsis
  • COVID-19 mRNA vaccines, such as those from Pfizer and Moderna, are effective in preventing symptomatic infections, but breakthrough infections have been observed in fully vaccinated individuals.
  • This study focuses on identifying risk factors for these breakthrough infections using electronic health records from a New York healthcare system, examining variables such as vaccine brand, demographics, and underlying health conditions.
  • The results indicated an overall breakthrough infection rate of 0.16, with higher risks associated with males, those vaccinated with Pfizer, and individuals with compromised immune systems or certain health issues like organ transplants and active tumors.
Article Synopsis
  • This study focuses on how varying eligibility criteria for clinical trials can affect the number of eligible patients and their safety, specifically looking at hospitalization risks, using electronic health record (EHR) data.
  • It examines three disease areas: relapsed/refractory lymphoma/leukemia, hepatitis C virus, and chronic kidney disease, analyzing how different combinations of criteria impact patient numbers and hospitalization risks.
  • The results show that specific combinations of criteria can reduce hospitalization risks without significantly limiting the number of eligible patients, indicating that careful selection of criteria is crucial for trial design.
Article Synopsis
  • Little is known about breakthrough COVID-19 infections in vaccinated individuals, prompting a study to identify associated risk factors and reassess vaccine effectiveness against severe outcomes using real-world data from a health center in New York.
  • The study utilized electronic health records to analyze the relationship between breakthrough infections and factors like vaccine brand, demographics, and health conditions, employing various statistical methods for accurate assessment.
  • Results showed that males, those vaccinated with Pfizer, and individuals with compromised immune systems faced a higher risk of breakthrough infections, although overall vaccinated individuals had a significantly lower infection rate than unvaccinated individuals.

Background: Cardiovascular outcome trials (CVOTs) include patients with high risks for cardiovascular events based on specific inclusion criteria. Little is known about the impact of such inclusion criteria on patient accrual and the incidence rate of cardiovascular events.

Materials and Methods: We evaluated the impact of criteria on the accrual and the number of cardiovascular events in a cohort of 1544 diabetes patients identified from the clinical data warehouse of New York Presbyterian Hospital / Columbia University Irving Medical Center.

Background: COVID-19 has threatened the health of tens of millions of people all over the world. Massive research efforts have been made in response to the COVID-19 pandemic. Utilization of clinical data can accelerate these research efforts to combat the pandemic since important characteristics of the patients are often found by examining the clinical data.

Article Synopsis
  • Clinical trials are essential for medical evidence, but their results may not represent the general population due to restrictive eligibility criteria.
  • This study aims to evaluate how well clinical trials represent the general population by using electronic health records (EHR) data during the planning phase.
  • By analyzing COVID-19 and type 2 diabetes trials in the U.S., researchers found that a significant portion had poor representativeness, indicating a need for better-informed eligibility criteria using EHR data.
Article Synopsis
  • The COVID-19 pandemic has significantly impacted global health and healthcare systems, leading to a need for effective patient outcome prediction.
  • The study introduces a predictive model that uses recurrent neural networks to assess the likelihood of severe outcomes for COVID-19 patients based on their historical electronic health records before hospital admission.
  • The model showed promising accuracy, with an area under the curve of 0.846, and aims to facilitate proactive risk management rather than relying on post-diagnosis data.