Cohort identification for clinical studies tends to be laborious, time-consuming, and expensive. Developing automated or semi-automated methods for cohort identification is one of the "holy grails" of biomedical informatics. We propose a high-throughput, similarity-based cohort identification algorithm that applies numerical abstractions to Electronic Health Record (EHR) data. We implement this algorithm on the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), so that sites using this standardized EHR data representation can adopt the algorithm with minimal local implementation effort. We validate its performance on a retrospective cohort identification task for six clinical trials conducted at the Columbia University Medical Center. Our algorithm achieves an average Area Under the Curve (AUC) of 0.966 and an average Precision at 5 of 0.983. This interoperable method promises efficient cohort identification in EHR databases. We discuss suitable applications of our method, its limitations, and directions for future work.
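As an illustration of the two reported metrics, the following minimal sketch ranks candidate patients by similarity to a set of known trial participants and then computes Precision at k and AUC over that ranking. It is not the authors' algorithm: the cosine similarity to a seed centroid, the feature vectors, the patient identifiers, and all function names are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rank_candidates(seed_vectors, candidates):
    """Sort candidates by similarity to the seed centroid, most similar first."""
    target = centroid(seed_vectors)
    scored = [(pid, cosine(vec, target)) for pid, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def precision_at_k(ranked, positives, k):
    """Fraction of the top-k ranked patients that are true cohort members."""
    top = [pid for pid, _ in ranked[:k]]
    return sum(pid in positives for pid in top) / k


def auc(ranked, positives):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [score for pid, score in ranked if pid in positives]
    neg = [score for pid, score in ranked if pid not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each vector stands in for a numeric abstraction of one patient's record.
seed_vectors = [[1.0, 0.0, 2.0], [1.0, 0.0, 1.0]]
candidates = {
    "p1": [1.0, 0.0, 2.0],
    "p2": [2.0, 0.0, 3.0],
    "p3": [0.0, 1.0, 0.0],
    "p4": [0.0, 2.0, 1.0],
}
positives = {"p1", "p2"}  # assumed ground-truth trial participants

ranked = rank_candidates(seed_vectors, candidates)
p_at_2 = precision_at_k(ranked, positives, 2)
auc_value = auc(ranked, positives)
```

Precision at k rewards placing true participants at the very top of the list, which matters when a coordinator reviews only a handful of suggestions, whereas AUC summarizes the quality of the whole ranking.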


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5536903
DOI: http://dx.doi.org/10.1007/s41666-017-0005-6

Publication Analysis

Top Keywords

cohort identification: 24
similarity-based cohort: 8
omop common: 8
common data: 8
data model: 8
ehr data: 8
cohort: 6
identification: 6
interoperable similarity-based: 4
identification method: 4

Similar Publications

This study analyzed targeted sequencing data from 6530 tissue samples from Chinese patients with metastatic colorectal cancer (CRC) to identify low-mutation-frequency and subgroup-specific driver genes, using three algorithms for overall CRC as well as across different clinicopathological subgroups. We analyzed 425 cancer-related genes, identifying 101 potential driver genes, including 36 novel to CRC. Notably, some genes demonstrated subgroup specificity; for instance, ERBB4 was identified as a male-specific driver gene, and ERBB4 mutations influenced prognosis only in male patients with CRC.

Splenectomy leads to a high rate of remission in chronic primary immune thrombocytopenia (ITP), but the unpredictability of its long-term positive outcomes and the irreversibility of the surgery discourage clinicians and patients. The identification of predictors of response may redefine the timing of splenectomy. In this retrospective, multicentric study we aimed to investigate clinical-histological predictors of splenectomy response in ITP patients and to provide an easy-to-use score for predicting that response.

Preclinical Identification of Poorly Controlled COPD: Patients with a Single Moderate Exacerbation Matter Too.

J Clin Med

December 2024

Centro de Salud Malilla, Carrer de Malilla 52D, Quatre Carreres, 46026 Valencia, Spain.

Chronic obstructive pulmonary disease (COPD) remains a critical global health challenge, characterized by high morbidity, mortality, and healthcare costs. Current guidelines may overlook patients who present with only one moderate exacerbation or with frequent short-acting beta-agonist (SABA) use. Building on findings from the Seleida study, this research refines the criteria for poor COPD control to include these patients, aiming to improve early identification of high-risk cases in primary care.

Background: Plexiform neurofibromas (PNs) are histologically benign peripheral nerve sheath tumors associated with neurofibromatosis type 1 (NF1) and often lead to significant morbidity due to growth. Management includes watchful waiting, surgery for partial debulking, and, more recently, systemic treatment with MEK inhibitors. However, because natural history studies are scarce, our understanding of the natural progression of PNs, which clinicians need in order to decide in whom and when to intervene, remains limited.

Gastric cancer is a leading cause of cancer-related mortality, particularly in East Asia, with a notable burden in the Republic of Korea. This study aimed to develop machine learning models for the prediction of gastric cancer mortality and the identification of risk factors. All data were acquired from the Korean Clinical Data Utilization for Research Excellence, collected by multiple medical centers in South Korea.
