Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis.

AMIA Annu Symp Proc

Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA.

Published: February 2013

Electronic Health Records (EHRs) provide a real-world patient cohort for clinical and genomic research. Phenotype identification using informatics algorithms has been shown to replicate known genetic associations found in clinical trials and observational cohorts. However, developing accurate phenotype identification methods can be challenging and requires significant time and effort. We applied Support Vector Machines (SVMs) to both naïve (i.e., non-curated) and expert-defined collections of EHR features to identify rheumatoid arthritis cases using billing codes, medication exposures, and natural language processing-derived concepts. SVMs trained on naïve and expert-defined data both outperformed an existing deterministic algorithm; the best-performing naïve system had a precision of 0.94 and a recall of 0.87, compared to a precision of 0.75 and a recall of 0.51 for the deterministic algorithm. We show that with an expert-defined feature set, as few as 50-100 training samples are required. This study demonstrates that SVMs operating on non-curated sets of attributes can accurately identify cases from an EHR.
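The precision and recall figures quoted above follow the standard definitions: precision is the fraction of patients flagged as cases who truly are cases, and recall is the fraction of true cases that were flagged. The minimal sketch below illustrates the calculation only. The function name `precision_recall` and the toy labels are hypothetical and are not taken from the study's data or code.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks a case."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when nothing is predicted or nothing is a case.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels (NOT study data): 4 true cases, 4 non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example, three of the four flagged patients are true cases and three of the four true cases are flagged, so both values come out to 0.75.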

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3243261
