A Representativeness-informed Model for Research Record Selection from Electronic Medical Record Systems.

Victor A Borza, Ellen Wright Clayton, Murat Kantarcioglu, Yevgeniy Vorobeychik, Bradley A Malin

AMIA Annu Symp Proc

Published: May 2023

Scientific and clinical studies have a long history of bias in the recruitment of underprivileged and minority populations. This underrepresentation leads to inaccurate, inapplicable, and non-generalizable results. Electronic medical record (EMR) systems, which now drive much research, often represent these groups poorly. We introduce a method for quantifying representativeness using information-theoretic measures, along with an algorithmic approach for selecting a record cohort that is more representative than one chosen at random when resource limitations preclude researchers from reviewing every record in the database. We apply this method to select cohorts of 2,000 to 20,000 records from a large EMR database (more than 2 million records) at the Vanderbilt University Medical Center and assess representativeness based on age, ethnicity, race, and gender. Compared to random selection, which will on average mirror the EMR database demographics, we find that a representativeness-informed approach can compose a cohort of records that is approximately 5.8 times more representative.
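
The abstract does not say which information-theoretic measure or selection algorithm the authors use. As a rough illustration only, the sketch below assumes that representativeness is scored by the Kullback-Leibler divergence between a target population distribution and the cohort's demographic distribution, and that records are chosen by a simple greedy rule. The function names (`kl_divergence`, `greedy_select`), the single `group` attribute, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) in nats. Missing or zero entries of q are floored at eps."""
    return sum(
        pv * math.log(pv / max(q.get(k, 0.0), eps))
        for k, pv in p.items()
        if pv > 0
    )


def distribution(counts, total):
    """Turn group counts into group proportions."""
    return {k: c / total for k, c in counts.items()}


def greedy_select(records, target, cohort_size):
    """Assumed greedy rule: at each step add one record from the group
    that leaves the cohort with the lowest KL divergence from target."""
    pools = {}
    for rec_id, group in records:
        pools.setdefault(group, []).append(rec_id)

    counts = Counter()
    cohort = []
    for _ in range(cohort_size):
        best_group, best_score = None, float("inf")
        for group, pool in pools.items():
            if not pool:
                continue  # no records left in this group
            counts[group] += 1  # tentatively add one record
            score = kl_divergence(target, distribution(counts, len(cohort) + 1))
            counts[group] -= 1  # undo the tentative add
            if score < best_score:
                best_group, best_score = group, score
        if best_group is None:
            break  # database exhausted
        counts[best_group] += 1
        cohort.append(pools[best_group].pop())
    return cohort, counts


# Toy database (hypothetical): 900 records in group "A", 100 in group "B".
records = [(i, "A") for i in range(900)] + [(900 + i, "B") for i in range(100)]
target = {"A": 0.5, "B": 0.5}  # assumed target population proportions

cohort, counts = greedy_select(records, target, cohort_size=100)
cohort_kl = kl_divergence(target, distribution(counts, len(cohort)))

# A random sample mirrors the database on average, so score the database itself.
database_kl = kl_divergence(target, {"A": 0.9, "B": 0.1})
print(dict(counts), cohort_kl, database_kl)
```

In this toy setting the greedy cohort tracks the target proportions while the database itself stays skewed, which is the kind of gap the abstract describes between representativeness-informed and random selection. A real application would score the joint distribution of age, ethnicity, race, and gender rather than a single attribute.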

Full text: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148281 (PMC)

