Objective: To identify stigmatizing language in obstetric clinical notes using natural language processing (NLP).
Materials and Methods: We analyzed electronic health records from birth admissions in the Northeast United States in 2017. We annotated 1771 clinical notes to generate the initial gold standard dataset.
Objectives: To evaluate the effectiveness of Bayesian Improved Surname Geocoding (BISG) and Bayesian Improved First Name Surname Geocoding (BIFSG) in estimating race and ethnicity, and to assess how the resulting estimates influence odds ratios for preterm birth.
Methods: We analyzed electronic health record (EHR) data from hospital birth admissions (N = 9985). We created two simulation sets in which 40% of race and ethnicity data were missing, either at random or with a higher probability of missingness among non-Hispanic Black birthing people who had a preterm birth.
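The two missingness scenarios described above can be illustrated with a minimal sketch. This is not the authors' procedure. The record fields (`race_ethnicity`, `preterm`), the label `"NH Black"`, the function name `mask_race_ethnicity`, and the relative weight of 3.0 are all assumptions made for illustration, and the toy records are synthetic. In the targeted scenario, records in the specified group are simply given a larger sampling weight when choosing which 40% of values to blank out.

```python
import random

def mask_race_ethnicity(records, fraction=0.40, targeted=False, weight=3.0, seed=0):
    """Return copies of `records` with `fraction` of race_ethnicity values set to None.

    targeted=False: every record is equally likely to be masked.
    targeted=True:  records that are "NH Black" with preterm birth are more
                    likely to be masked (relative weight `weight`, an assumed value).
    """
    rng = random.Random(seed)
    n_missing = round(fraction * len(records))

    def sampling_weight(rec):
        if targeted and rec["race_ethnicity"] == "NH Black" and rec["preterm"]:
            return weight
        return 1.0

    # Weighted sampling without replacement (Efraimidis-Spirakis keys):
    # draw u ~ Uniform(0, 1), use key u ** (1 / w), and keep the largest keys.
    keys = [(rng.random() ** (1.0 / sampling_weight(r)), i)
            for i, r in enumerate(records)]
    chosen = {i for _, i in sorted(keys, reverse=True)[:n_missing]}

    # Copy every record so the input list is left unchanged.
    return [dict(r, race_ethnicity=None) if i in chosen else dict(r)
            for i, r in enumerate(records)]

# Synthetic toy data (hypothetical, not from the study): 1000 records,
# 200 labelled "NH Black", of which 50 have preterm birth.
toy_records = [
    {"race_ethnicity": "NH Black" if i % 5 == 0 else "Other",
     "preterm": i % 4 == 0}
    for i in range(1000)
]

random_set = mask_race_ethnicity(toy_records, targeted=False)
targeted_set = mask_race_ethnicity(toy_records, targeted=True)

def missing_rate_in_target_group(original, masked):
    """Share of NH Black preterm records whose race_ethnicity was masked."""
    idx = [i for i, r in enumerate(original)
           if r["race_ethnicity"] == "NH Black" and r["preterm"]]
    return sum(masked[i]["race_ethnicity"] is None for i in idx) / len(idx)

print(missing_rate_in_target_group(toy_records, random_set))
print(missing_rate_in_target_group(toy_records, targeted_set))
```

Both simulated sets have the same overall share of missing values; they differ only in how that missingness is distributed across groups, which is the contrast the two simulation sets are designed to capture.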