Purpose: Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS).
Methods: We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019.