Background: Despite decades of pursuing health equity, racial and ethnic disparities persist in healthcare in America. For cancer specifically, one of the leading observed disparities is worse mortality among non-Hispanic Black patients compared to non-Hispanic White patients across the cancer care continuum. These real-world disparities are reflected in the data used to inform the decisions made to alleviate such inequities. Failing to account for inherently biased data underlying these observations could intensify racial cancer disparities and lead to misguided efforts that fail to appropriately address the real causes of health inequity.

Objective: Estimate the racial/ethnic bias of machine learning models in predicting two-year survival and surgery treatment recommendation for non-small cell lung cancer (NSCLC) patients.

Methods: A Cox survival model was trained to predict two-year survival, and a logit model and three other machine learning models were trained to predict surgery recommendation, all using SEER data from NSCLC patients diagnosed from 2000 to 2018. Models were trained with a 70/30 train/test split, both including and excluding race/ethnicity as a predictor, and evaluated using performance and fairness metrics. The effects of oversampling the training data were also evaluated.
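The abstract does not specify how the training data were oversampled. As a minimal sketch of one common approach, the Python below resamples with replacement within each stratum until every stratum matches the largest one. The function name `oversample_by_stratum`, the choice of (group, outcome) as the stratum, and the toy records are all assumptions made for illustration and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def oversample_by_stratum(records, key, seed=0):
    """Randomly oversample so every stratum reaches the size of the largest.

    Hypothetical helper: `key` maps a record to its stratum label.
    """
    if not records:
        return []
    rng = random.Random(seed)  # fixed seed so the resampling is repeatable

    # Group the records by stratum.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    target = max(len(members) for members in strata.values())

    balanced = []
    for members in strata.values():
        balanced.extend(members)
        # Draw the shortfall with replacement from the same stratum.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy training records (invented); strata are (group, surgery recommended).
records = (
    [{"group": "NHW", "surgery": 1}] * 4
    + [{"group": "NHW", "surgery": 0}] * 3
    + [{"group": "NHB", "surgery": 1}] * 1
    + [{"group": "NHB", "surgery": 0}] * 2
)
balanced = oversample_by_stratum(
    records, key=lambda r: (r["group"], r["surgery"])
)
counts = Counter((r["group"], r["surgery"]) for r in balanced)
print(counts)
```

In a setup like the one described, resampling of this kind would be applied to the training split only, so that the test split still reflects the original population.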

Results: The survival models showed disparate impact towards non-Hispanic Black patients regardless of whether race/ethnicity was used as a predictor. The models including race/ethnicity amplified the disparities observed in the data. Excluding race/ethnicity as a predictor in the survival and surgery recommendation models improved fairness metrics without degrading model performance. Stratified oversampling strategies reduced disparate impact but also reduced model accuracy.
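The abstract names disparate impact but does not define it. A widely used definition is the ratio of favorable-outcome rates between an unprivileged and a privileged group, with values below 1 indicating that the unprivileged group receives the favorable prediction less often. The sketch below assumes that definition; the function name, the group labels "NHB" and "NHW", and the toy predictions are invented for illustration and are not study data.

```python
def disparate_impact(predictions, groups, unprivileged, privileged, favorable=1):
    """Favorable-outcome rate of the unprivileged group divided by that of
    the privileged group (assumed definition, not confirmed by the source)."""

    def favorable_rate(group):
        outcomes = [p for p, g in zip(predictions, groups) if g == group]
        if not outcomes:
            raise ValueError(f"no records for group {group!r}")
        return sum(1 for p in outcomes if p == favorable) / len(outcomes)

    privileged_rate = favorable_rate(privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return favorable_rate(unprivileged) / privileged_rate


# Toy predictions (invented): 1 = surgery recommended, 0 = not recommended.
preds = [1, 0, 0, 1, 1, 1, 0, 1]
groups = ["NHB", "NHB", "NHB", "NHB", "NHW", "NHW", "NHW", "NHW"]

di = disparate_impact(preds, groups, unprivileged="NHB", privileged="NHW")
print(round(di, 3))
```

Here the "NHB" group has a favorable rate of 2/4 and the "NHW" group 3/4, so the ratio is 2/3.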

Conclusion: NSCLC disparities are complex and multifaceted. Yet, even when accounting for age and stage at diagnosis, non-Hispanic Black patients with NSCLC are less often recommended to have surgery than non-Hispanic White patients. Machine learning models amplified the racial/ethnic disparities across the cancer care continuum (which are reflected in the data used to make model decisions). Excluding race/ethnicity lowered the bias of the models but did not affect disparate impact. Developing analytical strategies to improve fairness would in turn improve the utility of machine learning approaches analyzing population-based cancer data.

Source
http://dx.doi.org/10.1007/s10729-024-09691-6
