Investigation of bias in the automated assessment of school violence.

J Biomed Inform

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, USA.

Published: September 2024

AI Article Synopsis

  • The study developed a machine learning algorithm called Automated RIsk Assessment (ARIA) to evaluate the risk of violence in adolescents by analyzing their interview transcripts, addressing potential biases in predictions.
  • Researchers recruited 412 students aged 10-18 from schools across Ohio, Kentucky, Indiana, and Tennessee, using a forensic psychiatrist's assessment as a reference for risk levels.
  • ARIA demonstrated strong predictive performance (AUC of 0.92), and low coefficients of determination for the demographic factors suggest those variables had little influence on its predictions despite the model's high overall accuracy.

Article Abstract

Objectives: Natural language processing and machine learning have the potential to lead to biased predictions. We designed a novel Automated RIsk Assessment (ARIA) machine learning algorithm that assesses risk of violence and aggression in adolescents using natural language processing of transcribed student interviews. This work evaluated the possible sources of bias in the study design and the algorithm, tested how much of a prediction was explained by demographic covariates, and investigated the misclassifications based on demographic variables.

Methods: We recruited students 10-18 years of age enrolled in middle or high schools in Ohio, Kentucky, Indiana, and Tennessee. The reference standard outcome was determined by a forensic psychiatrist as either a "high" or "low" risk level. ARIA used L2-regularized logistic regression to predict a risk level for each student from contextual and semantic features. We conducted three analyses: a PROBAST assessment of risk of bias in the study design; an analysis of demographic variables as covariates; and a prediction analysis. Covariates were included in the linear regression analyses and comprised race, sex, ethnicity, household education, annual household income, age at the time of visit, and utilization of public assistance.
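As an illustration of the modeling step described above, the following is a minimal sketch of an L2-regularized logistic regression classifier over interview transcripts, assuming scikit-learn. The TF-IDF features are a stand-in assumption: the abstract does not specify how ARIA's contextual and semantic features are constructed.

```python
# Hypothetical sketch: L2-regularized logistic regression over transcript
# text, standing in for ARIA. TF-IDF features are an assumption; the
# study's contextual/semantic feature pipeline is not described here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

transcripts = [
    "interview transcript for student one ...",  # placeholder text
    "interview transcript for student two ...",
]
labels = [1, 0]  # 1 = "high" risk, 0 = "low" risk (forensic-psychiatrist reference)

model = make_pipeline(
    TfidfVectorizer(),                        # stand-in feature extraction
    LogisticRegression(penalty="l2", C=1.0),  # L2 regularization, as in the Methods
)
model.fit(transcripts, labels)
risk_scores = model.predict_proba(transcripts)[:, 1]  # continuous risk score per student
```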

Results: We recruited 412 students from 204 schools; 387 students with complete demographic information were included in the analysis. ARIA performed with an AUC of 0.92, sensitivity of 71%, specificity of 95%, and NPV of 77%. Individual linear regressions yielded a coefficient of determination below 0.08 for every demographic variable. When all demographic variables were used together to predict ARIA's risk assessment score, the multiple linear regression model yielded a coefficient of determination of 0.189. ARIA's False Negative Rate (FNR) was lower in the Black subgroup (15.2%, CI [0 - 40]) and in the Other races subgroup (12.7%, CI [0 - 41.4]) than in the White subgroup (26.1%, CI [14.1 - 41.8]).
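To make the two bias analyses concrete, here is a hedged sketch, assuming pandas and scikit-learn, of (a) regressing ARIA's continuous risk score on demographic covariates to obtain a coefficient of determination and (b) computing the false negative rate within each racial subgroup. The data frame and column names are hypothetical, not the study's dataset.

```python
# Hypothetical sketch of the two bias checks; the data and column names
# are illustrative only.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "aria_score": [0.9, 0.2, 0.7, 0.1],                 # ARIA's continuous risk score
    "high_risk":  [1, 0, 1, 0],                         # reference standard label
    "predicted":  [1, 0, 0, 0],                         # ARIA's binary risk call
    "race":       ["White", "Black", "White", "Other"],
    "age":        [14, 12, 16, 11],
})

# (a) Coefficient of determination: how much of ARIA's score the
# demographic covariates explain (the study reports R^2 = 0.189 when all
# covariates enter a multiple linear regression).
X = pd.get_dummies(df[["race", "age"]], drop_first=True)
r_squared = LinearRegression().fit(X, df["aria_score"]).score(X, df["aria_score"])
print(f"R^2 = {r_squared:.3f}")

# (b) False negative rate per subgroup: FNR = FN / (FN + TP), computed
# over students whose reference label is "high" risk.
for race, grp in df.groupby("race"):
    positives = grp[grp["high_risk"] == 1]
    if len(positives):
        fnr = (positives["predicted"] == 0).mean()
        print(f"{race}: FNR = {fnr:.1%}")
```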

Conclusions: Bias assessment is needed to address shortcomings within machine learning. In our work, student race, ethnicity, sex, use of public assistance, and annual household income did not explain ARIA's risk assessment score of students. ARIA will continue to be evaluated regularly with increased subject recruitment.

Download full-text PDF

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11431206
DOI: http://dx.doi.org/10.1016/j.jbi.2024.104709

Publication Analysis

Top Keywords

machine learning: 12
risk assessment: 12
demographic variables: 12
natural language: 8
language processing: 8
study design: 8
risk level: 8
linear regression: 8
annual household: 8
household income: 8

