Mitigating Bias in Radiology Machine Learning: 1. Data Handling.

Radiol Artif Intell

Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905.

Published: September 2022

Minimizing bias is critical to the adoption and implementation of machine learning (ML) in clinical practice. Systematic mathematical biases produce consistent and reproducible differences between the observed and expected performance of ML systems, resulting in suboptimal performance. Such biases can be traced back to various phases of ML development: data handling, model development, and performance evaluation. This report presents 12 suboptimal practices during the data handling phase of an ML study, explains how those practices can lead to biases, and describes what may be done to mitigate them. The authors employ an arbitrary and simplified framework that splits ML data handling into four steps: data collection, data investigation, data splitting, and feature engineering. Examples from the available research literature are provided. A Google Colaboratory Jupyter notebook includes code examples to demonstrate the suboptimal practices and the steps to prevent them.

Keywords: Data Handling, Bias, Machine Learning, Deep Learning, Convolutional Neural Network (CNN), Computer-aided Diagnosis (CAD)

© RSNA, 2022.
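The abstract names data splitting as one of the four data handling steps but does not enumerate the 12 practices, so the following is only an illustrative sketch and not the authors' notebook code. It assumes one commonly discussed pitfall, images from the same patient landing in both the training and test sets, and shows a patient-level split that avoids it. The function name `split_by_patient`, the `(patient_id, image_id)` record layout, and the toy data are all hypothetical.

```python
import random


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that no patient appears in both train and test.

    `records` is a list of (patient_id, image_id) pairs. This layout is a
    hypothetical one chosen for illustration, not taken from the article.
    """
    # Shuffle unique patient IDs, not individual images, so that every
    # image from a given patient ends up on the same side of the split.
    patients = sorted({patient_id for patient_id, _ in records})
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_fraction))
    held_out = set(patients[:n_test])

    train = [r for r in records if r[0] not in held_out]
    test = [r for r in records if r[0] in held_out]
    return train, test


# Toy data (an assumption): 10 patients with 5 images each.
records = [(f"patient_{k % 10}", f"image_{k}") for k in range(50)]
train, test = split_by_patient(records)

train_patients = {p for p, _ in train}
test_patients = {p for p, _ in test}
```

Splitting at the image level instead (shuffling `records` directly) would likely place near-duplicate images from one patient on both sides, which can make test performance look better than it would be on unseen patients.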

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9533091
DOI: http://dx.doi.org/10.1148/ryai.210290

Similar Publications

This paper introduces a novel methodology for evaluating communication performance in rotating electric machines using Received Signal Strength Indication (RSSI) measurements coupled with artificial intelligence. The proposed approach focuses on assessing the quality of wireless signals in the complex, dynamic environment inside these machines, where factors like reflections, metallic surfaces, and rotational movements can significantly impact communication. RSSI is used as a key parameter to monitor real-time signal behavior, enabling a detailed analysis of communication reliability.

CloudSim is a versatile simulation framework for modeling cloud infrastructure components that supports customizable and extensible application provisioning strategies, allowing for the simulation of cloud services. On the other hand, Distributed Acoustic Sensing (DAS) is a ubiquitous technique used for measuring vibrations over an extended region. Data handling in DAS remains an open issue, as many applications need continuous monitoring of a volume of samples whose storage and processing in real time require high-capacity memory and computing resources.

Makeup modifies facial textures and colors, impacting the precision of face anti-spoofing systems. Many individuals opt for light makeup in their daily lives, which generally does not hinder face identity recognition. However, current research in face anti-spoofing often neglects the influence of light makeup on facial feature recognition, notably the absence of publicly accessible datasets featuring light makeup faces.

Hardware-Assisted Low-Latency NPU Virtualization Method for Multi-Sensor AI Systems.

Sensors (Basel)

December 2024

Department of Semiconductor Systems Engineering, Sejong University, Seoul 05006, Republic of Korea.

Recently, AI systems such as autonomous driving and smart homes have become integral to daily life. Intelligent multi-sensors, once limited to single data types, now process complex text and image data, demanding faster and more accurate processing. While integrating NPUs and sensors has improved processing speed and accuracy, challenges like low resource utilization and long memory latency remain.

Generating accurate and contextually rich captions for images and videos is essential for various applications, from assistive technology to content recommendation. However, challenges such as maintaining temporal coherence in videos, reducing noise in large-scale datasets, and enabling real-time captioning remain significant. We introduce MIRA-CAP (Memory-Integrated Retrieval-Augmented Captioning), a novel framework designed to address these issues through three core innovations: a cross-modal memory bank, adaptive dataset pruning, and a streaming decoder.
