Image preprocessing has a significant impact on the performance of deep learning models in medicine; yet, there is no standardized method for DICOM preprocessing. In this study, we investigate the impact of two commonly used image preprocessing techniques, histogram equalization (HE) and values-of-interest look-up-table (VOI-LUT) transformations, on the performance of deep learning classifiers for chest X-rays (CXR). We generated two baseline datasets (raw pixel and standard DICOM processed) from our internal CXR dataset and then enhanced both with HE to create four distinct datasets. Four independent deep learning models for diagnosis of pneumothorax were trained and evaluated on two external datasets. Results reveal that HE enhancement significantly affects model performance, particularly in terms of generalizability. Models trained solely on HE-enhanced datasets exhibit poorer performance on external validation sets, suggesting potential overfitting and information loss. These models also exhibit shortcut learning, relying on spurious correlations in the training data for their predictions. This study highlights the importance of machine learning practitioners being aware of the preprocessing techniques applied to datasets and their potential impacts on model performance, as well as the need for including preprocessing information when sharing datasets. Additionally, this research underscores the necessity of using pixel values closer to clinical standards during dataset curation to improve model robustness and mitigate the risk of information loss.
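The four dataset variants described above (raw, raw plus HE, DICOM processed, DICOM processed plus HE) can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function names, the 8-bit output range, and the use of the DICOM linear window function as a stand-in for the VOI-LUT step are all assumptions made for illustration. Real VOI-LUT handling may use a non-linear table stored in the file and would normally go through a DICOM library.

```python
from collections import Counter
from typing import Dict, List

Image = List[List[int]]  # a 2D grid of integer pixel values


def apply_linear_window(image: Image, center: float, width: float,
                        out_max: int = 255) -> Image:
    """Map stored values to [0, out_max] with the DICOM linear window function.

    Assumption: a linear window stands in for the VOI-LUT step; a DICOM file
    may instead carry an explicit (possibly non-linear) VOI LUT.
    """
    if width <= 1:
        raise ValueError("window width must be greater than 1")
    lower = center - 0.5 - (width - 1) / 2
    upper = center - 0.5 + (width - 1) / 2

    def window(x: int) -> int:
        if x <= lower:
            return 0
        if x > upper:
            return out_max
        return round(((x - (center - 0.5)) / (width - 1) + 0.5) * out_max)

    return [[window(p) for p in row] for row in image]


def equalize_histogram(image: Image, levels: int = 256) -> Image:
    """Global histogram equalization via the cumulative distribution function."""
    flat = [p for row in image for p in row]
    counts = Counter(flat)
    total = len(flat)

    # Cumulative count of pixels at or below each distinct intensity.
    cdf: Dict[int, int] = {}
    running = 0
    for value in sorted(counts):
        running += counts[value]
        cdf[value] = running

    cdf_min = min(cdf.values())
    if total == cdf_min:
        # Constant image: there is no contrast to spread out.
        return [[0 for _ in row] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def build_variants(raw: Image, center: float, width: float) -> Dict[str, Image]:
    """Return the four hypothetical dataset variants for a single image."""
    windowed = apply_linear_window(raw, center, width)
    return {
        "raw": raw,
        "raw_he": equalize_histogram(raw),
        "voi": windowed,
        "voi_he": equalize_histogram(windowed),
    }
```

Note that equalization depends only on the rank order of intensities within an image, so absolute pixel values are discarded. That is one plausible way the information loss mentioned in the abstract could arise, though the sketch itself does not demonstrate it.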


Source
http://dx.doi.org/10.1007/s10278-025-01418-5

Publication Analysis

Top Keywords

deep learning: 12
image preprocessing: 8
performance deep: 8
learning models: 8
preprocessing techniques: 8
model performance: 8
datasets: 6
performance: 5
learning: 5
dicom lut: 4

Similar Publications

Exploring the Role of Immersive Virtual Reality Simulation in Health Professions Education: Thematic Analysis.

JMIR Med Educ

March 2025

Division of Pulmonary, Critical Care, & Sleep Medicine, Department of Medicine, NYU Grossman School of Medicine, 550 First Avenue, 15th Floor, Medical ICU, New York, NY, 10016, United States, 1 2122635800.

Background: Although technology is rapidly advancing in immersive virtual reality (VR) simulation, there is a paucity of literature to guide its implementation into health professions education, and there are no described best practices for the development of this evolving technology.

Objective: We conducted a qualitative study using semistructured interviews with early adopters of immersive VR simulation technology to investigate use and motivations behind using this technology in educational practice, and to identify the educational needs that this technology can address.

Methods: We conducted 16 interviews with VR early adopters.


Objectives: To develop a deep learning (DL) model based on ultrasound (US) images of lymph nodes for predicting cervical lymph node metastasis (CLNM) in postoperative patients with differentiated thyroid carcinoma (DTC).

Methods: We retrospectively collected 352 lymph nodes from 330 patients with cytopathology findings between June 2021 and December 2023 at our institution. The database was randomly divided into training and test cohorts at an 8:2 ratio.


Brain age gap (BAG), the deviation between estimated brain age and chronological age, is a promising marker of brain health. However, the genetic architecture and reliable targets for brain aging remain poorly understood. In this study, we estimate magnetic resonance imaging (MRI)-based brain age using deep learning models trained on the UK Biobank and validated with three external datasets.


There is great interest in using genetically tractable organisms such as Drosophila to gain insights into the regulation and function of sleep. However, sleep phenotyping in Drosophila has largely relied on simple measures of locomotor inactivity. Here, we present FlyVISTA, a machine learning platform to perform deep phenotyping of sleep in flies.


We use a combination of Brownian dynamics (BD) simulation results and deep learning (DL) strategies for the rapid identification of large structural changes caused by missense mutations in intrinsically disordered proteins (IDPs). We used ∼6500 IDP sequences from MobiDB database of length 20-300 to obtain gyration radii from BD simulation on a coarse-grained single-bead amino acid model (HPS2 model) used by us and others [Dignon, G. L.

