Background: Deep learning models for healthcare applications, including digital pathology, have grown in scope and importance in recent years. Many of these models have been trained on the digital images of The Cancer Genome Atlas (TCGA), or use them as a validation source. One crucial factor that seems to have been widely ignored is the internal bias that originates from the institutions that contributed whole slide images (WSIs) to the TCGA dataset, and its effect on models trained on this dataset.

Methods: 8,579 paraffin-embedded, hematoxylin and eosin stained digital slides were selected from the TCGA dataset. More than 140 medical institutions (acquisition sites) contributed to this dataset. Two deep neural networks (DenseNet121 and KimiaNet) were used to extract deep features at 20× magnification. DenseNet was pre-trained on non-medical objects. KimiaNet has the same structure but was trained for cancer type classification on TCGA images. The extracted deep features were later used to detect each slide's acquisition site, and also for slide representation in image search.
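
The site-detection step described in the Methods can be illustrated with a small probe: fit a simple classifier on per-slide feature vectors and check whether it predicts the acquisition site better than chance. The abstract does not say which classifier was used, so the nearest-centroid rule below is a stand-in, not the authors' method. The feature vectors are synthetic, and the function names (`nearest_centroid_site_probe`, `make_synthetic_features`) and the `shift` parameter are hypothetical.

```python
import random


def nearest_centroid_site_probe(features, sites, train_fraction=0.7, seed=0):
    """Estimate how well acquisition site can be predicted from feature vectors.

    Assumption: a nearest-centroid rule stands in for whatever classifier
    the study used; the abstract does not specify it.
    """
    rng = random.Random(seed)
    indices = list(range(len(features)))
    rng.shuffle(indices)
    split = int(len(indices) * train_fraction)
    train, test = indices[:split], indices[split:]

    # Mean feature vector (centroid) per site, computed on training slides only.
    sums, counts = {}, {}
    for i in train:
        site = sites[i]
        if site not in sums:
            sums[site] = [0.0] * len(features[i])
            counts[site] = 0
        sums[site] = [s + x for s, x in zip(sums[site], features[i])]
        counts[site] += 1
    centroids = {s: [v / counts[s] for v in vec] for s, vec in sums.items()}

    def predict(vec):
        # Assign the site whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(vec, centroids[s])),
        )

    correct = sum(predict(features[i]) == sites[i] for i in test)
    return correct / len(test)


def make_synthetic_features(n_sites=3, per_site=60, dim=8, shift=3.0, seed=1):
    """Synthetic stand-in for deep features: each site adds its own offset."""
    rng = random.Random(seed)
    features, sites = [], []
    for site in range(n_sites):
        for _ in range(per_site):
            features.append([rng.gauss(site * shift, 1.0) for _ in range(dim)])
            sites.append(f"site_{site}")
    return features, sites


features, sites = make_synthetic_features()
accuracy = nearest_centroid_site_probe(features, sites)
print(f"site-prediction accuracy: {accuracy:.2f} (chance is about {1 / 3:.2f})")
```

On real data, `features` would hold one deep-feature vector per slide and `sites` the contributing institution. Accuracy well above chance on held-out slides would indicate that the features carry site-specific signal.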

Results: DenseNet's deep features could distinguish acquisition sites with 70% accuracy, whereas KimiaNet's deep features could reveal acquisition sites with more than 86% accuracy. These findings suggest that there are acquisition-site-specific patterns that deep neural networks can pick up. It was also shown that these medically irrelevant patterns can interfere with other applications of deep learning in digital pathology, namely image search.

Conclusions: This study shows that there are acquisition-site-specific patterns that can be used to identify tissue acquisition sites without any explicit training. Furthermore, a model trained for cancer subtype classification was observed to exploit such medically irrelevant patterns when classifying cancer types. Digital scanner configuration and noise, tissue stain variation and artifacts, and source-site patient demographics are among the factors that likely account for the observed bias. Researchers should therefore be cautious of such bias when using histopathology datasets to develop and train deep networks.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189924
DOI: http://dx.doi.org/10.1186/s13000-023-01355-3


