Background: The overall prognosis of oral cancer remains poor because over half of patients are diagnosed at advanced stages. Previously reported screening and earlier detection methods for oral cancer still largely rely on health workers' clinical experience, and as yet there is no established method. We aimed to develop a rapid, non-invasive, cost-effective, and easy-to-use deep learning approach for identifying oral cavity squamous cell carcinoma (OCSCC) patients using photographic images.

Methods: We developed an automated deep learning algorithm using cascaded convolutional neural networks to detect OCSCC from photographic images. We included all biopsy-proven OCSCC photographs and normal controls from 44,409 clinical images collected from 11 hospitals around China between April 12, 2006, and Nov 25, 2019. We trained the algorithm on a randomly selected part of this dataset (development dataset) and used the rest for testing (internal validation dataset). Additionally, we curated an external validation dataset comprising clinical photographs from six representative journals in the field of dentistry and oral surgery. We also compared the performance of the algorithm with that of seven oral cancer specialists on a clinical validation dataset. We used the pathological reports as the gold standard for OCSCC identification. We evaluated the algorithm performance on the internal, external, and clinical validation datasets by calculating the area under the receiver operating characteristic curves (AUCs), accuracy, sensitivity, and specificity with two-sided 95% CIs.
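The evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not state how the 95% CIs were computed, so the Wilson score interval used here is an assumption, as are the function names, the 0.5 decision threshold, and the toy labels and scores; none of this is taken from the study's code.

```python
import math
from typing import Dict, Sequence, Tuple


def wilson_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Two-sided Wilson score interval for a proportion (z = 1.96 is roughly 95%).

    Assumption: the source does not say which interval method was used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = OCSCC, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sketch but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, object]:
    """Compute AUC plus accuracy, sensitivity, and specificity with Wilson CIs.

    Assumes both classes are present, so no denominator is zero.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    n = len(y_true)
    return {
        "auc": auc(y_true, scores),
        "accuracy": ((tp + tn) / n, wilson_ci(tp + tn, n)),
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
    }


# Toy example (hypothetical labels and scores, not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
toy_report = summarize(toy_labels, toy_scores)
```

In this sketch each proportion is paired with its interval as `(estimate, (lower, upper))`. A bootstrap over images would be an equally plausible way to obtain intervals, particularly for the AUC, which has no interval here.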

Findings: 1469 intraoral photographic images were used to validate our approach. The deep learning algorithm achieved an AUC of 0·983 (95% CI 0·973-0·991), sensitivity of 94·9% (0·915-0·978), and specificity of 88·7% (0·845-0·926) on the internal validation dataset (n = 401), and an AUC of 0·935 (0·910-0·957), sensitivity of 89·6% (0·847-0·942) and specificity of 80·6% (0·757-0·853) on the external validation dataset (n = 402). For a secondary analysis on the internal validation dataset, the algorithm presented an AUC of 0·995 (0·988-0·999), sensitivity of 97·4% (0·932-1·000) and specificity of 93·5% (0·882-0·979) in detecting early-stage OCSCC. On the clinical validation dataset (n = 666), our algorithm achieved comparable performance to that of the average oral cancer expert in terms of accuracy (92·3% [0·902-0·943] vs 92·4% [0·912-0·936]), sensitivity (91·0% [0·879-0·941] vs 91·7% [0·898-0·934]), and specificity (93·5% [0·909-0·960] vs 93·1% [0·914-0·948]). The algorithm also achieved significantly better performance than that of the average medical student (accuracy of 87·0% [0·855-0·885], sensitivity of 83·1% [0·807-0·854], and specificity of 90·7% [0·889-0·924]) and the average non-medical student (accuracy of 77·2% [0·757-0·787], sensitivity of 76·6% [0·743-0·788], and specificity of 77·9% [0·759-0·797]).

Interpretation: Automated detection of OCSCC by a deep-learning-powered algorithm is a rapid, non-invasive, low-cost, and convenient method, which yielded comparable performance to that of human specialists and has the potential to be used as a clinical tool for fast screening, earlier detection, and therapeutic efficacy assessment of the cancer.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7599313
DOI: http://dx.doi.org/10.1016/j.eclinm.2020.100558
