An application of machine learning based on real-world data: Mining features of fibrinogen in clinical stages of lung cancer between sexes.

Ann Transl Med

Department of Thoracic Surgery, The Affiliated Cancer Hospital of Nanjing Medical University & Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research, Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing, China.

Published: April 2021

Background: Lung cancer is among the malignant tumors most threatening to human health and life. Using a variety of machine learning algorithms and statistical analyses, this paper explores and evaluates new indicators for the early diagnosis of lung cancer, together with their diagnostic performance, in large samples of real-world clinical data.

Methods: We applied machine learning methods, including minimum description length (MDL), naive Bayes (NB), k-means (KM), nonnegative matrix factorization (NMF), and decision tree (DT), to a large sample of 2,502 patients to build a classification model and systematically explore differences in fibrinogen levels across clinical stages of lung cancer between the sexes. We also assessed the reliability of the model by testing it on a validation cohort of 447 patients. This report adheres to the "Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis" (TRIPOD) statement for the reporting of prediction models.
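
Performance on a validation cohort such as the one described above is commonly summarized by the area under the ROC curve (AUC). The following is a minimal sketch of that metric in plain Python, not the authors' implementation: the function name `roc_auc` and the toy labels and scores are assumptions for illustration only and are not study data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case
    (ties count as one half). Labels must be 0 (negative) or 1 (positive).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the study):
# 1 = middle-late stage, 0 = early stage.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in cohort size, which is fine for a few hundred validation patients; a rank-based formulation would be preferable for much larger cohorts.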

Results: The analysis revealed significant differences between the early-stage and middle- to late-stage lung cancer groups in fibrinogen levels, pleural effusion, chlorine levels, albumin-globulin (A-G) ratio, glutamic-oxaloacetic transaminase and alkaline phosphatase levels, and sex composition. The classification model combining fibrinogen, alkaline phosphatase and sex demonstrated good performance, with an AUC of 73.5%. In males, a fibrinogen level of 2.94 g/L could provisionally serve as the upper limit for the early-stage lung cancer group, whereas a level of 3.91 g/L could be used as a preliminary lower-limit reference threshold for middle- to late-stage lung cancer. In females, 3.91 g/L could also serve as the upper limit of the critical value for early-stage lung cancer.
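
The sex-specific fibrinogen thresholds reported above can be written as a simple rule. This is only a sketch under stated assumptions: the function name `fibrinogen_stage_hint`, the inclusive boundaries, and the "indeterminate" label for values the abstract does not cover are all illustrative choices, and the rule is not a clinical tool.

```python
MALE_EARLY_UPPER = 2.94  # g/L, upper limit for early stage in males (from the abstract)
SHARED_CUTOFF = 3.91     # g/L, lower limit for middle-late stage in males and
                         # upper limit for early stage in females (from the abstract)


def fibrinogen_stage_hint(sex: str, fibrinogen_g_per_l: float) -> str:
    """Map sex and fibrinogen level (g/L) to a preliminary stage hint.

    Assumption: boundaries are treated as inclusive; the abstract does not
    say how values exactly at a threshold should be handled.
    """
    if fibrinogen_g_per_l < 0:
        raise ValueError("fibrinogen level cannot be negative")

    if sex == "male":
        if fibrinogen_g_per_l <= MALE_EARLY_UPPER:
            return "early"
        if fibrinogen_g_per_l >= SHARED_CUTOFF:
            return "middle-late"
        return "indeterminate"  # between the two male thresholds

    if sex == "female":
        if fibrinogen_g_per_l <= SHARED_CUTOFF:
            return "early"
        return "indeterminate"  # the abstract gives no female lower limit

    raise ValueError("sex must be 'male' or 'female'")


print(fibrinogen_stage_hint("male", 2.5))    # early
print(fibrinogen_stage_hint("male", 3.4))    # indeterminate
print(fibrinogen_stage_hint("female", 3.5))  # early
```

Keeping the two thresholds as named constants makes it clear that the same 3.91 g/L value plays a different role in each sex.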

Conclusions: An integrated application based on supervised and unsupervised machine learning algorithms could effectively explore the potential links contained in the clinical data and reveal the differences in fibrinogen levels in different clinical stages of lung cancer between the sexes, which could provide a new reference basis for lung cancer staging.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8106088
DOI: http://dx.doi.org/10.21037/atm-20-4704
