Identification and verification of four candidate biomarkers for early diagnosis of osteoarthritis by machine learning.

Heliyon

Department of Molecular Orthopaedics, National Center for Orthopaedics, Beijing Research Institute of Traumatology and Orthopaedics, Beijing Jishuitan Hospital, Capital Medical University, Beijing, 100035, China.

Published: August 2024

AI Article Synopsis

  • The study investigated potential diagnostic biomarkers for osteoarthritis (OA) by applying machine learning algorithms to datasets from the Gene Expression Omnibus (GEO).
  • The research identified 251 differentially expressed genes (DEGs) and a combined diagnostic model for OA whose performance varied by dataset: an AUC of 0.986 in the training set, 1.000 in the validation set, and 0.836 in the authors' own test set.
  • Ultimately, the findings provide insights into OA that could help guide future research into its underlying mechanisms.

Article Abstract

Background: Osteoarthritis (OA) is a common chronic joint disease. This study aimed to investigate possible OA diagnostic biomarkers and to verify their significance in clinical samples.

Methods: We used three datasets from the Gene Expression Omnibus (GEO) database as the training set. We first identified differentially expressed genes (DEGs) and then screened candidate diagnostic biomarkers with three machine learning algorithms: Random Forest, Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, and Support Vector Machine-Recursive Feature Elimination (SVM-RFE). Another GEO dataset was used as the validation set. The test set consisted of RNA-sequenced peripheral blood samples collected from patients and healthy donors. Blood samples and chondrocytes were collected for quantitative real-time PCR to confirm expression levels. Receiver operating characteristic (ROC) curves were generated for individual and combined biomarkers.
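The screening step described in the Methods, keeping only the genes selected by all three algorithms, can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' pipeline: the function name consensus_features, the top_k cutoff, the regularization strength C, and the placeholder gene names are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def consensus_features(X, y, names, top_k=10, seed=0):
    """Return the names of features chosen by all three selectors.

    top_k and the LASSO strength C are illustrative values (assumptions),
    not settings taken from the paper.
    """
    X = StandardScaler().fit_transform(X)

    # Random Forest: keep the top_k features by impurity-based importance.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    rf_sel = {int(i) for i in np.argsort(rf.feature_importances_)[::-1][:top_k]}

    # LASSO logistic regression: keep features with non-zero coefficients.
    lasso = LogisticRegression(
        penalty="l1", solver="liblinear", C=0.5, random_state=seed
    ).fit(X, y)
    lasso_sel = {int(i) for i in np.flatnonzero(lasso.coef_[0])}

    # SVM-RFE: recursively drop the weakest features of a linear SVM.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=top_k).fit(X, y)
    svm_sel = {int(i) for i in np.flatnonzero(rfe.support_)}

    # Candidates are the features that every algorithm agrees on.
    common = rf_sel & lasso_sel & svm_sel
    return sorted(names[i] for i in common)


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (samples x genes).
    X, y = make_classification(
        n_samples=120, n_features=50, n_informative=5,
        n_redundant=0, shuffle=False, random_state=0,
    )
    gene_names = [f"gene_{i}" for i in range(X.shape[1])]  # hypothetical names
    print(consensus_features(X, y, gene_names))
```

Taking the intersection is a conservative choice: a gene survives only if a tree ensemble, a sparse linear model, and a margin-based model all rank it as informative, which reduces the chance of carrying forward an artifact of any single method.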

Results: In total, 251 DEGs were screened, where , and were screened by all three algorithms. The area under the curve (AUC) of various biomarkers in our test set did not reach as high as that in public datasets. exhibited highest AUC of 0.947 in the training set but 0.691 in our test set, while the favorable combined model comprising , , and demonstrated an AUC of 0.986 in the training set, 1.000 in the validation set and 0.836 in our test set.
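The AUC values reported above can be read as the probability that a randomly chosen OA sample receives a higher score than a randomly chosen control. The sketch below computes the AUC directly from that definition (the Mann-Whitney formulation) in plain Python. The function name roc_auc and the toy labels and scores are assumptions for illustration only and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (control) or 1 (case).
    scores: model outputs or expression values, higher means more case-like.
    Ties between a case and a control count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (made-up numbers): one case scores below one control.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))  # 3 of 4 case/control pairs ordered correctly
```

An AUC of 1.000, as reported for the validation set, means every case outscored every control in that dataset; a lower value on an independent cohort, such as the 0.836 in the test set, is the more realistic estimate of how the model generalizes.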

Conclusion: We identified a combined model for early diagnosis of OA that includes , , and . This finding offers new avenues for further exploration of mechanisms underlying OA.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11328075
DOI: http://dx.doi.org/10.1016/j.heliyon.2024.e35121

Publication Analysis

Top Keywords

training set: 12
test set: 12
early diagnosis: 8
machine learning: 8
diagnostic biomarkers: 8
set: 8
validation set: 8
blood samples: 8
combined model: 8
identification verification: 4
