Background And Purpose: Studies investigating the application of Artificial Intelligence (AI) in the field of radiotherapy exhibit substantial variations in quality. The goal of this study was to assess the transparency and bias of articles, with a specific focus on AI-based segmentation and treatment planning, scored using modified PROBAST and TRIPOD checklists, in order to provide recommendations for future guideline developers and reviewers.

Materials And Methods: The TRIPOD and PROBAST checklist items were discussed and modified using a Delphi process. After consensus was reached, 2 groups of 3 co-authors scored 2 articles to evaluate usability and further optimize the adapted checklists. Finally, 10 articles were scored by all co-authors. Fleiss' kappa was calculated to assess the reliability of agreement between observers.
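The inter-observer agreement measure used above, Fleiss' kappa, compares observed agreement across multiple raters to the agreement expected by chance. A minimal sketch of the standard formula is shown below; the rating table is a hypothetical illustration, not data from the article.

```python
# Minimal sketch of Fleiss' kappa for agreement among multiple raters.
# table[i][j] = number of raters assigning subject i to category j;
# all subjects are assumed to be rated by the same number of raters.

def fleiss_kappa(table):
    N = len(table)                 # number of subjects (e.g., checklist items)
    n = sum(table[0])              # raters per subject (assumed constant)
    k = len(table[0])              # number of rating categories
    # Mean observed agreement across subjects.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in table
    ) / N
    # Chance agreement from the marginal category proportions.
    p_j = [sum(row[j] for row in table) / (N * n) for j in range(k)]
    P_e = sum(p * p for p in p_j)
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical example: 4 items scored by 6 observers into 3 categories
# (e.g., "yes", "no", "unclear").
ratings = [
    [6, 0, 0],
    [4, 2, 0],
    [2, 2, 2],
    [0, 5, 1],
]
print(round(fleiss_kappa(ratings), 3))
```

Values near 0 indicate agreement no better than chance, while values of 0.4 or more are commonly read as at least moderate agreement, which is the threshold referenced in the Results.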

Results: Three of the 37 TRIPOD items and 5 of the 32 PROBAST items were deemed irrelevant. General terminology in the items (e.g., multivariable prediction model, predictors) was modified to align with AI-specific terms. After the first scoring round, further improvements of the items were formulated, e.g., by preventing the use of sub-questions or subjective words and by adding clarifications on how to score an item. Using the final consensus list to score the 10 articles, only 2 of the 61 items resulted in a statistically significant kappa of 0.4 or more, demonstrating substantial agreement. For 41 items no statistically significant kappa was obtained, indicating that the level of agreement among observers could be attributed to chance alone.

Conclusion: Our study showed low reliability scores with the adapted TRIPOD and PROBAST checklists. Although such checklists have shown great value during model development and reporting, these findings raise concerns about the applicability of such checklists for objectively scoring scientific articles on AI applications. When developing or revising guidelines, it is essential to consider their applicability to scoring articles without introducing bias.


Source: http://dx.doi.org/10.1016/j.radonc.2024.110196

