Background: A sufficient sample size is crucial when developing a clinical prediction model. We reviewed how sample size was reported in oncology studies that developed prediction models for binary outcomes using machine learning (ML) methods, and compared the sample size used to develop each model with the minimum sample size required to develop a regression-based model (N).

Methods: We searched Medline (via OVID) for studies published in December 2022 that developed a prediction model using ML methods. We reviewed how sample size was justified. We calculated N, the minimum sample size required to develop a regression-based model, and compared it with the sample size used to develop each model.

Results: Only one of 36 included studies justified its sample size. We were able to calculate N for 17 (47%) studies. Five of these 17 studies met N, a sample size sufficient to estimate the overall risk precisely and to minimise overfitting. The median deficit was 302 participants with the event (n = 17; range: -21331 to 2298) when developing the ML models. A further three of the 17 studies met only the sample size required to estimate the overall risk precisely.
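The two criteria named in the Results (precise estimation of the overall risk, and minimising overfitting) can be illustrated with a short calculation. The sketch below is not taken from the abstract, which does not state the formulas used. It assumes the widely used closed-form criteria for binary-outcome regression models: a confidence-interval margin for the overall risk, and a target uniform shrinkage factor based on the anticipated Cox-Snell R-squared. The function names, the default margin of 0.05, and the default shrinkage of 0.9 are illustrative assumptions.

```python
import math


def n_for_overall_risk(prevalence, margin=0.05, z=1.96):
    """Participants needed to estimate the overall outcome risk to within
    +/- margin (assumed criterion: n = (z / margin)^2 * p * (1 - p))."""
    return math.ceil((z / margin) ** 2 * prevalence * (1 - prevalence))


def n_for_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Participants needed so that the expected uniform shrinkage factor is
    at least `shrinkage` (assumed criterion:
    n = P / ((S - 1) * ln(1 - R2_cs / S)))."""
    return math.ceil(n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


def minimum_n(prevalence, n_params, r2_cs, margin=0.05, shrinkage=0.9):
    """Minimum sample size: the larger of the two criteria above."""
    return max(
        n_for_overall_risk(prevalence, margin),
        n_for_shrinkage(n_params, r2_cs, shrinkage),
    )


# Hypothetical inputs: 50% outcome prevalence, 10 candidate predictor
# parameters, anticipated Cox-Snell R-squared of 0.1.
risk_n = n_for_overall_risk(0.5)
shrink_n = n_for_shrinkage(10, 0.1)
total_n = minimum_n(0.5, 10, 0.1)
print(risk_n, shrink_n, total_n)
```

A study whose development sample falls below the first value cannot estimate the overall risk precisely, and one below the larger value meets neither goal. The deficit reported above would then be the difference between such a minimum and the sample actually used, expressed in participants with the event.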

Conclusion: Studies developing a prediction model using ML in oncology seldom justified their sample size, and sample sizes were often smaller than N. As ML models almost certainly require a larger sample size than regression models, the true deficit is likely larger. We recommend that researchers consider and report their sample size and, at a minimum, meet the sample size required when developing a regression-based model.

Source
http://dx.doi.org/10.1016/j.jclinepi.2025.111675
