The aims of supervised machine learning (ML) applications fall into three broad categories: classification, ranking, and calibration/probability estimation. Many ML methods and evaluation techniques relate to the first two. Nevertheless, there are many applications where an accurate probability estimate is of great importance. Deriving accurate probabilities from the output of an ML method is therefore an active area of research, which has produced several methods for turning a ranking into class probability estimates. In this manuscript we present a method, splined empirical probabilities, based on the receiver operating characteristic (ROC) to complement existing algorithms such as isotonic regression. Unlike most other methods, it works with a cumulative quantity, the ROC curve, and can therefore be added to an ROC analysis with minor effort. On a diverse set of measures of the quality of probability estimates (Hosmer-Lemeshow, Kullback-Leibler divergence, differences in the cumulative distribution function), using simulated and real health care data, our approach compares favourably with the standard calibration method, the pool adjacent violators algorithm used to perform isotonic regression.
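
For readers unfamiliar with the baseline named above, the following is a minimal sketch of isotonic regression via pool adjacent violators, written in Python. It is not the authors' code and does not implement splined empirical probabilities; the function name `pav_calibrate` and its interface are assumptions made for illustration, and tied scores receive no special handling.

```python
def pav_calibrate(scores, labels):
    """Hypothetical helper: isotonic calibration by pool adjacent violators.

    scores: classifier outputs (higher means more likely positive).
    labels: 0/1 class labels aligned with scores.
    Returns (sorted_scores, probs), where probs is non-decreasing.
    """
    # Process examples in order of increasing score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])

    # Each block stores [sum of labels, number of examples].
    blocks = []
    for i in order:
        blocks.append([float(labels[i]), 1])
        # Pool adjacent blocks while the earlier block has the larger mean,
        # i.e. while monotonicity is violated.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            label_sum, count = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count

    # Every example in a block gets the block's mean label as its probability.
    probs = []
    for label_sum, count in blocks:
        probs.extend([label_sum / count] * count)

    sorted_scores = [scores[i] for i in order]
    return sorted_scores, probs
```

With scores `[0.1, 0.2, 0.3, 0.4]` and labels `[0, 1, 0, 1]`, the middle two examples violate monotonicity and are pooled, so the sketch returns probabilities `[0.0, 0.5, 0.5, 1.0]`. The result is a step function of the score, which is one reason a method that yields smoother estimates may be of interest.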

Source: http://dx.doi.org/10.1007/s10729-014-9267-1

