Background: Diagnostic accuracy studies of administrative epilepsy data lack a reliable way to rank case-ascertainment algorithms by accuracy, because it is unclear how to weigh positive predictive value (PPV) against sensitivity (Sens). The large numbers of true negative (TN) instances typical of epilepsy studies inflate negative predictive value (NPV) and specificity (Spec), usually to >90%, so these measures discriminate poorly between algorithms. This study demonstrates the complementary value of two unitary metrics that combine PPV and sensitivity: the critical success index (CSI), from weather forecasting, and the F measure, from machine learning. We reanalyse data published in a diagnostic accuracy study of administrative epilepsy mortality data in Scotland.

Method: CSI was calculated as 1/[(1/PPV) + (1/Sens) - 1]. F measure was calculated as (2 × PPV × Sens)/(PPV + Sens). CSI and F values range from 0 to 1, where 0 indicates inaccurate prediction and 1 indicates perfect accuracy. The published algorithms were reanalysed using both metrics and re-ranked by CSI to allow comparison with the original rankings.
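The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `csi` and `f_measure` are assumptions, inputs are taken as proportions in [0, 1] rather than percentages, and returning 0 when PPV or sensitivity is 0 is a convention chosen here to avoid division by zero.

```python
def csi(ppv: float, sens: float) -> float:
    """Critical success index: 1 / (1/PPV + 1/Sens - 1).

    ppv and sens are proportions in [0, 1]. Returning 0.0 when either
    input is 0 is an assumed convention (the formula is undefined there).
    """
    if ppv <= 0 or sens <= 0:
        return 0.0
    return 1.0 / (1.0 / ppv + 1.0 / sens - 1.0)


def f_measure(ppv: float, sens: float) -> float:
    """F measure: 2 * PPV * Sens / (PPV + Sens), the harmonic mean of the two."""
    if ppv + sens == 0:
        return 0.0
    return 2.0 * ppv * sens / (ppv + sens)


# The two instances quoted in the Results (PPV, Sens as proportions).
for ppv, sens in [(1.00, 0.02), (0.90, 0.91)]:
    print(f"PPV={ppv:.2f} Sens={sens:.2f} "
          f"CSI={csi(ppv, sens):.3f} F={f_measure(ppv, sens):.3f}")
```

Run on the two PPV and sensitivity pairs quoted in the Results, the sketch reproduces the reported CSI and F values to three decimal places.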

Results: CSI scores were conservative (range 0.02-0.826), always less than or equal to the lower of the corresponding PPV (range 39-100%) and sensitivity (range 2-93%). F values were less conservative (range 0.039-0.905), sometimes higher than one of PPV or sensitivity (never both, since F is their harmonic mean), but were always higher than CSI. Low CSI and F values occurred when there was a large difference between PPV and sensitivity, e.g. CSI was 0.02 and F was 0.039 in an instance when PPV was 100% and sensitivity was 2%. Algorithms with both high PPV and high sensitivity performed best in terms of CSI and F measure, e.g. CSI was 0.826 and F was 0.905 in an instance when PPV was 90% and sensitivity was 91%.
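The ordering of CSI, F, PPV and sensitivity reported above follows from the definitions. The derivation below is a sketch that assumes the standard confusion-matrix definitions PPV = TP/(TP + FP) and Sens = TP/(TP + FN), with TP > 0; the symbols TP, FP and FN (true positives, false positives, false negatives) are not used in the abstract itself.

```latex
\begin{align*}
\frac{1}{\mathrm{PPV}} + \frac{1}{\mathrm{Sens}} - 1
  &= \frac{TP + FP}{TP} + \frac{TP + FN}{TP} - 1
   = \frac{TP + FP + FN}{TP}, \\[4pt]
\mathrm{CSI} &= \frac{TP}{TP + FP + FN}
  \;\le\; \min(\mathrm{PPV}, \mathrm{Sens}), \\[4pt]
F &= \frac{2\,\mathrm{PPV}\cdot\mathrm{Sens}}{\mathrm{PPV} + \mathrm{Sens}}
   = \frac{2\,TP}{2\,TP + FP + FN}
   = \frac{2\,\mathrm{CSI}}{1 + \mathrm{CSI}}
  \;\ge\; \mathrm{CSI}, \\[4pt]
\min(\mathrm{PPV}, \mathrm{Sens}) &\le F \le \max(\mathrm{PPV}, \mathrm{Sens}).
\end{align*}
```

The CSI bound holds because its denominator is at least as large as that of either PPV or sensitivity. The F bound against CSI holds because 2/(1 + CSI) is at least 1 whenever CSI is at most 1. Since F is a monotonic function of CSI, the two metrics always rank a set of algorithms in the same order.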

Conclusion: CSI or F measure can combine PPV and sensitivity into a convenient single metric that is easier to interpret and rank than the two measures considered separately. Both metrics prioritise instances where PPV and sensitivity are both high over instances where they differ widely (even if one of them is very high), allowing diagnostic accuracy thresholds based on combined PPV and sensitivity to be determined. Therefore, CSI or F measure may be a helpful complementary metric to report alongside PPV and sensitivity in diagnostic accuracy studies of administrative epilepsy data.
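As an illustration of ranking and thresholding on a single metric, the sketch below orders algorithms by CSI and then applies a cut-off. It uses only the two PPV and sensitivity pairs quoted in the Results. The algorithm labels, the `Algorithm` type, the `rank_by_csi` helper and the 0.5 threshold are all hypothetical and chosen purely for demonstration; none of them comes from the study.

```python
from typing import List, NamedTuple


class Algorithm(NamedTuple):
    name: str    # hypothetical label, not from the study
    ppv: float   # proportion in [0, 1]
    sens: float  # proportion in [0, 1]


def csi(ppv: float, sens: float) -> float:
    """Critical success index; 0.0 if either input is 0 (assumed convention)."""
    if ppv <= 0 or sens <= 0:
        return 0.0
    return 1.0 / (1.0 / ppv + 1.0 / sens - 1.0)


def rank_by_csi(algorithms: List[Algorithm]) -> List[Algorithm]:
    """Return the algorithms sorted from highest to lowest CSI."""
    return sorted(algorithms, key=lambda a: csi(a.ppv, a.sens), reverse=True)


# The two instances quoted in the Results, with made-up labels.
algorithms = [
    Algorithm("high-PPV, low-sensitivity", 1.00, 0.02),
    Algorithm("balanced", 0.90, 0.91),
]

CSI_THRESHOLD = 0.5  # hypothetical cut-off, for illustration only

ranked = rank_by_csi(algorithms)
passing = [a for a in ranked if csi(a.ppv, a.sens) >= CSI_THRESHOLD]

for a in ranked:
    print(f"{a.name}: CSI={csi(a.ppv, a.sens):.3f}")
```

The balanced algorithm ranks first even though the other has a PPV of 100%, which is the behaviour the conclusion describes.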

Source
http://dx.doi.org/10.1016/j.eplepsyres.2023.107275
