Objective: Compare the spike detection performance of three skilled humans and three computer algorithms.

Methods: 40 prolonged EEGs, 35 containing reported spikes, were evaluated. Spikes and sharp waves were marked by the humans and algorithms. Pairwise sensitivity and false positive rates were calculated for each human-human and algorithm-human pair. Differences in human pairwise performance were calculated and compared to the range of algorithm versus human performance differences as a type of statistical Turing test.
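
A minimal sketch of how such pairwise metrics could be computed is shown below. The event-matching tolerance, the function and variable names, and the per-minute normalization are illustrative assumptions; the abstract does not specify the study's exact scoring rules.

```python
# Illustrative sketch only; the matching tolerance (tol_s) and all names are
# assumptions, not the authors' published procedure.

def pairwise_metrics(reference_events, test_events, duration_min, tol_s=0.2):
    """Return (sensitivity, false positives per minute) of `test_events`
    scored against `reference_events` (both lists of event times in seconds)."""
    matched_test = set()   # indices of test events already paired
    matched_ref = 0        # reference events found by the test marker
    for r in reference_events:
        for i, t in enumerate(test_events):
            if i not in matched_test and abs(t - r) <= tol_s:
                matched_ref += 1
                matched_test.add(i)
                break
    sensitivity = matched_ref / len(reference_events) if reference_events else 0.0
    fp_per_min = (len(test_events) - len(matched_test)) / duration_min
    return sensitivity, fp_per_min

# Example: one human's markings as reference, another's as test, over a 30-minute record.
sens, fpr = pairwise_metrics([10.2, 55.0, 120.4], [10.3, 80.1, 121.0, 200.5], 30)
```

In a design like the one described, such metrics would be computed for every human-human and algorithm-human pairing and then summarized, but the actual scoring rules are those of the original paper, not this sketch.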

Results: 5474 individual spike events were marked by the humans. Mean pairwise human sensitivities were 40.0%, 42.1%, and 51.5%, and false positive rates were 0.80, 0.97, and 1.99/min. Only the Persyst 13 (P13) algorithm performed comparably to the humans, with a sensitivity of 43.9% and a false positive rate of 1.65/min. Evaluation of pairwise differences in sensitivity and false positive rate demonstrated that P13 met statistical noninferiority criteria compared to the humans.
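
As a rough illustration of the "statistical Turing test" idea, the hypothetical check below asks whether the algorithm-versus-human pairwise differences fall within the spread of human-versus-human differences widened by a noninferiority margin. The margin, the range-based criterion, and the names are assumptions; the paper's actual statistical procedure is not reproduced here.

```python
# Hypothetical illustration only; `margin` and the range-based criterion are
# assumptions, not the paper's exact noninferiority test.

def noninferior(algo_vs_human_diffs, human_vs_human_diffs, margin=0.0):
    """True if every algorithm-human performance difference lies within the
    range of human-human differences, widened by a noninferiority margin."""
    lo = min(human_vs_human_diffs) - margin
    hi = max(human_vs_human_diffs) + margin
    return all(lo <= d <= hi for d in algo_vs_human_diffs)
```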

Conclusion: Humans had only a fair level of agreement in spike marking. The P13 algorithm was statistically noninferior to the humans.

Significance: This was the first time that a spike detection algorithm and humans performed similarly. The performance comparison methodology utilized here is generally applicable to problems in which skilled human performance is the desired standard and no external gold standard exists.

Source: http://dx.doi.org/10.1016/j.clinph.2016.11.005
