Objective: To investigate the impact of atrial flutter (A) on the atrial arrhythmia classification task. We additionally advocate the use of a subject-based split in future studies in the field, in order to avoid within-subject correlation, which may lead to over-optimistic inferences. Finally, we demonstrate the effectiveness of the classifiers outside of the initially studied circumstances by performing an inter-dataset evaluation of the classifiers on data from different sources.
Methods: ECG signals from two private and three public (two MIT-BIH and Chapman ecgdb) databases were preprocessed and divided into 10 s segments, which were then subjected to feature extraction. Each resulting dataset was divided into a training set and a test set in two ways: by a random split and by a patient split. Classification was performed with the XGBoost classifier and with two benchmark classification models, using both data splits. The models trained on each dataset were then used to make predictions on the test data of the remaining datasets.
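The difference between the two splitting schemes described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `patient_split` and `random_split`, the toy patient identifiers, and the test fraction are all assumptions made for the example. The point it shows is that a patient split assigns every segment of a given patient to one side only, whereas a random split shuffles segments individually, so segments from the same patient can end up in both training and test sets.

```python
import random
from collections import defaultdict


def patient_split(patient_ids, test_fraction=0.2, seed=0):
    """Split segment indices so that no patient appears in both sets.

    patient_ids[i] is the (hypothetical) patient label of segment i.
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients, not segments, then hold out whole patients.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train_idx = [i for p in patients if p not in test_patients
                 for i in by_patient[p]]
    test_idx = [i for p in patients if p in test_patients
                for i in by_patient[p]]
    return sorted(train_idx), sorted(test_idx)


def random_split(n_segments, test_fraction=0.2, seed=0):
    """Split segment indices at random, ignoring which patient they came from."""
    idx = list(range(n_segments))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_segments * test_fraction))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


# Toy data (assumed): 12 ten-second segments from 5 patients.
segment_patients = ["p1"] * 4 + ["p2"] * 3 + ["p3"] * 2 + ["p4"] * 1 + ["p5"] * 2

train_idx, test_idx = patient_split(segment_patients, test_fraction=0.4, seed=1)
train_patients = {segment_patients[i] for i in train_idx}
test_patients = {segment_patients[i] for i in test_idx}
print("patients shared between sets:", train_patients & test_patients)
```

With many segments per patient, as in the MIT-BIH data, a random split almost always places segments from the same patient on both sides, which is the within-subject correlation the abstract warns about.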
Results: The XGBoost model outperformed the benchmark models on all datasets. However, its performance varied across datasets, with accuracy ranging from 70.6% to 89.4%, sensitivity from 61.4% to 76.8%, and specificity from 87.3% to 95.5%. When comparing the patient split with the random split, no significant difference was seen in the two private datasets and the Chapman dataset, where the number of samples per patient is low. In the MIT-BIH dataset, however, where the average number of samples per patient is approximately 1300, a noticeable disparity was identified. The accuracy, sensitivity, and specificity obtained with the random split in this dataset (93.6%, 86.4%, and 95.9%, respectively) decreased to 88%, 61.4%, and 89.8% with the patient split, with the largest drop being in atrial flutter (A) sensitivity, from 71% to 5.4%. The inter-dataset scores were also significantly lower than their intra-dataset counterparts across all datasets.
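For readers less familiar with the reported metrics, per-class sensitivity and specificity can be computed one-vs-rest from true and predicted labels, as in the sketch below. The abstract does not state how the overall sensitivity and specificity were aggregated across classes, so this shows only the per-class definition; the function name, the label strings, and the toy predictions are assumptions made for illustration and are not results from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive segment correctly detected
            else:
                fn += 1  # positive segment missed
        else:
            if pred == positive:
                fp += 1  # other rhythm wrongly labelled as positive
            else:
                tn += 1  # other rhythm correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (assumed): "AFL" = atrial flutter, "AF" = atrial fibrillation,
# "N" = any other rhythm.
y_true = ["AF", "AF", "AFL", "AFL", "N", "N", "AFL", "AF"]
y_pred = ["AF", "AF", "AF", "AFL", "N", "N", "AF", "N"]

sens, spec = sensitivity_specificity(y_true, y_pred, positive="AFL")
print(f"flutter sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy case two of the three flutter segments are predicted as fibrillation, so flutter sensitivity is low while specificity stays high. A class can therefore be largely missed even when overall accuracy looks acceptable.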
Conclusions: Computer-aided diagnosis (CAD) systems have great potential to assist physicians in the reliable, precise, and efficient detection of arrhythmias. However, although compelling research has been done in the field, yielding models with excellent performance on their own datasets, we show that these results may be over-optimistic. Our study gives insight into the difficulty of detecting atrial flutter (A) on several datasets and shows the need for a higher representation of this class in public datasets. Furthermore, we show the necessity of a more structured evaluation of model performance, using a patient-based split and an inter-dataset testing scheme, to avoid the problem of within-subject correlation, which may lead to misleadingly high scores. Finally, we stress that progress in this field requires the creation and use of datasets with more patients and a more balanced representation of classes.
DOI: http://dx.doi.org/10.1016/j.cmpb.2022.106901