From early in the coronavirus disease 2019 (COVID-19) pandemic, there was interest in using machine learning methods to predict COVID-19 infection status based on vocal audio signals, for example, cough recordings. However, early studies had limitations in terms of data collection and of how the performances of the proposed predictive models were assessed. This article describes how these limitations have been overcome in a study carried out by the Turing-RSS Health Data Laboratory and the UK Health Security Agency.
View Article and Find Full Text PDFThe UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up and help beat coronavirus' digital survey alongside demographic, symptom and self-reported respiratory condition data.
View Article and Find Full Text PDF