Background: Outbreaks of emerging infectious diseases are difficult to detect with the existing surveillance system. Here we investigate the utility of the Baidu Search Index, an indicator of a keyword's search volume on Baidu, for early warning and for predicting the epidemic trend of COVID-19.

Methods: The daily number of cases and the Baidu Search Index of 8 keywords (weighted by population) from December 1, 2019 to March 15, 2020 were collected and analyzed using time series analysis and Spearman correlation with different time lags. To predict the daily number of COVID-19 cases from the Baidu Search Index, a zero-inflated negative binomial regression model was used in phase 1 and a negative binomial regression model in phases 2 and 3, based on the characteristics of the independent variable.
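The lagged Spearman analysis described above can be illustrated with a minimal sketch. It is not the authors' code: the function names (`average_ranks`, `spearman`, `lagged_spearman`) and the toy series are assumptions made for illustration, and the study's actual software and data handling are not stated here. The sketch correlates the search index on day t with the case count on day t + lag and picks the lag with the highest coefficient.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one series is constant
    return cov / (sx * sy)


def lagged_spearman(search_index, cases, max_lag):
    """Correlate search_index[t] with cases[t + lag] for lag = 0..max_lag."""
    if len(search_index) != len(cases):
        raise ValueError("series must have equal length")
    result = {}
    for lag in range(max_lag + 1):
        x = search_index[: len(search_index) - lag]
        y = cases[lag:]
        result[lag] = spearman(x, y)
    return result


# Toy data (hypothetical, not from the study): cases echo the search
# index two days later.
search = [1, 3, 2, 5, 4, 7, 6, 9, 8, 10]
cases = [0, 0] + search[:8]

rho_by_lag = lagged_spearman(search, cases, max_lag=4)
best_lag = max(rho_by_lag, key=rho_by_lag.get)
```

On this toy input the highest coefficient occurs at a lag of 2 days, matching how the series was built. With real surveillance data the lag with the strongest correlation would serve only as a candidate lead time, to be checked against the regression models.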

Results: The Baidu Search Index of all keywords was significantly higher in Wuhan than in Hubei (excluding Wuhan) and China (excluding Hubei). Before the causative pathogen was identified, the search volume of "Influenza" and "Pneumonia" in Wuhan increased with the number of new onset cases, with correlation coefficients of 0.69 and 0.59, respectively. After the pathogen was made public but before COVID-19 was classified as a notifiable disease, the search volume of "SARS", "Pneumonia" and "Coronavirus" in all study areas increased with the number of new onset cases, with correlation coefficients of 0.69 to 0.89, while "Influenza" became negatively correlated (r: -0.56 to -0.64). After COVID-19 was closely monitored, the Baidu Search Index of "COVID-19", "Pneumonia", "Coronavirus", "SARS" and "Mask" could predict the epidemic trend with lead times of 15 days, 5 days and 6 days in Wuhan, Hubei (excluding Wuhan) and China (excluding Hubei), respectively. From 21 January to 9 February, the predicted number of cases was 1.84-fold and 4.81-fold higher than the actual number of cases in Wuhan and Hubei (excluding Wuhan), respectively.

Conclusion: The Baidu Search Index could be used for early warning and for predicting the epidemic trend of COVID-19, but the informative search keywords changed across periods. Considering the time lag from onset to diagnosis, especially in areas with a shortage of medical resources, internet search data can be a highly effective supplement to the existing surveillance system.


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11414173 (PMC)
http://dx.doi.org/10.1186/s12879-024-09940-7 (DOI)
