Time series analysis of hemorrhagic fever with renal syndrome in mainland China by using an XGBoost forecasting model.

BMC Infect Dis

Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China.

Published: August 2021

Background: Hemorrhagic fever with renal syndrome (HFRS) continues to attract public attention because of outbreaks in various cities in China. Predicting future outbreaks or epidemics from past incidence data can help health departments take targeted preventive measures in advance. In this study, we propose a multistep prediction strategy based on extreme gradient boosting (XGBoost) for HFRS as an extension of the one-step prediction model. Moreover, the fitting and prediction accuracy of the XGBoost model are compared with those of the autoregressive integrated moving average (ARIMA) model using several evaluation indicators.

Methods: We collected HFRS incidence data for mainland China from 2004 to 2018. The data from 2004 to 2017 were used as the training set to establish the seasonal ARIMA model and the XGBoost model, while the 2018 data were used to test prediction performance. In the multistep XGBoost forecasting model, one-hot encoding was used to handle seasonal features. Furthermore, a series of evaluation indices were calculated to assess the accuracy of the multistep XGBoost forecasting model.
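The feature construction described in the Methods can be illustrated with a minimal sketch. It assumes monthly data, a fixed number of lagged values, and a direct (one model per horizon) multistep setup; none of these details are given in the abstract, and the helper names `month_one_hot` and `make_supervised` are hypothetical. The resulting rows could be passed to any gradient boosting regressor.

```python
# Illustrative sketch only: lag count, horizon, and monthly frequency are
# assumptions, not details taken from the abstract.

def month_one_hot(month):
    """Return a 12-element 0/1 list with a 1 at the given calendar month (1-12)."""
    return [1 if m == month else 0 for m in range(1, 13)]


def make_supervised(series, start_month, n_lags, horizon):
    """Build (features, target) rows for direct h-step-ahead forecasting.

    series: incidence values, one per month, in time order
    start_month: calendar month (1-12) of series[0]
    Each row holds the n_lags most recent values followed by the one-hot
    encoded month of the value being predicted.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        lags = series[t - n_lags + 1 : t + 1]
        target_idx = t + horizon
        target_month = (start_month - 1 + target_idx) % 12 + 1
        X.append(list(lags) + month_one_hot(target_month))
        y.append(series[target_idx])
    return X, y


# Toy data (not the HFRS series): 24 made-up monthly counts starting in January.
toy = [float(v) for v in range(100, 124)]
X, y = make_supervised(toy, start_month=1, n_lags=3, horizon=2)
```

One-hot encoding the target month lets a tree-based model split on individual months, which is one way to represent a bimodal seasonal pattern without imposing an ordering on the months.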

Results: There were 200,237 HFRS cases in China from 2004 to 2018. A long-term downward trend and bimodal seasonality were identified in the original time series. According to the minimum corrected Akaike information criterion (CAIC) value, the ARIMA (3, 1, 0) × (1, 1, 0) model was selected as optimal. In the fitting part, the ME, RMSE, MAE, MPE, MAPE, and MASE indices of the XGBoost model were higher than those of the ARIMA model, whereas the RMSE of the XGBoost model was lower. The prediction performance indicators (MAE, MPE, MAPE, RMSE, and MASE) of the one-step and multistep XGBoost models were all notably lower than those of the ARIMA model.
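The evaluation indices named in the Results have standard definitions, sketched below. The sign convention (error = actual minus forecast) and the naive-forecast scaling used for MASE are common conventions assumed here, not details confirmed by the abstract, and the function names are hypothetical.

```python
import math


def forecast_errors(actual, pred):
    """Per-point errors, assuming the convention error = actual - forecast."""
    return [a - p for a, p in zip(actual, pred)]


def me(actual, pred):
    """Mean error."""
    e = forecast_errors(actual, pred)
    return sum(e) / len(e)


def mae(actual, pred):
    """Mean absolute error."""
    e = forecast_errors(actual, pred)
    return sum(abs(v) for v in e) / len(e)


def rmse(actual, pred):
    """Root mean squared error."""
    e = forecast_errors(actual, pred)
    return math.sqrt(sum(v * v for v in e) / len(e))


def mpe(actual, pred):
    """Mean percentage error (actual values must be nonzero)."""
    return 100.0 * sum((a - p) / a for a, p in zip(actual, pred)) / len(actual)


def mape(actual, pred):
    """Mean absolute percentage error (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)


def mase(actual, pred, train, m=1):
    """MAE scaled by the in-sample MAE of a naive forecast with lag m."""
    scale = sum(abs(train[i] - train[i - m]) for i in range(m, len(train))) / (len(train) - m)
    return mae(actual, pred) / scale


# Toy numbers for illustration only.
actual = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
train = [8.0, 10.0, 12.0, 14.0]
```

For seasonal monthly data, `m=12` would scale by a seasonal naive forecast instead of the one-step naive forecast.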

Conclusions: The multistep XGBoost prediction model showed much better prediction accuracy and model stability than the multistep ARIMA prediction model. The XGBoost model performed better in predicting complicated, nonlinear data such as HFRS incidence. Additionally, multistep prediction models are more practical than one-step prediction models for forecasting infectious diseases.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8377883
DOI: http://dx.doi.org/10.1186/s12879-021-06503-y

Similar Publications

Background: To investigate the risk factors for readmission of elderly patients with coronary artery disease, and to construct and validate a model predicting readmission risk within 3 years for these patients using machine learning methods.

Methods: We selected 575 elderly patients with CHD admitted to the Affiliated Lu'an Hospital of Anhui Medical University from January 2020 to January 2023. Based on whether patients were readmitted within 3 years, they were divided into two groups: those readmitted within 3 years (215 patients) and those not readmitted within 3 years (360 patients).

Machine learning and clinician predictions of antibiotic resistance in Enterobacterales bloodstream infections.

J Infect

December 2024

Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, UK; NIHR Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford, UK; Oxford University Hospitals NHS Foundation Trust, Oxford, UK.

Background: Patients with Gram-negative bloodstream infections are at risk of serious adverse outcomes without active treatment, but identifying who has antimicrobial resistance (AMR) to target empirical treatment is challenging.

Methods: We used XGBoost machine learning models to predict antimicrobial resistance to seven antibiotics in patients with Enterobacterales bloodstream infection. Models were trained using hospital and community data from Oxfordshire, UK, for patients with positive blood cultures between 01-January-2017 and 31-December-2021.

VAE-IF: Deep feature extraction with averaging for fully unsupervised artifact detection in routinely acquired ICU time-series.

Comput Biol Med

December 2024

Institute for Imaging, Data and Communications (IDCOM), School of Engineering, University of Edinburgh, Edinburgh, EH9 3FB, UK.

Artifacts are a common problem in physiological time series collected from intensive care units (ICU) and other settings. They affect the quality and reliability of clinical research and patient care. Manual annotation of artifacts is costly and time-consuming, rendering it impractical.

Using Machine Learning to Predict Weight Gain in Adults: an Observational Analysis From the All of Us Research Program.

J Surg Res

December 2024

Department of Surgery, University of Wisconsin, Madison, Wisconsin; Department of Surgery, William S. Middleton Memorial VA, Madison, Wisconsin.

Introduction: Obesity, defined as a body mass index ≥30 kg/m², is a major public health concern in the United States. Preventative approaches are essential, but they are limited by an inability to accurately predict individuals at highest risk of weight gain. Our objective was to develop accurate weight gain prediction models using the National Institutes of Health All of Us dataset.

Background: Liver hepatocellular carcinoma (LIHC) ranks as the foremost cause of cancer-related deaths worldwide, and its early detection poses considerable challenges. Current prognostic indicators, including alpha-fetoprotein, have notable limitations in their clinical utility, thereby underscoring the necessity for discovering new biomarkers to improve early diagnosis and enable personalized treatment options.

Method: This investigation employed single-cell analysis techniques to identify stem cell-associated genes and assess their prognostic significance for LIHC patients, as well as the efficacy of immunotherapy, utilizing nonnegative matrix factorization (NMF) cluster analysis.
