Objective: To design and assess a method that leverages individuals' temporal data to predict their healthcare costs. To this end, we first used patients' temporal data in fine-grain rather than coarse-grain form. Second, we devised novel spike detection features to extract temporal patterns that improve cost prediction performance. Third, we evaluated the effectiveness of different types of temporal features, based on cost, visit and medical information, for the prediction task.
Materials and Methods: We used three years of medical and pharmacy claims data, from 2013 to 2016, from a healthcare insurer; the first two years were used to build the model that predicts costs in the third year. To prepare the data for modeling and prediction, the cost, visit and medical time series were extracted as fine-grain features (i.e., each time series was segmented into a sequence of consecutive windows, and each window was represented by statistics such as its sum). Temporal patterns of the time series were then extracted with a novel set of spike detection features (i.e., features capturing the fluctuation of data points) and added to the fine-grain features. Gradient Boosting was applied to the final set of extracted features. Moreover, the contribution of each type of data (i.e., cost, visit and medical) was assessed. We benchmarked the proposed predictors against extant methods, including those that use coarse-grain features, which represent each entire time series with statistics such as its sum and the most recent portion of its values. All prediction performance was measured in terms of Mean Absolute Percentage Error (MAPE).
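The difference between coarse-grain, fine-grain and spike detection features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the window length, the spike threshold and the specific spike descriptors (counts of large jumps and the largest jump) are all assumptions chosen for illustration, and the monthly cost values are made up.

```python
def fine_grain_features(series, window):
    """Split the series into consecutive non-overlapping windows
    and represent each window by its sum."""
    return [sum(series[i:i + window]) for i in range(0, len(series), window)]


def coarse_grain_features(series, recent):
    """Summarize the entire series: its total and the sum of the
    most recent `recent` values."""
    return [sum(series), sum(series[-recent:])]


def spike_features(series, threshold):
    """Hypothetical spike descriptors: count point-to-point changes
    larger than `threshold` in either direction (assumed definition)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return {
        "n_up_spikes": sum(1 for d in diffs if d > threshold),
        "n_down_spikes": sum(1 for d in diffs if d < -threshold),
        "max_jump": max(diffs, default=0),
    }


# Made-up monthly costs for one member over a single year.
monthly_cost = [100, 120, 90, 900, 110, 95, 105, 100, 400, 98, 102, 99]

quarterly = fine_grain_features(monthly_cost, window=3)
coarse = coarse_grain_features(monthly_cost, recent=3)
spikes = spike_features(monthly_cost, threshold=200)

# One possible final feature vector: fine-grain sums plus spike descriptors.
features = quarterly + [spikes["n_up_spikes"], spikes["n_down_spikes"], spikes["max_jump"]]
```

In this toy series the coarse-grain view hides the two cost spikes inside a single total, while the windowed sums and spike counts keep them visible. A feature vector of this kind would then be passed to a gradient boosting regressor.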
Results: Gradient Boosting applied to fine-grain predictors outperformed coarse-grain predictors, with a MAPE of 3.02 versus 8.14 (p < 0.01). Enhancing the fine-grain features with the temporal pattern extraction features (i.e., spike detection features) further improved the MAPE to 2.04 (p < 0.01). Removing cost, visit and medical status data resulted in MAPEs of 10.24, 2.22 and 2.07, respectively (p < 0.01 for the first two comparisons and p = 0.63 for the third).
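For reference, MAPE can be computed as in the following minimal Python sketch. The abstract does not state whether its MAPE values are percentages or fractions, nor how members with zero actual cost are handled; reporting in percent and skipping zero actuals are assumptions made here, and the example numbers are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Pairs whose actual value is zero are skipped (an assumption,
    since the percentage error is undefined for them)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Made-up actual and predicted annual costs for three members.
actual_cost = [100, 200, 400]
predicted_cost = [110, 180, 400]
score = mape(actual_cost, predicted_cost)
```

Lower values indicate better predictions, which is the direction of every comparison reported above.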
Conclusions: Leveraging fine-grain temporal patterns for healthcare cost prediction significantly improved prediction performance. Enhancing fine-grain features with extracted temporal cost and visit patterns significantly improved performance further. However, medical features did not have a significant effect on prediction performance. Gradient Boosting outperformed all other prediction models.
DOI: http://dx.doi.org/10.1016/j.jbi.2019.103113