Predictive modeling is becoming an essential tool for clinical decision support, but health systems with smaller sample sizes may construct suboptimal or overly specific models. Models become over-specific when, besides true physiological effects, they also incorporate potentially volatile site-specific artifacts; these artifacts can change suddenly and can render the model unsafe. To obtain safer models, health systems with inadequate sample sizes may adopt one of the following options. First, they can use a generic model, such as one purchased from a vendor, but such a model is often not sufficiently specific to the patient population and is thus suboptimal. Second, they can participate in a research network; paradoxically, though, sites with smaller datasets contribute correspondingly less to the joint model, again rendering the final model suboptimal. Third, they can use transfer learning, starting from a model trained on a large dataset and updating this model to the local population; this strategy can also result in a model that is over-specific. In this paper we present the consensus modeling paradigm, which uses the help of a large site (the source) to reach a consensus model at the small site (the target). We evaluate the approach on predicting postoperative complications at two health systems with 9,044 and 38,045 patients (rare outcomes at about a 1% positive rate), and we conduct a simulation study to understand the performance of consensus modeling relative to the other three approaches as a function of the available training sample size at the target site. We found that consensus modeling exhibited the least over-specificity at either the source or target site and achieved the highest combined predictive performance.
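As one hypothetical illustration of how a target site might reach a consensus with a source model, the sketch below fits a logistic model on local data while shrinking its coefficients toward the source site's coefficients. The abstract does not specify the paper's objective function; the quadratic consensus penalty, the function name `fit_consensus_logistic`, and the parameter `lam` are all illustrative assumptions, not the authors' method.

```python
import numpy as np
from scipy.optimize import minimize

def fit_consensus_logistic(X, y, w_source, lam=1.0):
    """Hypothetical consensus fit (not the paper's exact objective):
    fit a logistic model at the target site whose coefficients are
    penalized toward the large-site (source) model's coefficients.

    X: (n, d) target-site features; y: (n,) binary labels in {0, 1};
    w_source: (d,) source-model coefficients; lam: consensus strength.
    """
    n, _ = X.shape

    def objective(w):
        z = X @ w
        # Numerically stable negative log-likelihood of logistic regression:
        # sum_i [ log(1 + exp(z_i)) - y_i * z_i ] / n
        nll = np.sum(np.logaddexp(0.0, z) - y * z) / n
        # Consensus penalty: discourage drifting from the source model,
        # damping volatile site-specific artifacts in the local fit.
        penalty = lam * np.sum((w - w_source) ** 2)
        return nll + penalty

    res = minimize(objective, x0=w_source.copy(), method="L-BFGS-B")
    return res.x
```

Under this assumed formulation, `lam` interpolates between the two failure modes the abstract describes: a large `lam` collapses the target model onto the generic source model, while `lam` near zero yields a purely local fit that can absorb site-specific artifacts.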
DOI: http://dx.doi.org/10.1016/j.artmed.2024.102899