Estimating a predictive model from a dataset ideally starts from an unbiased estimator. However, because an unbiased estimator is generally unknown, the bias-variance tradeoff arises. Rather than searching for an unbiased estimator, a convenient way to address this tradeoff is clustering. Within a cluster that is smaller than the whole sample, we expect a simple estimator to predict well while avoiding overfitting. In this paper, we propose a new method for finding the optimal cluster for prediction. Based on the previous literature, this cluster is expected to lie somewhere between the whole dataset and the typical cluster obtained by partitioning the data. To obtain a reliable cluster size, we use the bootstrap method. Through experiments with simulated and real-world data, we show that the new method can reduce prediction error. We believe the proposed method will be useful in the many applications that rely on a clustering algorithm for stable prediction performance.
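
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that candidate clusters are the k nearest neighbours of a query point, that the simple estimator is the cluster mean, and that each candidate size is scored by out-of-bag bootstrap error. The function name `bootstrap_cluster_size` and all of its parameters are hypothetical.

```python
import numpy as np

def bootstrap_cluster_size(X, y, x0, sizes, n_boot=200, seed=0):
    """Choose how many nearest neighbours of x0 to use for a local-mean prediction.

    Illustrative assumption: each candidate cluster is the set of the k points
    closest to x0, and it is scored by the out-of-bag squared error of the
    bootstrap-resampled cluster mean.
    """
    rng = np.random.default_rng(seed)
    # Indices of all points, sorted from nearest to farthest from x0.
    order = np.argsort(np.linalg.norm(X - x0, axis=1))
    scores = {}
    for k in sizes:
        yk = y[order[:k]]
        errs = []
        for _ in range(n_boot):
            idx = rng.integers(0, k, size=k)        # resample with replacement
            oob = np.setdiff1d(np.arange(k), idx)   # points left out of the resample
            if oob.size == 0:
                continue                            # nothing to evaluate on
            errs.append(np.mean((yk[oob] - yk[idx].mean()) ** 2))
        # A size with no usable resample is treated as infinitely bad.
        scores[k] = float(np.mean(errs)) if errs else float("inf")
    best = min(scores, key=scores.get)
    return best, scores
```

Under these assumptions, a small cluster is penalised by the variance of its mean, and a large cluster is penalised once it absorbs points that follow a different pattern, so the selected size can fall between a tight partition and the whole dataset.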

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6827892 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0223529 (PLOS)
