Estimating a predictive model from a dataset is best initiated with an unbiased estimator. However, since the unbiased estimator is generally unknown, the bias-variance tradeoff arises. Aside from searching for an unbiased estimator, a convenient approach to the bias-variance tradeoff may be to use clustering. Within a cluster that is smaller than the whole sample, we would expect a simple form of estimator to suffice for prediction and to avoid overfitting. In this paper, we propose a new method to find the optimal cluster for prediction. Based on the previous literature, this cluster is considered to lie somewhere between the whole dataset and the typical cluster determined by partitioning the data. To obtain a reliable cluster size, we use the bootstrap method. Additionally, through experiments with simulated and real-world data, we show that the prediction error can be reduced by applying this new method. We believe that our proposed method will be useful in the many applications that rely on a clustering algorithm for stable prediction performance.
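The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea it describes, not the authors' method. It assumes a local-mean predictor, Euclidean distance to a cluster center, and a bootstrap estimate of squared prediction error on the held-out points of the smallest candidate cluster. The function name `choose_cluster_size` and all of its parameters are hypothetical.

```python
import numpy as np

def choose_cluster_size(X, y, center, sizes, n_boot=200, seed=0):
    """Illustrative only: pick a cluster size between a small 'typical'
    cluster and the whole dataset by bootstrap prediction error.

    Assumptions (not from the source): local-mean predictor, Euclidean
    distance, squared-error loss, X is a 2-D array of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    # Rank all points by distance to the cluster center, nearest first.
    order = np.argsort(np.linalg.norm(X - center, axis=1))
    # The smallest candidate plays the role of the partition-based cluster.
    core = order[:min(sizes)]
    errors = {}
    for k in sizes:
        members = order[:k]  # enlarged cluster of size k
        errs = []
        for _ in range(n_boot):
            # Resample the enlarged cluster with replacement.
            boot = rng.choice(members, size=k, replace=True)
            # Core points that were not drawn serve as the test set.
            held_out = np.setdiff1d(core, boot)
            if held_out.size == 0:
                continue
            pred = y[boot].mean()  # simple estimator fitted on the resample
            errs.append(np.mean((y[held_out] - pred) ** 2))
        errors[k] = float(np.mean(errs)) if errs else float("inf")
    best = min(errors, key=errors.get)
    return best, errors
```

Calling `choose_cluster_size(X, y, center, sizes=[10, 50, len(X)])` returns the candidate size with the lowest estimated error together with the error for every candidate. Larger clusters lower the variance of the mean but can add bias when distant points behave differently, which is the tradeoff the abstract refers to.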
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6827892 | PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0223529 | PLOS
Front Neurosci
January 2025
Kontigo Care AB, Uppsala, Sweden.
Background: It is known that illicit and prescribed drugs affect pupil size, eye movement and function. Still, comprehensive quantitative evaluations under known ambient light conditions are lacking when smartphones are used for monitoring.
Methods: In this clinical study (NCT05731999), four medicinal products with addiction risks were administered to 48 subjects (18-70 years old, all with informed consent, 12 subjects per drug).
Ind Eng Chem Res
January 2025
Thomas Young Centre and Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
Efficiently obtaining atomic-scale thermodynamic parameters characterizing crystallization from solution is key to developing the modeling strategies needed in the quest for digital design strategies for industrial crystallization processes. Based on the thermodynamics of crystal nucleation in confined solutions, we develop a simulation framework to efficiently estimate the solubility and surface tension of organic crystals in solution from a few unbiased molecular dynamics simulations at a reference temperature. We then show that such a result can be extended with minimal computational overhead to capture the solubility curve.
Glob Ecol Biogeogr
October 2024
Department of Biology, University of Fribourg, Switzerland and Swiss Institute of Bioinformatics, Fribourg, Switzerland.
Aim: Species age, the elapsed time since origination, can give insight into how species longevity might influence eco-evolutionary dynamics, which has been hypothesized to influence extinction risk. Traditionally, species' ages have been estimated from fossil records. However, numerous studies have recently used the branch lengths of time-calibrated phylogenies as estimates of the ages of extant species.
BMC Cardiovasc Disord
January 2025
Department of Computer Science and Engineering, SRM Institute of Science and Technology, Vadapalani Campus, Chennai, India.
Cardiovascular disease (CVD) is one of the leading causes of mortality, and it is estimated that 1 in 4 deaths is due to it. Disease prevalence is rising because systems and models for predicting CVD at an early stage are inadequate. Diabetic retinopathy (DR) is an eye disease that is associated with an increased risk of CVD events of all causes.
BMC Plant Biol
January 2025
Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden.
Root system architecture (RSA) plays an important role in plant adaptation to drought stress. However, the genetic basis of RSA in sorghum has not been adequately elucidated. This study aimed to investigate the genetic bases of RSA traits through genome-wide association studies (GWAS) and determine genomic prediction (GP) accuracy in sorghum landraces at the seedling stage.