The bias-variance trade-off is a central concept in supervised learning. In classical statistics, increasing the complexity of a model (e.g., number of parameters) reduces bias but also increases variance. Until recently, it was commonly believed that optimal performance is achieved at intermediate model complexities which strike a balance between bias and variance. Modern Deep Learning methods flout this dogma, achieving state-of-the-art performance using "over-parameterized models" where the number of fit parameters is large enough to perfectly fit the training data. As a result, understanding bias and variance in over-parameterized models has emerged as a fundamental problem in machine learning. Here, we use methods from statistical physics to derive analytic expressions for bias and variance in two minimal models of over-parameterization (linear regression and two-layer neural networks with nonlinear data distributions), allowing us to disentangle properties stemming from the model architecture and random sampling of data. In both models, increasing the number of fit parameters leads to a phase transition where the training error goes to zero and the test error diverges as a result of the variance (while the bias remains finite). Beyond this threshold, the test error of the two-layer neural network decreases due to a monotonic decrease in the bias and variance in contrast with the classical bias-variance trade-off. We also show that in contrast with classical intuition, over-parameterized models can overfit even in the absence of noise and exhibit bias even if the student and teacher models match. We synthesize these results to construct a holistic understanding of generalization error and the bias-variance trade-off in over-parameterized models and relate our results to random matrix theory.
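The divergence of the test error at the interpolation threshold can be illustrated numerically. The following is a minimal sketch, not the analytic calculation described in the abstract: it assumes a hypothetical linear teacher with Gaussian inputs and Gaussian label noise, fits it by minimum-norm least squares, and averages over random draws. The function name `average_errors`, the sample sizes, and the noise level are illustrative choices rather than values from the paper.

```python
import numpy as np

def average_errors(n_params, n_train=50, noise_std=0.5, n_trials=20, seed=0):
    """Average train/test MSE of minimum-norm least squares over random draws."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for _ in range(n_trials):
        # Assumed teacher: linear, with weights scaled so the signal variance is about 1.
        beta = rng.normal(size=n_params) / np.sqrt(n_params)
        X = rng.normal(size=(n_train, n_params))
        y = X @ beta + noise_std * rng.normal(size=n_train)
        # lstsq returns the minimum-norm solution when the system is underdetermined.
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
        train.append(np.mean((X @ w - y) ** 2))
        # For standard-normal test inputs, the expected test MSE has a closed form:
        # squared distance between fitted and true weights, plus the noise variance.
        test.append(np.sum((w - beta) ** 2) + noise_std ** 2)
    return float(np.mean(train)), float(np.mean(test))

n_train = 50
results = {p: average_errors(p, n_train=n_train) for p in (10, 25, 50, 100, 200)}
for p, (tr, te) in results.items():
    print(f"N_p={p:4d}  train MSE={tr:.3e}  test MSE={te:.3f}")
```

Under these assumptions, the training error should drop to numerical zero once the number of fit parameters reaches the number of training points (here 50), while the averaged test error should peak sharply near that threshold and be smaller on either side of it.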
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879296 | PMC |
| http://dx.doi.org/10.1103/physrevresearch.4.013201 | DOI Listing |
Sci Rep
December 2024
Department of Geosciences, Geotechnology, and Materials Engineering for Resources, Graduate School of International Resource Sciences, Akita University, Akita, Japan.
The present investigation employs relevance vector machine (RVM) and long short-term memory (LSTM) models to predict the time-dependent bearing capacity of concrete piles. Each RVM model (SRVM) is configured with one of the linear, polynomial, Gaussian, sigmoid, Laplacian, and exponential kernel functions. Each SRVM model is then optimized with both a genetic algorithm (GA_SRVM) and a particle swarm optimization algorithm (PSO_RVM).
Sci Rep
December 2024
Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands.
Medical datasets are vital for advancing Artificial Intelligence (AI) in healthcare. Yet biases in these datasets on which deep-learning models are trained can compromise reliability. This study investigates biases stemming from dataset-creation practices.
Clin Chim Acta
December 2024
Department of Laboratory Medicine, Peking Union Medical College Hospital, Peking Union Medical College & Chinese Academy of Medical Science, Beijing 100730, China; State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College & Chinese Academy of Medical Science, Beijing 100730, China.
Background: Serum protein electrophoresis (SPE) is essential for diagnosing monoclonal gammopathies and a variety of other diseases. Despite its importance, there is a scarcity of SPE parameter reference intervals (RIs) derived from large datasets. This study seeks to fill this gap by establishing sex-specific RIs using Hoffmann and refineR algorithms and assessing the feasibility of these methods.
Vet Med Sci
January 2025
Department of Veterinary Science, College of Agriculture and Environmental Sciences, Debre Tabor University, Debre Tabor, Ethiopia.
Background: Fasciolosis is a prevalent disease that significantly impairs the health and productivity of cattle and causes significant economic damage. Beyond the individually available studies with varying prevalence rates, there are no pooled national prevalence studies on bovine fasciolosis. Therefore, the current study aims to determine the pooled prevalence and economic significance of fasciolosis among cattle in Ethiopia.
J Diabetes Metab Disord
June 2025
Aragón Health Research Institute, University of Zaragoza Faculty of Medicine, Domingo Miral s/n, Zaragoza, 50009 Spain.
Purpose: We performed a systematic review and meta-analysis to examine the associations between telomere length and telomerase activity in subjects with and without metabolic syndrome (MetS).
Methods: The meta-analysis protocol was registered in the PROSPERO database. The PubMed, Embase, Cochrane Library, and LILACS databases were searched for studies reporting telomere length or telomerase activity in adult men and non-pregnant women with and without MetS.