Bias-variance decomposition of overparameterized regression with random linear features.

Phys Rev E

Department of Physics, Boston University, Boston, Massachusetts 02215, USA.

Published: August 2022

In classical statistics, the bias-variance trade-off describes how varying a model's complexity (e.g., number of fit parameters) affects its ability to make accurate predictions. According to this trade-off, optimal performance is achieved when a model is expressive enough to capture trends in the data, yet not so complex that it overfits idiosyncratic features of the training data. Recently, it has become clear that this classic understanding of the bias-variance trade-off must be fundamentally revisited in light of the incredible predictive performance of overparameterized models, i.e., models that avoid overfitting even when the number of fit parameters is large enough to perfectly fit the training data. Here, we present results for one of the simplest examples of an overparameterized model: regression with random linear features (i.e., a two-layer neural network with a linear activation function). Using the zero-temperature cavity method, we derive analytic expressions for the training error, test error, bias, and variance. We show that the linear random features model exhibits three phase transitions: two different transitions to an interpolation regime where the training error is zero, along with an additional transition between regimes with large bias and minimal bias. Using random matrix theory, we show how each transition arises due to small nonzero eigenvalues in the Hessian matrix. Finally, we compare and contrast the phase diagram of the random linear features model to the random nonlinear features model and ordinary regression, highlighting the additional phase transitions that result from the use of linear basis functions.
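The model described in the abstract can be illustrated numerically. The sketch below is not the authors' cavity-method calculation; it is a small simulation under stated assumptions: Gaussian inputs, a linear teacher with Gaussian label noise, a fixed Gaussian random projection as the hidden layer, and a minimum-norm least-squares fit standing in for the ridge-less limit. The function name, argument names, and scalings are hypothetical choices made for illustration.

```python
import numpy as np


def fit_random_linear_features(n_train, n_features, n_params,
                               noise_std=0.1, n_test=1000, seed=0):
    """Fit regression on random linear features; return (train_mse, test_mse).

    Assumed setup (illustrative, not taken from the paper):
      inputs   x ~ N(0, I) with n_features components
      labels   y = x . beta + Gaussian noise
      hidden   z = x @ W, with W a fixed random matrix (linear activation)
      fit      minimum-norm least squares on z
    """
    rng = np.random.default_rng(seed)

    # Teacher weights and the fixed random first layer.
    beta = rng.normal(0.0, 1.0 / np.sqrt(n_features), size=n_features)
    W = rng.normal(0.0, 1.0 / np.sqrt(n_features), size=(n_features, n_params))

    # Training and test sets drawn from the same distribution.
    X_train = rng.normal(size=(n_train, n_features))
    y_train = X_train @ beta + noise_std * rng.normal(size=n_train)
    X_test = rng.normal(size=(n_test, n_features))
    y_test = X_test @ beta + noise_std * rng.normal(size=n_test)

    # Hidden features; only the second-layer weights w are fit.
    Z_train = X_train @ W
    Z_test = X_test @ W

    # lstsq returns the minimum-norm solution when the system is underdetermined.
    w, *_ = np.linalg.lstsq(Z_train, y_train, rcond=None)

    train_mse = float(np.mean((Z_train @ w - y_train) ** 2))
    test_mse = float(np.mean((Z_test @ w - y_test) ** 2))
    return train_mse, test_mse


if __name__ == "__main__":
    # Sweep the number of fit parameters at fixed data size and input dimension.
    for n_params in (10, 25, 50, 100, 200):
        train_mse, test_mse = fit_random_linear_features(50, 100, n_params)
        print(f"n_params={n_params:4d}  train={train_mse:.3e}  test={test_mse:.3e}")
```

Because the hidden features are a linear map of the inputs, the feature matrix has rank at most the smallest of the number of training points, input features, and fit parameters. In this sketch the training error can therefore reach zero only when both the number of input features and the number of fit parameters are at least the number of training points, which is one way to see why a linear hidden layer adds structure beyond ordinary regression.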


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9906786
DOI: http://dx.doi.org/10.1103/PhysRevE.106.025304

