Proc Natl Acad Sci U S A
July 2023
How do statistical dependencies in measurement noise influence high-dimensional inference? To answer this, we study the paradigmatic spiked matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise. We go beyond the usual independence assumption on the noise entries, by drawing the noise from a low-order polynomial orthogonal matrix ensemble. The resulting noise correlations make the setting relevant for applications but analytically challenging.
We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting, in the challenging regime where the matrices to infer have a rank growing linearly with the system size. This is in contrast with most existing literature concerned with the low-rank (i.e.
Proc Natl Acad Sci U S A
March 2019
Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors.
Background & Aims: In hereditary nonpolyposis colorectal cancer (HNPCC) syndrome, flat and small adenomas are particularly prone to malignant transformation but might be missed by standard colonoscopy. We prospectively studied the diagnostic yield of high-resolution colonoscopy coupled with chromoendoscopy for preneoplastic and neoplastic colorectal lesions in patients with HNPCC syndrome.
Methods: Thirty-six consecutive asymptomatic patients (mean age, 42 years) belonging to HNPCC families and receiving genetic counseling were enrolled in this prospective study.