Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high-dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. To stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space, in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high-dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it (1) learns rapidly with second-order learning methods based on incremental training, (2) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, (3) adjusts its weighting kernels based only on local information in order to minimize the danger of negative interference of incremental learning, (4) has a computational complexity that is linear in the number of inputs, and (5) can deal with a large number of possibly redundant inputs, as shown in various empirical evaluations with data sets of up to 90 dimensions. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
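The idea of fitting one local linear model through a few univariate regressions along selected projection directions can be illustrated with a small batch sketch. This is not the LWPR algorithm itself: it is not incremental, it uses a fixed isotropic Gaussian kernel instead of a learned distance metric, and it fits a single local model. The function names (`fit_local_pls`, `predict_local`) and the parameters `bandwidth` and `n_proj` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_local_pls(X, y, center, bandwidth, n_proj):
    """Fit one locally weighted linear model with PLS-style univariate regressions.

    Assumption: a fixed isotropic Gaussian kernel around `center`.
    """
    # Kernel weight of every training point relative to the model center.
    w = np.exp(-0.5 * np.sum((X - center) ** 2, axis=1) / bandwidth ** 2)
    w_sum = w.sum()

    # Weighted means; the local model regresses on mean-centered data.
    x_mean = (w[:, None] * X).sum(axis=0) / w_sum
    y_mean = np.dot(w, y) / w_sum
    Z = X - x_mean
    res = y - y_mean

    dirs, loads, betas = [], [], []
    for _ in range(n_proj):
        # Projection direction: weighted correlation of inputs with the residual.
        u = Z.T @ (w * res)
        norm = np.linalg.norm(u)
        if norm < 1e-12:
            break  # nothing left to explain
        u = u / norm

        s = Z @ u  # inputs projected onto the direction
        ss = np.dot(w * s, s)
        if ss < 1e-12:
            break

        beta = np.dot(w * s, res) / ss  # univariate weighted regression
        p = Z.T @ (w * s) / ss          # loading used to deflate the inputs

        res = res - beta * s            # remove the explained part of the target
        Z = Z - np.outer(s, p)          # remove the used direction from the inputs

        dirs.append(u)
        loads.append(p)
        betas.append(beta)

    return {"x_mean": x_mean, "y_mean": y_mean,
            "dirs": dirs, "loads": loads, "betas": betas}


def predict_local(model, x):
    """Predict at query `x` by replaying the projections and deflations."""
    z = np.asarray(x, dtype=float) - model["x_mean"]
    y_hat = model["y_mean"]
    for u, p, beta in zip(model["dirs"], model["loads"], model["betas"]):
        s = np.dot(z, u)
        y_hat += beta * s
        z = z - s * p
    return float(y_hat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)  # only the first input matters
    model = fit_local_pls(X, y, center=np.zeros(3), bandwidth=0.5, n_proj=1)
    print(predict_local(model, np.array([0.1, 0.0, 0.0])))
```

Each projection costs a number of operations linear in the input dimension per data point, which is the source of the efficiency the abstract describes; a full multivariate least-squares fit would instead require inverting a covariance matrix.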
DOI: http://dx.doi.org/10.1162/089976605774320557
Microbial communities play a central role in transforming environments across Earth, driving both physical and chemical changes. By harnessing these capabilities, synthetic microbial communities, assembled from the bottom up, offer valuable insights into the mechanisms that govern community functions. These communities can also be tailored to produce desired outcomes, such as the synthesis of health-related metabolites or nitrogen fixation to improve plant productivity.
J Am Med Inform Assoc
January 2025
Kennewick, WA 99338, United States.
Objective: This study evaluates the utility of word embeddings, generated by large language models (LLMs), for medical diagnosis by comparing the semantic proximity of symptoms to their eponymic disease embedding ("eponymic condition") and the mean of all symptom embeddings associated with a disease ("ensemble mean").
Materials and Methods: Symptom data for 5 diagnostically challenging pediatric diseases (CHARGE syndrome, Cowden disease, POEMS syndrome, rheumatic fever, and tuberous sclerosis) were collected from PubMed. Using the Ada-002 embedding model, disease names and symptoms were translated into vector representations in a high-dimensional space.
Nature
January 2025
Department of Physics, The Hong Kong University of Science and Technology, Kowloon, Hong Kong, China.
The concept of non-Hermiticity has expanded the understanding of band topology, leading to the emergence of counter-intuitive phenomena. An example is the non-Hermitian skin effect (NHSE), which involves the concentration of eigenstates at the boundary. However, despite the potential insights that can be gained from high-dimensional non-Hermitian quantum systems in areas such as curved space, high-order topological phases and black holes, the realization of this effect in high dimensions remains unexplored.
PLoS One
January 2025
Shanghai Jiao Tong University, Shanghai, China.
Virtual machine logs are generated in large quantities and may contain abnormal entries that indicate security risks or system failures of the virtual machine platform. Therefore, using unsupervised anomaly detection methods to identify abnormal logs is a meaningful task.
PLoS Comput Biol
December 2024
Communication Science Laboratories, NTT Corporation, Kyoto, Japan.
Spike train modeling across large neural populations is a powerful tool for understanding how neurons code information in a coordinated manner. Recent studies have employed marked point processes in neural population modeling. The marked point process is a stochastic process that generates a sequence of events with marks.