By incorporating sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and achieve better generalization. However, previous kernel-based LSTD algorithms do not consider regularization, and their sparsification processes are batch or offline, which hinders their application to online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning; (ii) L2 and L1 regularization, which can avoid overfitting and eliminate the influence of noise; (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity; (iv) a sliding-window approach, which avoids caching all history samples and reduces the computational cost; and (v) fixed-point subiteration and online pruning, which make L1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4942627 | PMC |
| http://dx.doi.org/10.1155/2016/2305854 | DOI Listing |
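To make points (ii) and (iii) of the abstract concrete, the sketch below shows a generic recursive LSTD(0) update with L2 regularization, in which a Sherman-Morrison rank-one update replaces explicit matrix inversion. It is a minimal sketch with an assumed fixed feature map `phi`, not the paper's kernel-based algorithm, which additionally builds its feature dictionary online and adds the sliding-window and L1 steps.

```python
import numpy as np

def recursive_lstd(transitions, n_features, gamma=0.95, beta=1.0):
    """Minimal recursive LSTD(0) sketch with L2 regularization.

    `transitions` yields (phi, phi_next, reward) tuples of feature
    vectors; initializing C = (1/beta) * I is equivalent to adding
    beta * I to the LSTD matrix A, i.e. an L2 penalty on the weights.
    """
    C = np.eye(n_features) / beta   # running estimate of (A + beta*I)^{-1}
    b = np.zeros(n_features)        # running reward-weighted feature sum
    for phi, phi_next, r in transitions:
        d = phi - gamma * phi_next  # temporal-difference feature vector
        Cphi = C @ phi
        # Sherman-Morrison rank-one update: no matrix inversion needed
        C -= np.outer(Cphi, d @ C) / (1.0 + d @ Cphi)
        b += r * phi
    return C @ b                    # value-function weights w = C b
```

With the `phi` vectors drawn from an online-sparsified kernel dictionary, the same update yields a kernel recursive LSTD variant of the kind the abstract describes.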
Similar Publications

IEEE Trans Neural Netw Learn Syst, April 2024
Deep reinforcement learning (DRL) benefits from the representation power of deep neural networks (NNs) to approximate the value function and policy during learning. Batch reinforcement learning (BRL) benefits from stable training and data efficiency with a fixed representation and enjoys solid theoretical analysis. This work proposes least-squares deep policy gradient (LSDPG), a hybrid approach that combines least-squares reinforcement learning (RL) with online DRL to achieve the best of both worlds.
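The abstract does not spell out LSDPG's update equations. As a hedged sketch of the general idea behind such hybrids, one can fit a linear value head on top of a deep network's penultimate-layer features in closed form (LSTD-style) rather than by gradient descent; the feature matrices below are assumed to come from the network:

```python
import numpy as np

def ls_value_head(features, features_next, rewards, gamma=0.99, reg=1e-3):
    """Hypothetical sketch: given (n_samples, n_features) activations from
    a deep network's penultimate layer, solve for the linear value-head
    weights by regularized least squares instead of SGD."""
    d = features - gamma * features_next        # TD feature differences
    A = features.T @ d + reg * np.eye(features.shape[1])
    b = features.T @ rewards
    return np.linalg.solve(A, b)                # closed-form value weights
```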
Entropy (Basel), September 2022
Department of Engineering, King's College London, London WC2R 2LS, UK.
An efficient data-driven prediction strategy for multi-antenna frequency-selective channels must operate based on a small number of pilot symbols. This paper proposes novel channel-prediction algorithms that address this goal by integrating transfer and meta-learning with a reduced-rank parametrization of the channel. The proposed methods optimize linear predictors by utilizing data from previous frames, which are generally characterized by distinct propagation characteristics, in order to enable fast training on the time slots of the current frame.
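The paper's exact transfer and meta-learning procedure and reduced-rank parametrization are not reproduced here; as a loose illustration of the kind of linear predictor involved, a ridge regression whose solution is shrunk towards weights fitted on earlier frames gives a minimal transfer-style baseline. The `w_prior` argument is an assumed input standing in for knowledge carried over from previous frames:

```python
import numpy as np

def fit_transfer_predictor(past, future, w_prior=None, reg=1e-2):
    """Hedged sketch of a transfer-style linear channel predictor.

    past:    (n_slots, n_taps) windows of previously received samples
    future:  (n_slots,) samples to be predicted one step ahead
    w_prior: weights fitted on earlier frames; the penalty
             reg * ||w - w_prior||^2 shrinks the new solution towards
             them (a minimal stand-in for transfer learning).
    """
    n = past.shape[1]
    w0 = np.zeros(n) if w_prior is None else w_prior
    A = past.T @ past + reg * np.eye(n)
    b = past.T @ future + reg * w0
    return np.linalg.solve(A, b)
```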
IEEE Trans Neural Netw Learn Syst, October 2023
This article presents a reinforcement learning (RL) algorithm for achieving model-free control of robotic applications. The RL functions are adapted with least-squares temporal difference (LSTD) learning algorithms to develop a model-free state-feedback controller, with a linear quadratic regulator (LQR) established as the baseline controller. The classical least-squares policy iteration technique is adapted to establish the boundary conditions for the complexities incurred by the learning algorithm.
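The article's controller design is not detailed in the abstract. The sketch below shows the standard LSTD-Q building block of least-squares policy iteration with quadratic features, a natural fit when the target is an LQR-style quadratic Q-function; all names here are illustrative, not the article's implementation:

```python
import numpy as np

def quad_features(s, a):
    """Quadratic features of the state-action pair, so the fitted
    Q-function has the LQR form Q(s, a) = z^T P z with z = [s; a]."""
    z = np.concatenate([s, a])
    return np.outer(z, z)[np.triu_indices(len(z))]

def lstdq(samples, policy, gamma=0.95, reg=1e-6):
    """Minimal LSTD-Q sketch for policy evaluation: `samples` holds
    (s, a, r, s_next) tuples and `policy` maps states to actions."""
    n = len(quad_features(samples[0][0], samples[0][1]))
    A, b = reg * np.eye(n), np.zeros(n)
    for s, a, r, s_next in samples:
        phi = quad_features(s, a)
        phi_next = quad_features(s_next, policy(s_next))
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)   # weights of the quadratic Q-function
```

Alternating this evaluation step with greedy policy improvement gives the classical least-squares policy iteration loop the abstract refers to.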
In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to ambiguities, texture and illumination variation, and self-occlusion in real applications of 3D pose estimation.
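The Graphical ConvLSTM itself is not specified in the abstract. As a loose, hypothetical sketch of graph-based feature boosting, each part's features can be refined with gated messages from its neighbours in the hand or body kinematic graph; the weights `W_msg` and `W_gate` are assumed to be learned elsewhere:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def boost_part_features(F, adj, W_msg, W_gate):
    """Highly simplified, hypothetical sketch of graph-based feature
    boosting, standing in for the gating inside a Graphical ConvLSTM.

    F:              (parts, dim) per-part feature vectors
    adj:            (parts, parts) kinematic-graph adjacency matrix
    W_msg, W_gate:  (dim, dim) learned weights (assumed given)
    """
    msgs = adj @ F @ W_msg            # aggregate neighbour information
    gate = sigmoid(F @ W_gate)        # per-feature gate from current state
    return F + gate * np.tanh(msgs)   # gated residual refinement
```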