Learning control policies has become an appealing alternative to deriving control laws from classical control theory. Model-based approaches have demonstrated outstanding data efficiency, especially when combined with probabilistic models to eliminate model bias. However, a major difficulty for these methods is that multi-step-ahead predictions typically become intractable for longer planning horizons and can only be approximated poorly. In this paper, we propose the use of numerical quadrature to overcome this drawback and provide significantly more accurate multi-step-ahead predictions. As a result, our approach increases data efficiency and enhances the quality of learned policies. Moreover, policy learning is not restricted to optimizing locally around one trajectory, as numerical quadrature provides a principled way to extend optimization to all trajectories starting in a specified starting state region. Thus, manual effort, such as choosing informative starting points for simultaneous policy optimization, is significantly decreased. In addition, learning is highly robust to the choice of initial policy and, thus, interaction time with the system is minimized. Empirical evaluations on simulated benchmark problems show the efficiency of the proposed approach and support our theoretical results.
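As a rough illustration of how numerical quadrature can propagate uncertainty through a dynamics model, the sketch below uses Gauss-Hermite quadrature to estimate the mean and variance of a one-step prediction for a Gaussian-distributed scalar state. It is not the method of the paper: the function `toy_dynamics`, the helper `gauss_hermite_expectation`, and all numerical values are assumptions made up for this example.

```python
import numpy as np


def gauss_hermite_expectation(f, mean, std, order=20):
    """Approximate E[f(x)] for x ~ N(mean, std**2) with Gauss-Hermite quadrature."""
    # Nodes and weights for integrals of the form  int exp(-t**2) g(t) dt.
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    # Change of variables x = mean + sqrt(2) * std * t maps the Gaussian
    # density onto the exp(-t**2) weight function, up to a factor 1/sqrt(pi).
    x = mean + np.sqrt(2.0) * std * nodes
    return np.sum(weights * f(x)) / np.sqrt(np.pi)


def toy_dynamics(x):
    """Hypothetical nonlinear one-step transition, chosen only for illustration."""
    return x + 0.1 * np.sin(x)


# Assumed Gaussian belief over the current state.
mean, std = 0.5, 0.3

# First and second moments of the next state under that belief.
pred_mean = gauss_hermite_expectation(toy_dynamics, mean, std)
pred_second = gauss_hermite_expectation(lambda x: toy_dynamics(x) ** 2, mean, std)
pred_var = pred_second - pred_mean ** 2

print("predicted mean:", pred_mean)
print("predicted variance:", pred_var)
```

Repeating this step with a re-fitted Gaussian at every time step would be a moment-matching approximation, which is only one of several ways a quadrature rule could be used for multi-step prediction.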
DOI: http://dx.doi.org/10.1109/TPAMI.2018.2879335
Entropy (Basel)
January 2025
Quantum ICT Research Institute, Tamagawa University, Tokyo 194-8610, Japan.
The limitations of cloning and discriminating quantum states are related to the non-orthogonality of the states. Hence, understanding the collective features of quantum states is essential for the future development of quantum communications technology. This paper investigates the non-orthogonality of different coherent-state signal constellations used in quantum communications, namely phase-shift keying (PSK), quadrature-amplitude modulation (QAM), and a newly defined signal named the sunflower-like (SUN) coherent-state signal.
MethodsX
December 2024
Institute of Computer Science, University of Silesia, Bedzinska 39, Sosnowiec, 41-200, Poland.
This study introduces a family of root-solvers for systems of nonlinear equations, leveraging the Daftardar-Gejji and Jafari Decomposition Technique coupled with the midpoint quadrature rule. Although these root solvers have already been applied to single-variable equations, their extension to systems of nonlinear equations marks a pioneering advancement. Through meticulous derivation, this work not only expands the utility of these root solvers but also presents a comprehensive analysis of their stability and semilocal convergence, two areas of study missing in the existing literature.
Sensors (Basel)
December 2024
Department of Civil Engineering, K. N. Toosi University of Technology, Tehran 19967-15433, Iran.
Forward modeling the magnetic effects of an inferred source is the basis of magnetic anomaly inversion for estimating subsurface magnetization parameters. This study uses numerical least-squares Gauss-Legendre quadrature (GLQ) integration to evaluate the magnetic potential, anomaly, and gradient components of a cylindrical prism element. Relative to previous studies, it quantifies for the first time the magnetic gradient components, enabling their applications in the interpretation of cylindrical bodies.
Entropy (Basel)
November 2024
Department of Electrical Engineering & Computer Science, University of Kansas, Lawrence, KS 66045, USA.
Low-density parity-check (LDPC) codes form part of the IRIG-106 standard and have been successfully deployed for the Telemetry Group version of shaped-offset quadrature phase shift keying (SOQPSK-TG) modulation. Recently, LDPC code solutions have been proposed and optimized for continuous phase modulations (CPMs), including pulse code modulation/frequency modulation (PCM/FM) and the multi-h CPM developed by the Advanced-Range TeleMetry program (ARTM CPM), the latter of which was shown to perform around one dB from channel capacity. In this paper, we consider the effect of the random puncturing and shortening of these LDPC codes to further improve spectrum efficiency.
Phys Rev E
November 2024
Graduate Aerospace Laboratories and Department of Applied Physics, California Institute of Technology, Pasadena, California 91125, USA.
Moment methods are often used to solve transport problems involving the Boltzmann-BGK equation. Because the moment equations are underdetermined, these methods require an additional "closure equation" that relates higher-order to lower-order moments. Here, we examine the closure equation and higher-order moment relations implicit in lattice Boltzmann methods (LBMs) that use Gauss-Hermite quadrature for their discrete velocity sets.