In recent times, significant advancements have been made in delving into the optimization landscape of policy gradient methods for achieving optimal control in linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent since the underlying state of the system may not be fully observed in many practical settings. This article analyzes the optimization landscape inherent to policy gradient methods when applied to static output feedback (SOF) control in discrete-time LTI systems subject to quadratic cost. We begin by establishing crucial properties of the SOF cost, encompassing coercivity, L-smoothness, and M-Lipschitz continuous Hessian. Despite the absence of convexity, we leverage these properties to derive novel findings regarding convergence (and nearly dimension-free rate) to stationary points for three policy gradient methods, including the vanilla policy gradient method, the natural policy gradient method, and the Gauss-Newton method. Moreover, we provide proof that the vanilla policy gradient method exhibits linear convergence toward local minima when initialized near such minima. This article concludes by presenting numerical examples that validate our theoretical findings. These results not only characterize the performance of gradient descent for optimizing the SOF problem but also provide insights into the effectiveness of general policy gradient methods within the realm of reinforcement learning.
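To make the setting concrete, the following is a minimal sketch of vanilla policy gradient descent for SOF control, not the authors' implementation. It assumes the standard formulation x_{t+1} = A x_t + B u_t, y_t = C x_t, u_t = -K y_t with cost J(K) = tr(P_K X0), and uses the well-known LQR policy gradient expression carried through C by the chain rule. The system matrices, the step size, and the backtracking safeguard (which keeps every iterate stabilizing) are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W using the Kronecker (vectorized) form."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def is_stabilizing(K, A, B, C):
    """True if the closed loop A - B K C has spectral radius below 1."""
    return np.max(np.abs(np.linalg.eigvals(A - B @ K @ C))) < 1.0

def sof_cost_and_grad(K, A, B, C, Q, R, X0):
    """Cost J(K) = tr(P_K X0) and its gradient for u_t = -K y_t."""
    Acl = A - B @ K @ C
    # P_K solves P = Acl^T P Acl + Q + C^T K^T R K C
    P = dlyap(Acl.T, Q + C.T @ K.T @ R @ K @ C)
    # Sigma_K solves Sigma = Acl Sigma Acl^T + X0 (state correlation sum)
    Sigma = dlyap(Acl, X0)
    cost = np.trace(P @ X0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K @ C - B.T @ P @ A) @ Sigma @ C.T
    return cost, grad

def policy_gradient(K, A, B, C, Q, R, X0, step=1e-2, iters=200):
    """Gradient descent on K; the step is halved until the new gain is
    stabilizing and satisfies a sufficient-decrease (Armijo) condition."""
    cost, grad = sof_cost_and_grad(K, A, B, C, Q, R, X0)
    history = [cost]
    for _ in range(iters):
        s = step
        while s > 1e-12:
            K_new = K - s * grad
            if is_stabilizing(K_new, A, B, C):
                c_new, g_new = sof_cost_and_grad(K_new, A, B, C, Q, R, X0)
                if c_new <= cost - 1e-4 * s * np.sum(grad ** 2):
                    break
            s *= 0.5
        else:
            break  # no acceptable step found; stop early
        K, cost, grad = K_new, c_new, g_new
        history.append(cost)
    return K, history

# Hypothetical example: a stable plant, so K = 0 is a stabilizing start.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])  # only the first state is measured
Q, R, X0 = np.eye(2), np.eye(1), np.eye(2)
K0 = np.zeros((1, 1))

K_final, costs = policy_gradient(K0, A, B, C, Q, R, X0)
print("initial cost:", costs[0], "final cost:", costs[-1])
```

Because the SOF cost is nonconvex and finite only on the set of stabilizing gains, the sketch needs a stabilizing initial gain and makes no claim to reach a global minimum; it only illustrates descent toward a stationary point.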

Source
http://dx.doi.org/10.1109/TCYB.2023.3323316

Publication Analysis

Top Keywords

policy gradient: 32
gradient methods: 20
optimization landscape: 12
gradient method: 12
gradient: 9
policy: 8
landscape policy: 8
static output: 8
output feedback: 8
lti systems: 8

Similar Publications

The expansion of urban settlements over native environments may expose biodiversity to a host of emerging contaminants, with unintended ecological effects. This study evaluated patterns of contamination of streamwater by antidepressants in the Upper Tietê River Basin, a watershed of high social, economic and environmental relevance for comprising both the largest urban settlement in South America (the Metropolitan Region of São Paulo) and remnants of a globally important biodiversity hotspot (the Atlantic Rainforest). We sampled 53 third-order streams draining catchments regularly distributed across a gradient in urban cover.

As global fertilizer application rates increase, high-quality datasets are paramount for comprehensive analyses to support informed decision-making and policy formulation in crucial areas such as food security or climate change. This study aims to fill existing data gaps by employing two machine learning models, eXtreme Gradient Boosting and HistGradientBoosting algorithms to produce precise country-level predictions of nitrogen (N), phosphorus pentoxide (P2O5), and potassium oxide (K2O) application rates. Subsequently, we created a comprehensive dataset of 5-arcmin resolution maps depicting the application rates of each fertilizer for 13 major crop groups from 1961 to 2019.

Observation-based verification of regional/national methane (CH4) emission trends is crucial for transparent monitoring and mitigation strategy planning. Although surface observations track the global and sub-hemispheric emission trends well, their sparse spatial coverage limits our ability to assess regional trends. Dense satellite observations complement surface observations, offering a valuable means to validate emission trends, especially in regions where emissions changes are substantial but debated.

In the context of evolutionary time, cities are an extremely recent development. Although our understanding of how urbanization alters ecosystems is well-developed, empirical work examining the consequences of urbanization on adaptive evolution remains limited. To facilitate future work, we offer candidate genes for one of the most prominent urban carnivores across North America.

The socioeconomic burden of cervical cancer and its implications for strategies required to achieve the WHO elimination targets.

Expert Rev Pharmacoecon Outcomes Res

January 2025

Evaluation and Implementation Science Unit, Centre for Health Policy, Melbourne School of Population and Global Health, University of Melbourne, Victoria, Australia.

Introduction: Cervical cancer is almost entirely preventable by vaccination and screening. Population based vaccination and screening programs are effective and cost effective, but millions of people do not have access to these programs, causing immense suffering. The WHO Global Strategy for the elimination of cervical cancer as a public health problem calls for countries to meet ambitious vaccination, screening and treatment targets.
