IEEE Trans Neural Netw Learn Syst
December 2023
Undiscounted return is an important setup in reinforcement learning (RL) and characterizes many real-world problems. However, optimizing an undiscounted return often causes training instability. The causes of this instability have not been analyzed in depth by existing studies.