This paper proposes a new M2ACD (Multi-Actor-Critic Deep Deterministic Policy Gradient) algorithm for trajectory planning of robotic manipulators in complex environments. First, the paper presents a general inverse kinematics algorithm that recasts the inverse kinematics problem as a general Newton-MP iterative method. The M2ACD algorithm is then built on multiple actors and critics. The dual-actor network reduces the overestimation of action values, minimizes the correlation between the actor and value networks, and mitigates instability in the actor's selection process caused by excessively high Q-values. The dual-critic network reduces the estimation bias of Q-values, making action selection more reliable and Q-value estimation more stable. Second, a TSR (two-stage reward) strategy is designed for the robotic manipulator and divided into an approach phase and a close phase. Rewards in the approach phase focus on safely and efficiently approaching the target, and rewards in the close phase involve final adjustments before contact is made with the target. Third, to solve the position-hopping jitter problem in traditional reinforcement learning trajectory planning, a NURBS (Non-Uniform Rational B-Splines) curve is used to smooth the hopping trajectory generated by M2ACD. Finally, the correctness of M2ACD and the kinematics algorithm is verified experimentally. The M2ACD algorithm demonstrated superior curve smoothness, convergence stability, and convergence speed compared with the TD3, DARC, and DDPG algorithms. The M2ACD algorithm can be effectively applied to trajectory planning for collaborative robots, establishing a foundation for subsequent research.
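
The idea of solving inverse kinematics by Newton-type iteration can be illustrated with a minimal sketch. The abstract does not spell out the "Newton-MP" method, so the code below is not the paper's algorithm: it assumes a plain Newton-Raphson update on a hypothetical planar two-link arm with unit link lengths, and the names `forward`, `jacobian`, and `solve_ik` are illustrative only.

```python
import math


def forward(q, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar two-link arm (hypothetical example arm)."""
    a, b = q
    return (l1 * math.cos(a) + l2 * math.cos(a + b),
            l1 * math.sin(a) + l2 * math.sin(a + b))


def jacobian(q, l1=1.0, l2=1.0):
    """2x2 Jacobian of forward() with respect to the two joint angles."""
    a, b = q
    s1, c1 = math.sin(a), math.cos(a)
    s12, c12 = math.sin(a + b), math.cos(a + b)
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))


def solve_ik(target, q0=(0.0, math.pi / 2), tol=1e-9, max_iter=100):
    """Newton-Raphson iteration: q <- q + J(q)^-1 * (target - forward(q)).

    Assumption: the Jacobian is square and nonsingular along the way, so its
    ordinary inverse can be used. The initial guess q0 must be reasonably
    close to a solution, since the step is undamped.
    """
    q = list(q0)
    for _ in range(max_iter):
        x, y = forward(q)
        ex, ey = target[0] - x, target[1] - y
        if math.hypot(ex, ey) < tol:
            break
        (j11, j12), (j21, j22) = jacobian(q)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-12:
            raise ValueError("singular Jacobian: arm is fully stretched or folded")
        # Solve J * dq = e with the closed-form 2x2 inverse.
        q[0] += (j22 * ex - j12 * ey) / det
        q[1] += (-j21 * ex + j11 * ey) / det
    return tuple(q)


if __name__ == "__main__":
    joints = solve_ik((1.2, 0.8))
    print(joints, forward(joints))
```

For a redundant manipulator the Jacobian is not square, and a pseudo-inverse or damped least-squares step would replace the closed-form inverse used here.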

Source
http://dx.doi.org/10.1038/s41598-025-93175-2

Publication Analysis

Top Keywords

trajectory planning: 16
m2acd algorithm: 12
reinforcement learning: 8
learning trajectory: 8
planning robotic: 8
robotic manipulator: 8
inverse kinematics: 8
kinematics algorithm: 8
network reduces: 8
algorithm: 6

Similar Publications

Spinal intervention can benefit from advancements in robotic systems, particularly in the field of Human-Robot Interaction (HRI). Despite the promising potential of these technologies, their integration into spine surgeries remains relatively limited and is confined mainly to selected procedures. Meanwhile, complex and time-consuming procedures, such as osteotomy, continue to be performed manually, significantly increasing surgeon workload and stress.

Background And Objectives: For 50 years, frame-based stereotactic brain biopsy has been the "gold standard" for its high diagnostic yield and safety, especially for complex or deep-seated lesions. Over the past decade, frameless and robotic alternatives have emerged. This report evaluates and compares the outcomes, diagnostic yield, and safety of these methods.

Obesity is associated with multiple noncommunicable diseases and has increased rapidly worldwide. Population obesity in China grew fourfold between 1993 and 2015, increasing most rapidly among children and adolescents. Cost-effective policies and programs delivered over time and at scale are required to change this trajectory, yet the application of methodologies to identify such interventions has been sparse.
