Human-in-The-Loop Sim-to-Real Transfer Policy for Robotic Assembly via Reinforcement Learning

REN, Sirui, ZENG, Chao, LI, Zhiyi, YANG, Chenguang and WANG, Ning (2026). Human-in-The-Loop Sim-to-Real Transfer Policy for Robotic Assembly via Reinforcement Learning. In: 2025 8th International Conference on Robotics, Control and Automation Engineering (RCAE 2025). IEEE, 121-126. [Book Section]

Documents
Human-in-The-Loop Sim-to-Real Transfer Policy for Robotic Assembly via Reinforcement Learning.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
In complex robotic tasks, reinforcement learning (RL) algorithms have garnered significant attention for their ability to dynamically adapt to environmental changes and optimize control policies. However, challenges such as sparse rewards, poor generalization capability, low sample efficiency, and the sim-to-real transfer gap continue to hinder the widespread industrial application of RL. To address these challenges, we propose a novel reinforcement learning framework that integrates Deep Deterministic Policy Gradient (DDPG), Hindsight Experience Replay (HER), and Behavior Cloning (BC) to efficiently solve robotic assembly tasks in sparse reward environments. HER addresses sparse rewards by relabeling failed experiences as successes, enabling broader exploration of target states and faster policy learning. Combined with BC, which uses human demonstrations to reduce exploration needs, the proposed approach effectively enhances learning efficiency and generalization in complex robotic tasks. To validate the proposed algorithm, we implemented a benchmark robotic assembly environment in the MuJoCo simulator. Experimental results show that the proposed framework significantly outperforms baseline methods in key metrics, including training speed, task success rate, and assembly efficiency. Furthermore, we developed and validated an online human-in-the-loop correction-based sim-to-real transfer strategy. By leveraging a small amount of human correction data, this strategy effectively bridges the sim-to-real gap, enabling the model to exhibit robust performance and strong generalization in real-world robotic assembly tasks.
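The hindsight relabeling idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a simple transition tuple layout, a "future"-style goal-sampling strategy, and a sparse reward of 0 for success and -1 otherwise, all of which are illustrative choices:

```python
import random

def her_relabel(episode, k=4):
    """Relabel an episode's transitions with hindsight goals (HER).

    episode: list of (state, action, achieved_goal, desired_goal, reward)
    For each transition, k extra copies are stored whose desired goal is
    replaced by a goal actually achieved later in the same episode, so a
    "failed" trajectory yields successful training signal.
    """
    relabeled = []
    for t, (s, a, ag, dg, r) in enumerate(episode):
        relabeled.append((s, a, ag, dg, r))  # keep the original transition
        future = episode[t:]                 # goals achieved from step t onward
        for _ in range(k):
            _, _, future_ag, _, _ = random.choice(future)
            # sparse reward: success iff the achieved goal matches the new goal
            new_r = 0.0 if ag == future_ag else -1.0
            relabeled.append((s, a, ag, future_ag, new_r))
    return relabeled
```

Because a transition's own achieved goal is always among its future goals, every episode produces at least some success-labeled samples, which is exactly what lets a sparse-reward learner make progress.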