Toward high-precision robotic assembly in large workspaces using multimodal reinforcement learning

Ren, Sirui; Zeng, Chao; Yang, Chenguang; Wang, Ning

REN, Sirui, ZENG, Chao, YANG, Chenguang and WANG, Ning (2026). Toward high-precision robotic assembly in large workspaces using multimodal reinforcement learning. Robotic Intelligence and Automation. [Article]

Documents

PDF - Accepted Version
Available under License Creative Commons Attribution.

Abstract

Purpose

This study aims to address the challenges of complex contact dynamics, structural constraints and perceptual uncertainty in robotic peg-in-hole assembly tasks, particularly under large workspace and high-precision requirements.

Design/methodology/approach

The authors propose a reinforcement learning framework that integrates multimodal perception, self-supervised representation modeling and hybrid control mechanisms. The framework takes visual images, proprioceptive states and target pose information as inputs. A self-supervised pretraining phase jointly optimizes image reconstruction and forward prediction to learn structurally aware and temporally consistent latent state representations. Furthermore, a Spectrum Random Masking (SRM) technique introduces frequency-domain perturbations to the visual modality, encouraging stable spectral feature learning and enhancing perception robustness for sim-to-real transfer. During execution, an adaptive impedance control mode is activated when excessive contact force is detected, mitigating insertion impacts and jamming.
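
The abstract does not give the exact formulation of SRM, so the following is only a minimal sketch of one plausible reading of "frequency-domain perturbations to the visual modality": transform an image with a 2-D FFT, zero a random subset of frequency bins, and transform back. The function name `spectrum_random_mask`, the `mask_ratio` parameter and the choice to preserve the DC term are assumptions made for illustration, not details from the paper.

```python
import numpy as np


def spectrum_random_mask(image, mask_ratio=0.2, rng=None):
    """Randomly zero a fraction of an image's 2-D Fourier coefficients.

    image: float array shaped (H, W) or (H, W, C).
    mask_ratio: hypothetical hyperparameter, the probability of dropping
        each frequency bin (not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)

    # Transform the two spatial axes to the frequency domain.
    spectrum = np.fft.fft2(image, axes=(0, 1))

    # Bernoulli keep-mask over frequency bins, shared across channels.
    keep = rng.random(image.shape[:2]) >= mask_ratio
    keep[0, 0] = True  # assumption: keep the DC term so mean brightness is unchanged
    if image.ndim == 3:
        keep = keep[:, :, None]

    # Back to the spatial domain. The mask is not conjugate-symmetric, so the
    # inverse transform is complex in general and only its real part is kept.
    return np.fft.ifft2(spectrum * keep, axes=(0, 1)).real


# Example: perturb a synthetic 64x64 RGB frame.
rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3))
augmented = spectrum_random_mask(frame, mask_ratio=0.3, rng=rng)
```

In a training pipeline, an augmentation of this kind would typically be applied to camera frames before they reach the visual encoder, so that the learned features depend less on any single frequency band.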

Findings

Real-world experiments in extended operational spaces and across diverse peg-hole geometries demonstrate that the proposed method achieves millimeter-level accuracy, high success rates in seen configurations and strong zero-shot generalization to unseen shapes. These results validate the effectiveness of the framework in ensuring robust and precise robotic assembly across large and variable workspaces.

Originality/value

This work presents a novel reinforcement learning framework that integrates multimodal perception, spectral feature learning and hybrid control switching to tackle the challenges of high-precision peg-in-hole assembly. By introducing SRM for robust visual representation and combining it with adaptive impedance control, the framework offers a distinctive solution that enhances sim-to-real transfer and generalization in complex assembly scenarios.

More Information

Official URL:

https://www.emerald.com/ria/article/doi/10.1108/RI...

Uncontrolled Keywords:

Robotic assembly; Sim-to-real transfer; Adaptive impedance control; Multimodal reinforcement learning

Identifiers

Identification Number:

10.1108/ria-09-2025-0291

Library

Publisher:

Emerald

Item Type:

Article

SWORD Depositor:

Symplectic Elements

Depositing User:

Symplectic Elements

Date record made live:

23 Mar 2026 15:33

Last Modified:

23 Mar 2026 15:45

Date of first compliant deposit:

22 March 2026

Date of first compliant Open Access:

23 March 2026

Version of first compliant deposit:

Author Accepted Manuscript

ISSN:

2754-6969

URI:

https://shura.shu.ac.uk/id/eprint/37174

Sheffield Hallam University Research Archive

Sheffield Hallam University

City Campus, Howard Street

Sheffield S1 1WB

Contact us: shura@shu.ac.uk