Self-Supervised Voice Denoising Network for Multi-Scenario Human-Robot Interaction.

Li, Mu; Xu, Wenjin; Zeng, Chao; Wang, Ning

LI, Mu, XU, Wenjin, ZENG, Chao and WANG, Ning (2025). Self-Supervised Voice Denoising Network for Multi-Scenario Human-Robot Interaction. Biomimetics, 10 (9): 603. [Article]

Documents

PDF
Wang-SelfSupervisedDenoisingNetwork(VoR).pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Human-robot interaction (HRI) via voice command has advanced significantly in recent years, with large Vision-Language-Action (VLA) models showing particular promise. However, these systems still struggle with environmental noise during voice interaction and lack a specialized denoising network for isolating commands when multiple speakers overlap. To overcome these challenges, we introduce a method for voice command-based HRI in noisy environments that leverages synthetic data and a self-supervised denoising network to improve real-world applicability. Our approach focuses on improving self-supervised network performance in denoising mixed-noise audio through training data scaling. Extensive experiments show our method outperforms existing approaches in simulation and achieves 7.5% higher accuracy than the state-of-the-art method in noisy real-world environments, enhancing voice-guided robot control.

More Information

Official URL:

https://doi.org/10.3390/biomimetics10090603

Uncontrolled Keywords:

data synthesis; human–robot interaction; self-supervised learning; voice denoising

Event Location:

Switzerland

Identifiers

Identification Number:

10.3390/biomimetics10090603

Library

Publisher:

MDPI AG

Item Type:

Article

SWORD Depositor:

Symplectic Elements

Depositing User:

Symplectic Elements

Date record made live:

07 Oct 2025 10:53

Last Modified:

10 Oct 2025 06:00

Date of first compliant deposit:

7 October 2025

Date of first compliant Open Access:

7 October 2025

Version of first compliant deposit:

Version of Record

ISSN:

2313-7673

Volume:

10

URI:

https://shura.shu.ac.uk/id/eprint/36218

Sheffield Hallam University Research Archive

Sheffield Hallam University

City Campus, Howard Street

Sheffield S1 1WB

Contact us: shura@shu.ac.uk
