KADALAGERE SAMPATH, Suhas, WANG, Ning, YANG, Chenguang, WU, Howard, LIU, Cunjia and PEARSON, Martin (2025). A Vision-Guided Deep Learning Framework for Dexterous Robotic Grasping Using Gaussian Processes and Transformers †. Applied Sciences, 15 (5). [Article]
Documents
applsci-15-02615-v2.pdf - Published Version
Available under License Creative Commons Attribution.
Abstract
Robotic manipulation of objects with diverse shapes, sizes, and properties, especially deformable ones, remains a significant challenge in automation, as it demands human-like dexterity through the integration of perception, learning, and control. This study enhances a previous framework combining YOLOv8 for object detection and LSTM networks for adaptive grasping by introducing Gaussian Processes (GPs) for robust grasp prediction and Transformer models for efficient multi-modal sensory data integration. A Random Forest classifier additionally selects optimal grasp configurations based on object-specific features such as geometry and stability. The proposed grasping framework achieved a 95.6% grasp success rate using Transformer-based force modulation, surpassing the LSTM (91.3%) and GP (91.3%) models. Evaluation on a diverse dataset showed significant improvements in grasp force modulation, adaptability, and robustness for two- and three-finger grasps. However, limitations were observed in five-finger grasps for certain objects, and some classification failures occurred in the vision system. Overall, this combination of vision-based detection and advanced learning techniques offers a scalable solution for flexible robotic manipulation.
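Because the framework combines several off-the-shelf learning components, a brief sketch may help make the pipeline concrete. The Python fragment below is a hedged illustration of how the three predictors described in the abstract could be wired together: the class name `ForceTransformer`, the feature dimensions, and all synthetic data are assumptions made for illustration, not the authors' published code, and the YOLOv8 detection stage is omitted for brevity.

```python
# Minimal sketch of the three learning components named in the abstract:
# a Transformer for force modulation, a Gaussian Process for grasp-force
# prediction, and a Random Forest for grasp-configuration selection.
# Names, dimensions, and synthetic data below are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


class ForceTransformer(nn.Module):
    """Encode a multi-modal sensory sequence, regress a grasp force per step."""

    def __init__(self, feat_dim=16, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)  # scalar force command per step

    def forward(self, x):  # x: (batch, time, feat_dim)
        return self.head(self.encoder(self.proj(x)))


# Stand-in for fused vision/tactile feature sequences.
sensor_seq = torch.randn(8, 20, 16)
force_profile = ForceTransformer()(sensor_seq)  # shape: (8, 20, 1)

# Gaussian Process alternative: predicts force with an uncertainty estimate.
obj_feats = np.random.rand(50, 4)  # e.g. size, curvature, stiffness proxies
forces = np.random.rand(50)        # recorded successful grasp forces
gp = GaussianProcessRegressor(kernel=RBF()).fit(obj_feats, forces)
mean_force, std_force = gp.predict(obj_feats[:5], return_std=True)

# Random Forest selects a grasp configuration (2-, 3-, or 5-finger grasp)
# from object-specific geometric features.
grasp_labels = np.random.choice([2, 3, 5], size=50)
rf = RandomForestClassifier(n_estimators=100).fit(obj_feats, grasp_labels)
print(force_profile.shape, mean_force[0], rf.predict(obj_feats[:1]))
```

In a pipeline of this shape, the Random Forest would first select the finger configuration from detected object features, after which the Transformer (or the GP, with its uncertainty estimate) modulates the grasp force during execution, matching the division of roles described in the abstract.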