Learning an end-to-end spatial grasp generation and refinement algorithm from simulation
ORIGINAL PAPER
Peiyuan Ni · Wenguang Zhang · Xiaoxiao Zhu · Qixin Cao
Received: 13 April 2020 / Revised: 27 July 2020 / Accepted: 8 September 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract Novel object grasping is an important capability for robot manipulation in unstructured environments. Most current works require a grasp sampling process to obtain grasp candidates, combined with a deep-learning-based local feature extractor. However, this pipeline is time-consuming, especially when grasp points are sparse, such as at the edge of a bowl. To tackle this problem, our algorithm takes the whole sparse point cloud as input and requires no sampling or search process. Our method consists of two steps. The first step predicts grasp poses, categories, and scores (qualities) with an SPH3D-GCN network. The second step is an iterative grasp pose refinement that refines the best grasp generated in the first step. The network weights for these two steps amount to only about 0.81 M and 0.52 M, respectively, and a complete prediction, including the iterative grasp pose refinement, takes about 73 ms on a GeForce 840M GPU. Moreover, to generate training data for multi-object scenes, we build a single-object dataset (79 objects from the YCB object set, 23.7k grasps) and a multi-object dataset (20k point clouds with annotations and masks), together with grasp planning for thin structures. Our experiments show that our method achieves a 76.67% success rate and a 94.44% completion rate, outperforming current state-of-the-art works.

Keywords Robot learning · 3D deep learning · Object grasping · Robot vision
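The two-step pipeline described in the abstract can be pictured roughly as follows. The sketch below is illustrative only: GraspPredictor, PoseRefiner, the per-point MLP backbone, and all tensor shapes are assumptions made for this example (the paper itself uses an SPH3D-GCN backbone and its own refinement network), so this is a minimal stand-in rather than the authors' implementation.

```python
# Minimal sketch of the two-stage idea: stage 1 predicts a grasp pose,
# category, and quality score for every point of a sparse cloud; stage 2
# iteratively refines the best-scoring grasp. All names and shapes here are
# hypothetical placeholders, not the published code.
import torch
import torch.nn as nn


class GraspPredictor(nn.Module):
    """Stand-in for the SPH3D-GCN stage: per-point grasp heads on a small MLP."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                      nn.Linear(64, 128), nn.ReLU())
        self.pose_head = nn.Linear(128, 7)       # 3-D position + quaternion
        self.class_head = nn.Linear(128, num_classes)
        self.score_head = nn.Linear(128, 1)      # grasp quality in [0, 1]

    def forward(self, points):                   # points: (N, 3)
        feat = self.backbone(points)
        return (self.pose_head(feat),
                self.class_head(feat),
                torch.sigmoid(self.score_head(feat)).squeeze(-1))


class PoseRefiner(nn.Module):
    """Stand-in for the refinement stage: predicts a small pose correction."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, 7))

    def forward(self, pose):                      # pose: (7,)
        return pose + 0.1 * self.net(pose)        # damped iterative update


def predict_and_refine(points, predictor, refiner, refine_steps: int = 3):
    """End-to-end inference: pick the highest-scoring grasp, then refine it."""
    poses, _, scores = predictor(points)
    best = poses[torch.argmax(scores)]
    for _ in range(refine_steps):
        best = refiner(best)
    return best


if __name__ == "__main__":
    cloud = torch.rand(1024, 3)                   # stand-in sparse point cloud
    grasp = predict_and_refine(cloud, GraspPredictor(), PoseRefiner())
    print("refined grasp (xyz + quaternion):", grasp.detach().numpy())
```

The key point the sketch tries to capture is that no grasp sampling or search is involved: the network scores every input point directly, and only the single best candidate is passed through a few cheap refinement iterations.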
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00138-020-01127-9) contains supplementary material, which is available to authorized users.

Corresponding author: Qixin Cao, [email protected]
Peiyuan Ni, [email protected]
Wenguang Zhang, [email protected]
Xiaoxiao Zhu, [email protected]

State Key Lab of Mechanical Systems and Vibration, Intelligent Robot Laboratory of SJTU, Shanghai Jiao Tong University, Shanghai, China

1 Introduction

Object manipulation in the unstructured environments of the real world is still an open problem, especially for unseen objects. Many challenging problems remain: (1) When objects are stacked in a pile, it is difficult and time-consuming to search for available grasps. (2) The camera data may be sparse and noisy, which makes it challenging to generate 3D spatial grasps. (3) An appropriate quality metric should be considered to select the best grasp among all the grasp candidates.

Grasping perception adopts different algorithms depending on the characteristics of the grasping scene [1]. Traditional model-based algorithms apply a 6D pose estimation algorithm [2, 3] to obtain object poses and choose the best grasp from a pre-built grasp database. This database can be predefined manually or generated by other tools such as Graspit! [4]. However, th