PGCNet: patch graph convolutional network for point cloud segmentation of indoor scenes



ORIGINAL ARTICLE

PGCNet: patch graph convolutional network for point cloud segmentation of indoor scenes

Yuliang Sun1 · Yongwei Miao2 · Jiazhou Chen1 · Renato Pajarola3

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Semantic segmentation of 3D point clouds is a crucial task in scene understanding and is also fundamental to indoor scene applications such as indoor navigation, mobile robotics, and augmented reality. Recently, deep learning frameworks have been successfully applied to point clouds but are limited by the large size of the data. While most existing works operate on individual sampled points, we use surface patches as a more efficient representation and propose a novel indoor scene segmentation framework called patch graph convolutional network (PGCNet). This framework treats patches as input graph nodes and aggregates neighboring node features with a dynamic graph U-Net (DGU) module, which consists of dynamic edge convolution operations inside a U-shaped encoder–decoder architecture. The DGU module dynamically updates the graph structure at each level to encode hierarchical edge features. Using PGCNet, we first segment the input scene into two types of regions, i.e., room layout and indoor objects, which are afterward utilized to carry out the final rich semantic labeling of various indoor scenes. With considerable training speedup, the proposed framework achieves performance comparable to the state of the art on a standard indoor scene segmentation dataset.

Keywords Point cloud · Scene segmentation · Surface patch · Graph convolutional network · Edge convolution · Encoder–decoder
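As a minimal illustration of the dynamic edge convolution the DGU module builds on, the following PyTorch sketch recomputes a k-nearest-neighbor graph from the current node features before each aggregation step, in the spirit of EdgeConv. All names, dimensions, and the kNN rule here are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch only: dynamic edge convolution over patch features.
# The graph is rebuilt from the current features at every call,
# so the neighborhood structure changes from level to level.
import torch
import torch.nn as nn


def knn_graph(x, k):
    """Indices of the k nearest neighbors of each node under
    Euclidean distance. x: (N, F) node feature matrix."""
    dist = torch.cdist(x, x)                       # (N, N) pairwise distances
    return dist.topk(k + 1, largest=False).indices[:, 1:]  # drop self -> (N, k)


class DynamicEdgeConv(nn.Module):
    """EdgeConv whose kNN graph is recomputed from features each call."""

    def __init__(self, in_dim, out_dim, k=8):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim), nn.ReLU(),
        )

    def forward(self, x):                          # x: (N, in_dim) patch features
        idx = knn_graph(x, self.k)                 # dynamic graph structure
        neighbors = x[idx]                         # (N, k, in_dim)
        center = x.unsqueeze(1).expand_as(neighbors)
        # Edge feature: center feature concatenated with relative offset.
        edge = torch.cat([center, neighbors - center], dim=-1)
        return self.mlp(edge).max(dim=1).values    # max-aggregate over neighbors


# Usage: 128 patches with 16-dim descriptors -> 64-dim aggregated features.
feats = torch.randn(128, 16)
out = DynamicEdgeConv(16, 64, k=8)(feats)
print(out.shape)                                   # torch.Size([128, 64])
```

Because the graph is rebuilt from features rather than from fixed coordinates, nodes that become similar at deeper levels can exchange information even if they were not spatial neighbors in the input.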

1 Introduction

3D indoor scene understanding requires a thorough analysis of the geometric and semantic context of an interior scene. Indoor scene semantic segmentation, in which indoor objects are assigned different labels, is a fundamental sub-task of scene understanding. Point clouds, which can be acquired directly by most depth scanning devices, are a common geometric representation in computer graphics and computer vision [1–4]. Point cloud segmentation of indoor scenes is attracting growing attention because of its various applications, such as virtual/augmented reality [5], mobile robotics [6], and indoor navigation [7].

✉ Yongwei Miao
[email protected]

1 College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China

2 College of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, China

3 Department of Informatics, University of Zurich, 8050 Zurich, Switzerland

Semantic segmentation of indoor scenes is still challenging due to incomplete raw inputs, the large scale of point cloud data, and the cluttered, heavily occluded settings typical of real-world indoor environments. The essential issue in processing point clouds effectively is how to extract the feature information of point cloud scenes or 3D shapes. Conventionally, handcrafted features of point clouds are chosen to analyze 3D geometry, but they are difficult to select for