3D Object Recognition Based on Volumetric Representation Using Convolutional Neural Networks


1 Movidius Ltd., 1st Floor, O’Connell Br House, D’Olier St, Dublin, Ireland
{xiaofan.xu,alireza.dehghani,sam.caulfield,david.moloney}@movidius.com
2 Trinity College Dublin, College Green, Dublin, Ireland
corrigad@tcd.ie

Abstract. Following the success of Convolutional Neural Networks (CNNs) on object recognition and image classification using 2D images, in this work the framework has been extended to process 3D data. However, many current systems incur a huge computational cost when dealing with large amounts of data. In this work, we introduce an efficient 3D volumetric representation for training and testing CNNs. We also build several datasets based on the volumetric representation of 3D digits; different rotations along the x, y and z axes are also taken into account. Unlike the normal volumetric representation, our datasets use much less memory. Finally, we introduce a model based on a combination of CNN models, whose structure is based on the classical LeNet. The accuracy achieved is beyond the state of the art, and the model can classify a 3D digit in around 9 ms.

Keywords: 3D object recognition · Volumetric representation · 3D digit dataset · CNN
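To make the idea of a compact volumetric representation concrete, the following is a minimal sketch and not the encoding used in this work. It assumes points normalised to [-1, 1]^3, a hypothetical grid size of 32, and one bit per voxel; the helper names `rotate`, `voxelize` and `occupied` are illustrative only. It shows how a rotated copy of an object can be generated about the x, y or z axis and stored as a bit-packed occupancy grid.

```python
import math


def rotate(points, axis, degrees):
    """Rotate (x, y, z) points about a coordinate axis through the origin."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    out = []
    for x, y, z in points:
        if axis == "x":
            out.append((x, c * y - s * z, s * y + c * z))
        elif axis == "y":
            out.append((c * x + s * z, y, -s * x + c * z))
        elif axis == "z":
            out.append((c * x - s * y, s * x + c * y, z))
        else:
            raise ValueError("axis must be 'x', 'y' or 'z'")
    return out


def voxelize(points, size):
    """Map points in [-1, 1]^3 to a bit-packed size^3 occupancy grid.

    Assumption for illustration: one bit per voxel, row-major order.
    """
    grid = bytearray((size ** 3 + 7) // 8)
    for p in points:
        # Scale each coordinate from [-1, 1] to an index in [0, size - 1].
        i, j, k = (min(size - 1, max(0, int((v + 1.0) / 2.0 * size))) for v in p)
        flat = (i * size + j) * size + k
        grid[flat >> 3] |= 1 << (flat & 7)
    return grid


def occupied(grid, size, i, j, k):
    """Return True if voxel (i, j, k) is set in the bit-packed grid."""
    flat = (i * size + j) * size + k
    return bool(grid[flat >> 3] & (1 << (flat & 7)))


# Example: one point, and the same point rotated 90 degrees about the z axis.
points = [(0.9, 0.05, 0.05)]
grid = voxelize(points, 32)
grid_rotated = voxelize(rotate(points, "z", 90), 32)
```

Under these assumptions a 32 x 32 x 32 grid occupies 4,096 bytes, whereas the same grid stored as one 4-byte float per voxel would occupy 131,072 bytes.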

1 Introduction

Object Recognition (OR) is widely used in our daily life for the purposes of inspection, registration, and manipulation [1]. Google, Facebook and Baidu are probably the best-known websites that use OR on a large scale. Generally, object classification is performed using colour-based segmentation methods, or from grayscale images using classification methods such as HoG [2]/SVM [3] or other classifiers. However, deep learning is now becoming ubiquitous. With recent advancements in deep learning algorithms, we are now able to solve some of the problems once considered impossible in fields such as computer vision, natural language processing, and robotics. Machine learning techniques use data (images, signals, text) to train a model (or machine) to perform tasks such as image classification or object detection. Although classical machine learning techniques are still being used to solve challenging image classification problems, they do not work well when applied directly to images. This is because they ignore the structure and compositional nature of images. The state-of-the-art CNN technique, which is a specific type of deep learning algorithm, addresses the gaps in traditional machine learning techniques. CNNs not only perform classification, but can also learn to extract features directly from raw images, which eliminates the need for manual feature extraction. The performance of deep learning using CNNs has rapidly grown to over 95 % accuracy (GoogLeNet [4], VGG [5], AlexNet [6], etc.) in recent years with the availability of large labelled datasets and powerful GPUs. These methods, while appealing from an accuracy standpoint, are

© Springer International Publishing Switzerland 2016. X. Xu et al., in F.J. Perales and J. Kittler (Eds.): AMDO 2016, LNCS 9756, pp. 147–156, 2016. DOI: 10.1007/978-3-319-41778-3_15
