1000 Fps Highly Accurate Eye Detection with Stacked Denoising Autoencoder



Abstract. Eye detection is an important step for a range of applications such as iris and face recognition. For eye detection in practice, speed is as important as accuracy. In this paper, we propose a super-fast (1000 fps on a general PC) eye detection method based on the label map of the raw image, without face detection. We first produce the label map of a raw image according to the coordinates of its bounding box. Then we train a stacked denoising autoencoder (SDAE) that is specifically designed to learn the mapping from the raw image to the label map. Finally, through an effective post-processing step, we obtain the bounding boxes of the two eyes. Experimental results show that our method is about 2,500 times faster than the deformable part-based model (DPM) while maintaining comparable accuracy. Our method also outperforms the popular LBP+Cascade model in both accuracy and speed.

Keywords: Eye detection · Autoencoder · Label map

1 Introduction

As a challenging problem in computer vision, eye detection has attracted increasing attention in recent years due to its importance in real applications such as iris and face recognition. Eye detection aims to find the accurate position of the eyes in a given image. Great progress has been made on the accuracy of object detection over the past years [2] [3] [7], and these methods can be applied directly to eye detection. However, when facing truly practical problems, we find that few methods can run fast and keep a high accuracy at the same time. On the one hand, despite the high accuracy achieved by many recently proposed methods such as DPM [2] and RCNN [3], they usually rely on high-performance computing (HPC) to meet the demands of real-time detection. Sometimes, even when HPC technology is adopted, the speed still cannot meet the requirements of applications such as those on embedded devices. On the other hand, traditional methods like LBP+Cascade can run in real time, but their detection accuracy is usually not satisfactory.

© Springer-Verlag Berlin Heidelberg 2015. H. Zha et al. (Eds.): CCCV 2015, Part II, CCIS 547, pp. 237–246, 2015. DOI: 10.1007/978-3-662-48570-5_23

W. Tang et al.

In this paper, we propose a novel method based on the label map to address fast and accurate eye detection. To obtain a large speed-up, we adopt an SDAE [14] to learn the mapping from the raw image to the label map image, which is very fast at test time because an SDAE needs only a few matrix multiplications. The label map was proposed in [8] for face parsing. Our method differs from it in two respects. First, the method in [8] deals with the face parsing problem, so its label map requires a segmentation of each pixel. Our task, by contrast, is the detection of a specific object, for which a label map that marks the location of the bounding box is enough; this means that traditional object detection datasets can be used directly to train our model. Second, face parsing in [8] needs face detection results as input. In
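As a rough illustration of the two ideas in the paragraph above, the following sketch builds a binary label map from bounding-box coordinates and runs a forward pass through a small stack of fully connected layers, where each layer costs one matrix-vector product. It is not the implementation used in this paper: the 8x8 image size, the box coordinates, the layer widths, the sigmoid activation, and the helper names `make_label_map`, `forward`, and `random_layer` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def make_label_map(height, width, boxes):
    """Binary label map: 1 inside any (x0, y0, x1, y1) box, 0 elsewhere.

    Box coordinates are assumed to be end-exclusive pixel indices.
    """
    label = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                label[y][x] = 1
    return label


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, x):
    """Apply each (weights, bias) layer in turn: x <- sigmoid(W x + b)."""
    for weights, bias in layers:
        x = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, bias)]
    return x


def random_layer(n_in, n_out, rng):
    """Hypothetical untrained layer with small random weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


# Toy 8x8 image with two hypothetical eye boxes (x0, y0, x1, y1).
label_map = make_label_map(8, 8, [(1, 2, 3, 4), (5, 2, 7, 4)])
target = [v for row in label_map for v in row]  # flattened training target

rng = random.Random(0)
layers = [random_layer(64, 32, rng), random_layer(32, 64, rng)]
pixels = [rng.random() for _ in range(64)]  # stand-in for a flattened raw image
predicted = forward(layers, pixels)  # one value in (0, 1) per label-map pixel
```

In this sketch, `target` is the kind of output vector a network would be trained to reproduce from `pixels`, and `predicted` has the same length as the flattened label map. Inference involves no sliding window or image pyramid, only the two matrix-vector products in `forward`.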
