Gesture Recognition Based on Kinect


Abstract In recent years, depth cameras have become widely available sensors that capture depth images at real-time frame rates. Microsoft KINECT, for example, is a powerful yet inexpensive device for acquiring depth images. Although recent approaches have shown that 3D pose estimation and recognition from monocular 2.5D depth images are feasible, challenging problems such as gesture detection and recognition remain. In this paper, we propose a gesture recognition method and use it to build a puzzle game on a kinesthetic system. Gestures are central to our system because users play the puzzle game with their own hands instead of devices such as a keyboard or mouse. We focus on using a region of interest (ROI), together with the algorithms we propose, to perform gesture detection and recognition.

Keywords KINECT · Depth cameras · Pose estimation · Gesture detection

C.-H. Chuang (✉) · Y.-N. Chen · M.-S. Deng · K.-C. Fan
Department of Computer Engineering and Entertainment Technology, Tajen University, Pingtung, Taiwan, Republic of China
e-mail: [email protected]

Y.-M. Huang et al. (eds.), Advanced Technologies, Embedded and Multimedia for Human-centric Computing, Lecture Notes in Electrical Engineering 260, DOI: 10.1007/978-94-007-7262-5_128, © Springer Science+Business Media Dordrecht 2014

Introduction A kinesthetic system is an interactive media system that has recently gained attention. Because it responds immediately, users receive more feedback while operating it. Building on this characteristic, we integrate the system with teaching resources to create a digital interactive platform that boosts students' concentration and interest, which makes it an excellent teaching medium.

This research uses the Microsoft KINECT camera and its accompanying software as the foundation of the kinesthetic system. KINECT provides three kinds of data: color images, 3D depth information, and audio. The device has three optical units: the middle one is an ordinary RGB color camera, and the other two are an infrared emitter and the infrared CMOS camera of the 3D depth sensor. The system detects users' motions mainly through the 3D depth sensor. KINECT uses Light Coding technology for image detection and tracking. Light Coding [1] is a technique for obtaining the depth information of an image: it encodes the measured space with continuous near-infrared light, and a chip decodes the result to produce a depth image.

Once it is clear how KINECT acquires images, the next step is recognition. The data obtained through Light Coding are only basic image information; the key is to recognize the images and translate them into action commands. KINECT can convert the 3D depth image into a skeleton tracking system. Shotton et al. [2] present a way to integrate the color images with the depth image to find the body joints and the skeleton. Girshick et al. [3] is the procedure of Regression