An FPGA-Based People Detection System
Vinod Nair
Centre for Intelligent Machines, McGill University, Montreal, QC, Canada H3A 2A7
Email: [email protected]

Pierre-Olivier Laprise
Centre for Intelligent Machines, McGill University, Montreal, QC, Canada H3A 2A7
Email: [email protected]

James J. Clark
Centre for Intelligent Machines, McGill University, Montreal, QC, Canada H3A 2A7
Email: [email protected]

Received 15 September 2003; Revised 12 August 2004

This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about 2.5 frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at 75 MHz, communicating with dedicated hardware over FSL links.

Keywords and phrases: computer vision, FPGA, people detection, smart camera.
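The idea of carrying out the inverse DCT only partially can be illustrated with a small sketch. This is not the algorithm of the paper, whose details are not given in this excerpt; it only shows the general principle that an 8 x 8 JPEG block can be approximated from its lowest-frequency coefficients at a fraction of the cost. The function names `dct2` and `partial_idct` and the parameter `keep` are hypothetical, and an orthonormal DCT is assumed.

```python
import math

N = 8  # JPEG block size


def _scale(k):
    # Normalisation factor of the orthonormal DCT (an assumption of this sketch).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(x, u):
    # One-dimensional DCT-II basis function evaluated at sample x, frequency u.
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT of an N x N block (list of lists of numbers)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += block[x][y] * _basis(x, u) * _basis(y, v)
            out[u][v] = _scale(u) * _scale(v) * total
    return out


def partial_idct(coeffs, keep):
    """Inverse 2-D DCT that uses only the keep x keep lowest frequencies.

    keep = N gives the full inverse DCT; keep = 1 uses the DC term alone,
    so every pixel of the result equals the block mean.
    """
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(keep):
                for v in range(keep):
                    total += (_scale(u) * _scale(v) * coeffs[u][v]
                              * _basis(x, u) * _basis(y, v))
            out[x][y] = total
    return out
```

With `keep = 1` each output pixel needs a single multiplication instead of a 64-term sum, which hints at why a partial inverse DCT can be cheaper and smaller in hardware than the full transform. How much of the transform the detector in the paper actually needs is not something this sketch can answer.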
1. INTRODUCTION
This paper describes a system for detecting people in images, implemented on a field-programmable gate array (FPGA). People detection is an important subtask in many computer vision applications, such as automated video surveillance, human activity recognition, and smart room systems. The output of a people detector can be used, for instance, to infer a person’s location in a scene or to track the person over time. Such location and tracking data can then be analyzed to automatically generate a human-understandable description of what the person might be doing, or to raise an alarm if the person’s behavior seems unusual.

Many vision applications involve a large number of cameras. For example, wide-area surveillance networks use tens to hundreds of cameras to monitor many different scenes. Sending the video from all the cameras to a single central workstation for processing can be prohibitively expensive because of the need for high-bandwidth transmission. An attractive alternative is to perform the processing on the camera itself with a fast and inexpensive FPGA chip. In recent years, FPGA technology has become increasingly powerful, less expensive, and more practical for use in real-time vision applications.

Our long-term goal is to build a framework in which a large number of cameras cooperate to carry out a collective ta