Related Work

This chapter gives an overview of the literature on automatic human behavior analysis from streaming video. The discussion starts by presenting the processing chain used to implement such systems and then focuses only on the works aiming at semantic analysis. The works presented in this chapter are classified into: scene interpretation, human recognition, and action primitives and grammars. For each class, a brief introductory description is provided and some relevant works are analyzed to give an idea of the proposed approaches and of the difficulties they face. The chapter ends with a discussion of the state of the art in this field and a brief overview of the specific challenges that this book addresses.

3.1 Introduction

Although the literature contains many works on human behavior analysis and recognition, this is still an open research field, owing to the inherent complexity of the task. Indeed, human behavior recognition can be seen as the apex of a computational pyramid, as shown in Fig. 3.1. Each level of this pyramid takes as input the output of the level below it and produces an output that can either feed the level above or serve a standalone application. The lowest level takes the raw video streams as input and outputs a map of the image regions where a moving object is detected. Climbing the pyramid, the semantic level of the tasks performed increases. The processes at the lowest level work on moving regions in a single frame, while those at the second level identify objects in that same frame. At the third level a new parameter plays an important role: time. The processes at this level associate the moving objects detected in the current frame with those in the previous one, providing temporal trajectories through the state space. The output of this level is sent to the human behavior analysis module.

A. Amato et al., Semantic Analysis and Understanding of Human Behavior in Video Streaming, DOI: 10.1007/978-1-4614-5486-1_3, © Springer Science+Business Media New York 2013

Fig. 3.1 A hierarchical overview of the computational chain for human behavior recognition. Pyramid levels, from top to bottom: human behavior, object tracking, object recognition, motion detection.

Each level of this processing pyramid has its own characteristics and difficulties, often due to the incompleteness of the available data (for example, the attempt to extract 3D data about moving objects while working on 2D images). In the following paragraphs, a brief overview of the main issues related to each level is presented.
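To make the layered structure concrete, the three lower levels of the pyramid can be sketched as a chain of functions, each consuming the output of the one below. This is only an illustrative sketch under stated assumptions, not the method of any work surveyed here: the function names, the thresholds, the fixed background frame, the use of connected regions as a stand-in for object recognition, and the nearest-centroid association are all hypothetical choices made for brevity.

```python
def detect_motion(frame, background, threshold=30):
    """Level 1: binary map of pixels that differ from a fixed background frame.

    Frames are grayscale images given as lists of rows of intensities.
    The threshold value is an arbitrary assumption.
    """
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def find_objects(mask):
    """Level 2 (stand-in): group moving pixels into 4-connected regions.

    Returns one (row, column) centroid per region. Real systems would
    classify each region; here a region is simply treated as one object.
    """
    height = len(mask)
    width = len(mask[0]) if mask else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill from this unvisited moving pixel.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cy, cx))
    return centroids


def update_tracks(tracks, centroids, max_dist=3.0):
    """Level 3: associate current detections with existing trajectories.

    Each track is extended with the nearest unmatched centroid within
    max_dist (an assumed gating distance); leftovers start new tracks.
    """
    unmatched = list(centroids)
    for track in tracks:
        if not unmatched:
            break
        ly, lx = track[-1]
        nearest = min(unmatched,
                      key=lambda p: (p[0] - ly) ** 2 + (p[1] - lx) ** 2)
        if (nearest[0] - ly) ** 2 + (nearest[1] - lx) ** 2 <= max_dist ** 2:
            track.append(nearest)
            unmatched.remove(nearest)
    tracks.extend([c] for c in unmatched)
    return tracks


def run_pipeline(frames, background):
    """Run levels 1 to 3 on each frame; the result feeds behavior analysis."""
    tracks = []
    for frame in frames:
        mask = detect_motion(frame, background)
        centroids = find_objects(mask)
        tracks = update_tracks(tracks, centroids)
    return tracks
```

Calling `run_pipeline(frames, background)` on a short sequence returns a list of trajectories, one list of centroids per tracked object, which is the kind of temporal information a behavior analysis module would consume. Each stage can also be used on its own, mirroring the remark that the output of a level may serve a standalone application.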

3.2 Motion Detection

These algorithms find the moving areas by computing the pixel-level difference between the current frame and a background model. This model can be a fixed frame (useful for indoor applications) or a more complex model in which each pixel is described by a Gaussian probability distribution [31]. A good background model should be robust enough to handle rapid illumination changes in the scene and at the same time