First-person activity recognition from micro-action representations using convolutional neural networks and object flow histograms
Panagiotis Giannakeris1 · Panagiotis C. Petrantonakis1 · Konstantinos Avgerinakis1 · Stefanos Vrochidis1 · Ioannis Kompatsiaris1

Received: 20 March 2019 / Revised: 16 August 2020 / Accepted: 16 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract

A novel first-person human activity recognition framework is proposed in this work. Our methodology is inspired by the central role that moving objects play in egocentric activity videos. Using a deep convolutional neural network, we detect objects and build discriminative object flow histograms that represent fine-grained micro-actions over short temporal windows. Our framework is based on the assumption that large-scale activities are synthesized from fine-grained micro-actions. We gather all the micro-actions and cluster them with a Gaussian Mixture Model to build a micro-action vocabulary that is later used in a Fisher encoding scheme. Results show that our method reaches a 60% recognition rate on the benchmark ADL dataset. The capabilities of the proposed framework are further demonstrated through a thorough evaluation over a wide range of hyper-parameters and a comparison with other state-of-the-art works.

Keywords Activity recognition · Object detection · Egocentric vision · Ambient assisted living
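As a concrete illustration of the vocabulary-building and encoding stage outlined above, the following sketch fits a diagonal-covariance Gaussian Mixture Model on pooled micro-action descriptors and encodes a video's descriptors as a Fisher vector. The function names, descriptor shapes, and the use of NumPy and scikit-learn are illustrative assumptions and do not reproduce the exact implementation described in this paper.

# Minimal sketch of GMM vocabulary building and Fisher encoding.
# Assumption: each row of `descriptors` is one object-flow histogram
# (micro-action descriptor) of dimension D.
import numpy as np
from sklearn.mixture import GaussianMixture

def build_vocabulary(descriptors, n_components=64, seed=0):
    # Fit a diagonal-covariance GMM on the pooled micro-action descriptors.
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=seed)
    gmm.fit(descriptors)  # descriptors: (N, D)
    return gmm

def fisher_vector(descriptors, gmm):
    # Encode one video's descriptors as gradients of the GMM log-likelihood
    # with respect to the component means and variances.
    X = np.atleast_2d(descriptors)                 # (N, D)
    N, _ = X.shape
    gamma = gmm.predict_proba(X)                   # (N, K) soft assignments
    w, mu, sigma = gmm.weights_, gmm.means_, np.sqrt(gmm.covariances_)

    diff = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]   # (N, K, D)
    d_mu = np.einsum("nk,nkd->kd", gamma, diff) / (N * np.sqrt(w)[:, None])
    d_sigma = np.einsum("nk,nkd->kd", gamma, diff ** 2 - 1.0) / (N * np.sqrt(2.0 * w)[:, None])

    fv = np.concatenate([d_mu.ravel(), d_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))         # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)       # L2 normalization

In a typical pipeline of this kind, the resulting per-video Fisher vectors would then be fed to a standard classifier for activity recognition.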
Panagiotis Giannakeris
[email protected]

Panagiotis C. Petrantonakis
[email protected]

Konstantinos Avgerinakis
[email protected]

Stefanos Vrochidis
[email protected]

Ioannis Kompatsiaris
[email protected]

1 ITI-CERTH, Thermi, Greece
1 Introduction

The continuous rise of the video format as a medium for communication has brought a digital video revolution to the modern connected world. It is safe to say that it has now surpassed the popularity of image and text formats, judging by the countless online multimedia platforms that support it and the number of video clips that fill web pages daily. The use cases are endless: from do-it-yourself tutorials to marketing and live event broadcasting, popular public video repositories amass enormous amounts of video content. It is not only the attractive combination of auditory and visual content that makes the medium popular, but also the technology of modern wearables, which allows seemingly every person to carry a tiny video camera at all times, and the convenient ways in which videos end up posted online for immediate consumption on social media. In most of the videos uploaded online, humans are the center of attention and the thematic content in one way or another revolves around the activities they perform. Multimedia processing and computer vision researchers have shown great interest in exploiting these huge databases. The proposed solutions can address the needs of several real-life applications, such as video surveillance and security applications.