Dynamic multifoveated structure for real-time vision tasks in robotic systems


ORIGINAL RESEARCH PAPER

A tool for removing redundancy in multifoveated image processing

Petrucio R. T. Medeiros¹ · Rafael B. Gomes¹ · Esteban W. G. Clua² · Luiz Gonçalves¹,²

Received: 10 October 2018 / Accepted: 25 June 2019
© The Author(s) 2019

Abstract

Foveation is a technique that allows real-time image processing by drastically reducing the amount of visual data without losing essential information around a focused area. When a robot needs to attend to two or more regions of the image at the same time, e.g., to track two or more objects, multifoveation is necessary. In this case, features must not be computed twice in the intersections between the different foveated structures, since doing so could increase the processing time linearly. To solve this redundancy removal problem, we propose two algorithms. The first is based on the prior computation of redundant blocks, and the second is based on pixel-by-pixel processing at execution time. Experimental results show a gain in processing time for the block-based model over the pixel-by-pixel model, and for both over other approaches that sequentially compute several single-foveated images. Robotic vision and other tasks related to dynamic visual attention, such as recognition, real-time surveillance, video transmission, and image rendering, are examples of applications that can rely on and strongly benefit from such a model.

Keywords  Real time processing · Feature extraction · Multifoveated image

1 Introduction

This work is supported by CNPq and CAPES, Brazilian Sponsoring Agencies for Scientific Research and Superior Education Staff Improvement.

* Luiz Gonçalves [email protected]
Petrucio R. T. Medeiros [email protected]
Rafael B. Gomes [email protected]
Esteban W. G. Clua [email protected]

¹ Federal University of Rio Grande do Norte, Av. Salgado Filho, 3000, Campus Universitário, Lagoa Nova, Natal, RN, Brazil
² Fluminense Federal University, Av. Gal. Milton Tavares de Souza, W/N, 24.210‑310 Niteroi, RJ, Brazil

Visual data reduction and feature extraction for real-time applications are generally achieved by applying image preprocessing techniques. The amount of visual data can be reduced while keeping essential information by using the technique known as foveation [1–13]. This technique reduces 2D [8, 12] or 3D [14–16] data to facilitate the further computations needed for feature extraction, thus allowing visual tasks to be executed in real time. Also known as multiresolution foveation, it is basically achieved by applying an image transformation from the spatial domain to obtain a dry structure in the multiresolution feature domain [3, 8, 14]. This structure maintains the maximum possible resolution in a small portion of the image, called the fovea (the innermost level), and decreases image resolution in the periphery (the outer levels) as the distance to the fovea increases, generally describ