Interactively shaping robot behaviour with unlabeled human instructions



Autonomous Agents and Multi-Agent Systems (2020) 34:35

Anis Najar¹ · Olivier Sigaud² · Mohamed Chetouani²

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
In this paper, we propose a framework that enables a human teacher to shape a robot's behaviour by interactively providing it with unlabeled instructions. We ground the meaning of instruction signals in the task-learning process, and use them simultaneously for guiding it. We implement our framework as a modular architecture, named TICS (Task-Instruction-Contingency-Shaping), that combines different information sources: a predefined reward function, human evaluative feedback and unlabeled instructions. This approach offers a novel perspective on robotic task learning that lies between the Reinforcement Learning and Supervised Learning paradigms. We evaluate our framework both in simulation and with a real robot. The experimental results demonstrate the effectiveness of our framework in accelerating the task-learning process and in reducing the number of required teaching signals.

Keywords Interactive machine learning · Human–robot interaction · Shaping · Reinforcement learning · Unlabeled instructions

1 Introduction

Over the last few years, substantial progress has been made in both machine learning [31, 32, 40] and robotics [10]. However, applying machine learning methods to real-world robotic tasks still raises several challenges. One important challenge is to reduce training time, as state-of-the-art machine learning algorithms still require millions of iterations for

This work was supported by the Romeo2 project.

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s10458-020-09459-6) contains supplementary material, which is available to authorized users.

* Anis Najar
[email protected]

1 Laboratoire de Neurosciences Cognitives Computationnelles (LNC2), INSERM U960, Paris, France

2 Institute for Intelligent Systems and Robotics, CNRS UMR 7222, Sorbonne Université, Paris, France




solving real-world problems [31, 32, 40]. Two complementary approaches to task learning in robotics are usually considered: autonomous learning and interactive learning. Autonomous learning frameworks, such as Reinforcement Learning [22] or Evolutionary Approaches [9], rely on a predefined evaluation function that enables the robot to autonomously assess its performance on the task. The main advantage of this approach is the autonomy of the learning process: since the evaluation function is integrated on board, the robot can optimize its behaviour without requiring help from a supervisor. However, when applied to real-world problems, this approach suffers from several limitations. First, designing an appropriate evaluation function can be difficult in practice [22]. Second, autonomous learning relies on autonomous exploration, which results in slow converge