Interactive task learning via embodied corrective feedback

Mattias Appelgren¹ · Alex Lascarides¹

¹ School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, Scotland, UK

Published online: 27 September 2020
© The Author(s) 2020

Abstract

This paper addresses a task in Interactive Task Learning (Laird et al., IEEE Intell Syst 32:6–21, 2017). The agent must learn to build towers that are constrained by rules, and whenever the agent performs an action that violates a rule, the teacher provides verbal corrective feedback: e.g. "No, red blocks should be on blue blocks". The agent must learn to build rule-compliant towers from these corrections and the context in which they were given. Not only is the agent ignorant of the rules at the start of the learning process, but it also has a deficient domain model, which lacks the concepts in which the rules are expressed. Therefore an agent that takes advantage of the linguistic evidence must learn the denotations of neologisms and adapt its conceptualisation of the planning domain to incorporate those denotations. We show that by incorporating constraints on interpretation that are imposed by discourse coherence into the models for learning (Hobbs in On the Coherence and Structure of Discourse, Stanford University, Stanford, 1985; Asher and Lascarides in Logics of Conversation, Cambridge University Press, Cambridge, 2003), an agent that utilizes linguistic evidence outperforms a strong baseline that does not.

Keywords: Human robot interaction · Interactive learning · Knowledge representation and reasoning
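To make the setting concrete, the following is a minimal Python sketch of the interaction loop the abstract describes, not the authors' implementation: the Block type, the teacher_feedback function, and the hard-coded colour rule are all illustrative assumptions, and the agent here acts randomly rather than learning.

from __future__ import annotations
import random
from dataclasses import dataclass

@dataclass
class Block:
    colour: str

def violates_rule(tower: list[Block]) -> bool:
    # The rule is hidden from the agent and known only to the teacher:
    # here, a red block must sit directly on a blue block.
    for below, above in zip(tower, tower[1:]):
        if above.colour == "red" and below.colour != "blue":
            return True
    return False

def teacher_feedback(tower: list[Block]) -> str | None:
    # The teacher speaks only when the latest action violated the rule.
    if violates_rule(tower):
        return "No, red blocks should be on blue blocks"
    return None

# A placeholder agent: it places random blocks. A learning agent would
# instead update its beliefs about the rule (and about unknown words such
# as "red") from each correction and the context in which it was given.
colours = ["red", "blue", "green"]
tower: list[Block] = []
for step in range(10):
    tower.append(Block(random.choice(colours)))
    correction = teacher_feedback(tower)
    if correction is not None:
        print(f"step {step}: teacher: {correction!r}")
        tower.pop()  # undo the violating action

The learning problem the paper addresses is how to replace this random agent with one that infers both the rule and the meanings of the teacher's words from such corrections and their context.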

1 Introduction

The nascent field of Interactive Task Learning (ITL) aims to develop agents that can learn arbitrary new tasks through a combination of their own actions in the environment and an ongoing interaction with a teacher (see Laird et al. [41] for a recent survey). A common assumption behind many AI systems is that any required capabilities can be programmed and trained prior to deployment. However, this assumption may be untenable for tasks that contain a vast array of contingencies. It is also problematic if the task is one where unforeseen changes to what constitutes successful behaviour can occur after an agent is deployed: for
instance, tasks where the set of possible options, or the specifications that govern correct behaviour, can change at any given time. Motivated by such issues, ITL seeks to create agents that can learn after they are deployed, through situated interactions which are natural to the human domain expert that they interact with. Although interaction can take many forms, such as demonstration through imitation or teleoperation [6], our interest lies in approaches that make use of natural language to teach agents. A common formulation of such a learning process is as a situated and extended discourse between teacher and agent, much like one