Incremental Learning for Autonomous Navigation of Mobile Robots based on Deep Reinforcement Learning
- PDF / 1,975,927 Bytes
- 11 Pages / 595.224 x 790.955 pts Page_size
- 107 Downloads / 258 Views
Incremental Learning for Autonomous Navigation of Mobile Robots based on Deep Reinforcement Learning Manh Luong1 · Cuong Pham2 Received: 25 February 2020 / Accepted: 17 September 2020 © Springer Nature B.V. 2020
Abstract This paper presents an incremental learning method and system for autonomous robot navigation. The range finder laser sensor and online deep reinforcement learning are utilized for generating the navigation policy, which is effective for avoiding obstacles along the robot’s trajectories as well as for robot’s reaching the destination. An empirical experiment is conducted under simulation and real-world settings. Under the simulation environment, the results show that the proposed method can generate a highly effective navigation policy (more than 90% accuracy) after only 150k training iterations. Moreover, our system has slightly outperformed deep-Q, while having considerably surpassed Proximal Policy Optimization, two recent state-of-the art robot navigation systems. Finally, two experiments are performed to demonstrate the feasibility and effectiveness of our robot’s proposed navigation system in real-time under real-world settings. Keywords Robot navigation · Reinforcement learning · Autonomous robot
1 Introduction Autonomous navigation is the fundamental framework for the mobile robot to perform a wide range of tasks. For instance, a simple action as grabbing a cup in the kitchen could be a big challenge for the robot to fulfill as the robot does not know where the kitchen is or how to go to the kitchen. Indeed, if the robot can autonomously navigate to the kitchen, it would straightforwardly grab the cup. Moreover, most robotic applications are adopted in indoor environments, thus the automatic operation is necessary for the indoor robot. Previously, the indoor navigation methods are used to navigate based on the global map achieving by SLAM [1] to generate an optimal path (either shortest path or collision free path). However, it can be unreliable for the global map over time due to the changes Cuong Pham
[email protected] Manh Luong [email protected] 1
VinAI Research, Hanoi, Vietnam
2
Faculty of Information Technology 1, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
of the environment, e.g., furnitures are moved by human or dynamic obstacles. To deal with these issues, the global map needs to be updated in real time. However, it leads to costly and might get failed to update the new information for the environment. Recently, we have witnessed the domination of heuristic-based approaches for navigation task as those have demonstrated significantly outperformed map-based approach [2]. In contrast to the map-based approach, the heuristicbased approach is based on machine learning, commonly used in the navigation task such as navigation in minecraft games [3], in 3D scene simulation environments [4], and in crowed environments [5, 6]. Basically, the machine learning based navigation can be distinguished into categories: (1)supervised learning which learns the n
Data Loading...