Deep reinforcement learning: a survey


Frontiers of Information Technology & Electronic Engineering
www.jzus.zju.edu.cn; engineering.cae.cn; www.springerlink.com
ISSN 2095-9184 (print); ISSN 2095-9230 (online)
E-mail: [email protected]

Review:

Deep reinforcement learning: a survey*

Hao-nan WANG‡, Ning LIU, Yi-yun ZHANG, Da-wei FENG, Feng HUANG, Dong-sheng LI, Yi-ming ZHANG
Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410000, China
E-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Received Sept. 29, 2019; Revision accepted Mar. 30, 2020; Crosschecked June 4, 2020

Abstract: Deep reinforcement learning (RL) has become one of the most popular topics in artificial intelligence research. It has been widely used in various fields, such as end-to-end control, robotic control, recommendation systems, and natural language dialogue systems. In this survey, we systematically categorize deep RL algorithms and applications, and provide a detailed review of existing deep RL algorithms by dividing them into model-based methods, model-free methods, and advanced RL methods. We thoroughly analyze the advances, including exploration, inverse RL, and transfer RL. Finally, we outline the current representative applications and analyze four open problems for future research.

Key words: Reinforcement learning; Deep reinforcement learning; Reinforcement learning applications
https://doi.org/10.1631/FITEE.1900533
CLC number: TP18

‡ Corresponding author
* Project supported by the National Natural Science Foundation of China (Nos. 61772541, 61872376, and 61932001)
ORCID: Hao-nan WANG, https://orcid.org/0000-0002-07923858
© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2020

1 Introduction

With the combination of deep learning and big data, revolutionary advances have occurred in artificial intelligence research. There is growing interest in exploring new technologies in the post-deep-learning field. Deep reinforcement learning (RL), which uses neural network modeling in traditional RL algorithms, is particularly attractive. Specifically, deep RL solves decision optimization problems: given a specific state, it decides which action to perform to maximize the benefit. As a result, both the academic community and industry are paying much attention to analyzing and applying deep RL. Deep RL is a general paradigm that combines RL and deep learning, and it has achieved success in a variety of scenarios, such as chess, investment, driving, and action imitation. Deep RL is regarded as one of the closest existing approaches to artificial general intelligence (AGI). Deep RL differs substantially from traditional machine learning in how data are processed and analyzed. The current mainstream machine learning paradigm mostly collects or constructs labeled datasets in advance and performs learning on the existing static data. By con