A New Approach for Multi-agent Reinforcement Learning

Abstract The classical approach to reinforcement learning for a single agent is based on a reward that comes only from the environment. By trial and error, the agent learns to maximize its total accumulated reward. Several algorithms and techniques have been developed for single-agent reinforcement learning. Our purpose is to build on this existing work and extend it to multi-agent systems. We propose a new approach based on the following idea: communication or cooperation within a team can be achieved through mutual reinforcement, meaning that agents can give rewards to and receive rewards from other agents, not only from the environment. The goal of each agent is to maximize the global accumulated reward received from the environment and from other agents. We treat the multi-agent system as a single entity: we define the state and action of the system, and the value of each state with respect to the global policy of the system.

Keywords Artificial intelligence · Reinforcement learning · Multi-agent system · Markov decision process · Markov game · Control theory
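The mutual-reinforcement idea stated in the abstract can be illustrated with a minimal sketch: each agent's reward is the reward it receives from the environment plus the rewards given to it by the other agents. The function name `total_rewards`, the matrix layout `peer_rewards[j][i]`, and the unweighted sum are assumptions made here for illustration only; they are not taken from the paper.

```python
from typing import List, Sequence


def total_rewards(env_rewards: Sequence[float],
                  peer_rewards: Sequence[Sequence[float]]) -> List[float]:
    """Combine environment and peer rewards for every agent.

    env_rewards[i]     : reward agent i receives from the environment.
    peer_rewards[j][i] : reward agent j gives to agent i (hypothetical layout).
    Rewards an agent gives to itself (the diagonal) are ignored.
    """
    n = len(env_rewards)
    totals = []
    for i in range(n):
        # Sum what every *other* agent j hands to agent i.
        from_peers = sum(peer_rewards[j][i] for j in range(n) if j != i)
        totals.append(env_rewards[i] + from_peers)
    return totals


if __name__ == "__main__":
    env = [1.0, 0.0, -0.5]
    peers = [
        [0.0, 0.5, 0.0],   # agent 0 gives 0.5 to agent 1
        [0.25, 0.0, 0.0],  # agent 1 gives 0.25 to agent 0
        [0.0, 1.0, 0.0],   # agent 2 gives 1.0 to agent 1
    ]
    print(total_rewards(env, peers))  # [1.25, 1.5, -0.5]
```

In this toy example, agent 1 receives nothing from the environment but still obtains a positive reward through its teammates, which is the kind of signal the approach relies on for cooperation.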

E. Amhraoui (✉)
Artificial Intelligence for Engineering Sciences Team (IASI), Doctoral Studies Center, ENSAM-Meknes, University My Ismail, Meknes, Morocco
e-mail: [email protected]

T. Masrour
Department of Mathematics and Informatics, Artificial Intelligence for Engineering Sciences Team (IASI), ENSAM-Meknes, University My Ismail, Meknes, Morocco
e-mail: [email protected]

© Springer Nature Switzerland AG 2021
T. Masrour et al. (eds.), Artificial Intelligence and Industrial Applications, Advances in Intelligent Systems and Computing 1193, https://doi.org/10.1007/978-3-030-51186-9_19

1 Introduction

Multi-agent reinforcement learning is a difficult problem in artificial intelligence that attracts growing interest in both theoretical research and practical applications. In 1993, Ming Tan demonstrated that agents learn not only by trial and error but also by cooperating, through sharing instantaneous information [1]. In 2000, Y. Nagayuki, S. Ishii and K. Doya proposed a multi-agent reinforcement learning method based on predicting the actions of other agents [2]. However, most research on reinforcement learning has focused on single agents. Indeed, several approaches and algorithms exist for single-agent reinforcement learning, the best known being Q-learning [3, 4]. In this article, we propose a consistent model of multi-agent systems and extend all existing single-agent reinforcement learning techniques and algorithms to multi-agent systems. Our model is based on the idea of distributing the reward between the environment and the other agents: the reward of each agent equals the reward coming from the environment plus the reward coming from other agents. One of the greatest advantages of our model is that it maintains the general outline of the Markov game concept; moreover no externa