Deep Q-Learning Algorithm for Solving Inverse Kinematics of Four-Link Manipulator

Dmitriy Blinov, Anton Saveliev, and Aleksandra Shabanova

Abstract This paper presents a deep Q-learning algorithm designed to solve the inverse kinematics problem of a four-link manipulator. The algorithm uses a dynamic exploration coefficient instead of a constant value, which helps the neural network avoid converging to a local optimum. In addition, a method for generating the Q-table has been developed to avoid a bottleneck effect when constructing the neural network. This in turn reduces training time and lowers hardware requirements. To evaluate the effectiveness of the proposed algorithm, three environments were developed, and a specific neural network model was used for each of them. The three environments make it possible to evaluate the algorithm's performance on inverse kinematics problems of varying complexity: with one initial and one target point, with several initial points and one target point, and, conversely, with one initial point and several target points. The resulting plots of reward against the number of training episodes show that the agents were trained successfully in all environments. Successful training of the Q-learning algorithm in the third environment suggests that the algorithm can be used to solve inverse kinematics for all points of the manipulator's working space. The main advantage of the developed algorithm is that it can be applied to inverse kinematics problems of varying complexity. In addition, the algorithm can be used to solve the inverse kinematics of manipulators with a different number of links.
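The idea of a dynamic exploration coefficient can be illustrated with a short sketch. The abstract does not state which schedule the authors use, so the exponential decay below, the function names `exploration_coefficient` and `select_action`, and all default parameter values are assumptions chosen only for illustration, not the paper's implementation.

```python
import math
import random


def exploration_coefficient(episode, eps_start=1.0, eps_end=0.05, decay_episodes=500.0):
    """Return the exploration coefficient (epsilon) for a given episode.

    Assumed schedule: exponential decay from eps_start toward eps_end.
    The paper's actual schedule is not given in the abstract.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-episode / decay_episodes)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection over a list of Q-values.

    With probability epsilon a random action index is returned (exploration);
    otherwise the index of the largest Q-value is returned (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Hypothetical Q-values for three discrete joint actions.
    q_values = [0.1, 0.9, 0.3]
    for episode in (0, 500, 5000):
        eps = exploration_coefficient(episode)
        print(episode, round(eps, 3), select_action(q_values, eps))
```

Early episodes act almost entirely at random, while later episodes mostly follow the learned Q-values. A constant coefficient would fix this trade-off for the whole of training.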

D. Blinov
St. Petersburg State University of Aerospace Instrumentation, 67, st. Bolshaya Morskaya, 190000 St. Petersburg, Russia
e-mail: [email protected]

A. Saveliev · A. Shabanova (B)
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 39, 14th Line, 199178 St. Petersburg, Russia
e-mail: [email protected]

A. Saveliev
e-mail: [email protected]

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
A. Ronzhin and V. Shishlakov (eds.), Proceedings of 15th International Conference on Electromechanics and Robotics “Zavalishin’s Readings”, Smart Innovation, Systems and Technologies 187, https://doi.org/10.1007/978-981-15-5580-0_23

23.1 Introduction

Solving inverse kinematics is an important prerequisite for robot control [1]. Various methods currently exist for solving this problem, and the reliability of a robot control system largely depends on which method is chosen. Today, the most popular are the analytical method (the inverse transformation method) and the geometric method, proposed in [2, 3]. However, these methods have a significant drawback: as the number of manipulator links increases, restrictions must be introduced. This lead