Missile-Target Situation Assessment Model Based on Reinforcement Learning
ZHANG Yun, LÜ Runyan, CAI Yunze∗
(Department of Automation; Key Laboratory of System Control and Information Processing of Ministry of Education; Key Laboratory of Marine Intelligent Equipment and System of Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China)
© Shanghai Jiao Tong University and Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract: In situation assessment (SA) of a missile against a target fighter, traditional SA models are generally highly subjective and adapt poorly to dynamic conditions. This paper treats SA as the expectation of future returns and establishes a missile-target simulation battle model. The actor-critic (AC) algorithm in reinforcement learning (RL) is used to train the evaluation network, and a missile-target SA model is established through simulated battle training. Simulation and comparative experiments show that the model can effectively estimate the expected effect of a missile attack under the current situation, and that it provides an effective basis for missile attack decisions.

Key words: situation assessment (SA), battle model, reinforcement learning (RL), actor-critic (AC) algorithm

CLC number: TP 391, TP 18
Document code: A
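The abstract's idea of treating SA as an expectation of future returns corresponds to what RL calls a state-value function, which the critic in an actor-critic method learns from temporal-difference (TD) errors. The following is a minimal sketch of that idea only, not the paper's evaluation network or its battle simulation. It assumes a hypothetical toy engagement: five discretised situations on a line, a random step left or right at each time step, reward 1 for leaving on the right (a "hit") and reward 0 for leaving on the left (a "miss"). The function name `td0_value_estimate` and all parameters are illustrative assumptions.

```python
import random


def td0_value_estimate(n_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Tabular TD(0) critic on a hypothetical random-walk 'engagement'.

    States 0..n_states-1 stand in for discretised missile-target situations.
    Exiting past the right end gives reward 1 (hit); exiting past the left
    end gives reward 0 (miss). The learned value of a state approximates the
    expected future return from that state, i.e. its assessed situation.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # neutral initial assessment for every state
    for _ in range(episodes):
        s = n_states // 2  # each episode starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:  # terminal: miss
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= n_states:  # terminal: hit
                reward, v_next, done = 1.0, 0.0, True
            else:  # non-terminal step, bootstrap from the next state's value
                reward, v_next, done = 0.0, values[s_next], False
            td_error = reward + gamma * v_next - values[s]
            values[s] += alpha * td_error  # critic update
            if done:
                break
            s = s_next
    return values


if __name__ == "__main__":
    for state, value in enumerate(td0_value_estimate()):
        print(f"state {state}: estimated expected return {value:.3f}")
```

In this toy setting the learned values rise from the left-most to the right-most state, so a state's value can be read directly as an assessment of how favourable that situation is. A full actor-critic method would additionally update a policy (the actor) using the same TD error, and would replace the table with a neural network.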
Received date: 2020-07-14
Foundation item: the National Natural Science Foundation of China (No. 61627810), the Joint Fund of Advanced Aerospace Manufacturing Technology Research of China (No. USCAST2016), and the National Key Research and Development Program of China (No. 2018YFB1305003)
∗E-mail: [email protected]

0 Introduction

As a kind of high-tech weapon, missiles have great lethality and strong deterrence, and play an extremely important role in air combat. A key aspect of intelligent missile technology is situation assessment (SA). As defined by Endsley[1], SA is the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future. In the Joint Directors of Laboratories (JDL) data fusion model[2], SA and threat estimation are the second and third steps of the information fusion model and provide the necessary support for subsequent mission decision-making.

In previous research, SA models are divided into parameter methods and non-parameter methods. Parameter methods mainly include superiority function models[3], the analytic hierarchy process[4] and fuzzy evaluation[5]. A typical non-parameter method is the Bayesian network (BN)[6]. However, these models cannot give a realistic definition of situation, and generally suffer from over-reliance on subjectivity and poor dynamic adaptability.

Artificial intelligence technology, especially neural networks and reinforcement learning (RL), has achieved many results in gaming environments, such as e-sports games[7]. Neural networks have a powerful ability for nonlinear representation and have been applied to SA[8]. Howev