A Proposal on Machine Learning via Dynamical Systems
Dedicated to Professor Chi-Wang Shu on the occasion of his 60th birthday
Weinan E¹˒²˒³

Received: 7 February 2017 / Revised: 21 February 2017 / Accepted: 24 February 2017 / Published online: 22 March 2017

© School of Mathematical Sciences, University of Science and Technology of China and Springer-Verlag Berlin Heidelberg 2017
Abstract We discuss the idea of using continuous dynamical systems to model general high-dimensional nonlinear functions used in machine learning. We also discuss the connection with deep learning.

Keywords Deep learning · Machine learning · Dynamical systems

Mathematics Subject Classification 37N99

The number one task in machine learning is to efficiently construct a sufficiently rich class of functions that can represent the data with the desired accuracy. The most straightforward approach is that of approximation theory: one starts with linear functions and builds nonlinear ones using splines, wavelets, or other basis functions [1]. The obvious obstacle with this approach is the curse of dimensionality. To deal with this issue, one often has to make a simplifying ansatz, postulating special forms for the functions, for example an additive or multiplicative form [2]. In recent years, a new class of techniques has shown remarkable success: the deep neural network model [3]. Neural networks are an old idea, but recent experience has shown that deep networks with many layers do a surprisingly good job of modeling complicated data sets. The difference between neural networks and traditional approximation theory is that neural networks use compositions of simple functions to approximate complicated ones, i.e., the neural network approach is compositional,
Weinan E
[email protected]

1 Beijing Institute of Big Data Research (BIBDR), Beijing, China

2 Department of Mathematics and PACM, Princeton University, Princeton, NJ, USA

3 Center for Data Science and BICMR, Peking University, Beijing, China
whereas classical approximation theory is usually additive. Although we still lack a theoretical framework for understanding deep neural networks, their practical success has been very encouraging.

In this note, we go one step further and explore the possibility of producing nonlinear functions using continuous dynamical systems, pushing the compositional approach to an infinitesimal limit. In the framework of supervised learning, this gives rise to a new class of control problems. In this view, deep neural networks can be thought of as discrete dynamical systems. Compared with deep neural networks, a continuous approach has several potential advantages.

1. Mathematically, continuous dynamical systems are easier to think about and work with. The continuous formulation offers more flexibility (for example, adding constraints, adapting the dynamical system to the problem, or imposing structure on the dynamical system), and it is easier to analyze continuous dynamical systems than discrete ones.
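The correspondence between deep networks and discrete dynamical systems can be made concrete with a small sketch. A residual-style update z ← z + h·f(z, θ) is the forward-Euler discretization of the ODE dz/dt = f(z, θ(t)); stacking many such steps is what a deep network does. The vector field f, the parameters, and the step size below are illustrative choices for exposition, not a construction from this paper.

```python
import numpy as np

def f(z, theta):
    """A simple parameterized vector field: tanh of a linear map."""
    W, b = theta
    return np.tanh(W @ z + b)

def discrete_net(z0, thetas, h=0.1):
    """Each 'layer' is one forward-Euler step of dz/dt = f(z, theta(t))."""
    z = z0
    for theta in thetas:
        z = z + h * f(z, theta)  # residual update = Euler step
    return z

rng = np.random.default_rng(0)
d = 4  # state (feature) dimension
# 20 layers' worth of parameters, i.e., theta(t) sampled at 20 time steps
thetas = [(0.5 * rng.standard_normal((d, d)), np.zeros(d)) for _ in range(20)]
z0 = rng.standard_normal(d)
out = discrete_net(z0, thetas)
print(out.shape)  # (4,)
```

Shrinking h and increasing the number of steps accordingly passes to the continuous limit, which is the infinitesimal compositional viewpoint taken in this note.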