Machine learning from a continuous viewpoint, I


https://doi.org/10.1007/s11425-020-1773-8

Weinan E1,2,3,∗, Chao Ma2 & Lei Wu2

1Department of Mathematics, Princeton University, Princeton, NJ 08544, USA;
2Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA;
3Beijing Institute of Big Data Research, Beijing 100871, China

Email: [email protected], [email protected], [email protected]

Received May 27, 2020; accepted August 28, 2020

Abstract   We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, in the spirit of classical numerical analysis. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations. We also present examples of new models, such as the flow-based random feature model, and new algorithms, such as the smoothed particle method and spectral method, that arise naturally from this continuous formulation. We discuss how the issues of generalization error and implicit regularization can be studied under this framework.

Keywords   machine learning, continuous formulation, flow-based model, gradient flow, particle approximation

MSC(2010)   41A99, 49M99

Citation: E W, Ma C, Wu L. Machine learning from a continuous viewpoint, I. Sci China Math, 2020, 63, https://doi.org/10.1007/s11425-020-1773-8

1  Introduction

We present a continuous formulation of machine learning. As usual, this continuous formulation consists of three components: a representation of functions, a loss functional and a training dynamics. For the representation of functions, we will discuss the integral transform-based models and the more advanced flow-based models. For the loss functional, we give examples that arise in supervised and unsupervised learning, as well as examples from the calculus of variations and partial differential equations (PDEs). For the training dynamics, we divide the unknown parameters into two classes: conserved and non-conserved. For non-conserved parameters, we use what is known in the physics literature as the model A dynamics [39], namely gradient flow in the usual L2 metric. For conserved parameters, we use what is known as the model B dynamics [39], namely gradient flow in the Wasserstein metric [41].

In this framework, machine learning becomes a calculus of variations or PDE-like problem, and different numerical algorithms can be used to discretize these continuous models. In particular, the two-layer neural network [6, 18] and deep residual neural network (ResNet) [24, 36] models can be recovered, in a scaled form, when the particle method is applied to particular versions of the integral transform-based and flow-based models, respectively.
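To fix ideas, here is a minimal sketch of the first of these correspondences (the notation is illustrative; the precise setting is developed below). An integral transform-based model represents a function as an average over features, weighted by a probability measure $\rho$:
$$
f(x;\rho) = \int a\,\sigma(w^{\top}x)\,\rho(\mathrm{d}a,\mathrm{d}w),
$$
where $\sigma$ is an activation function. Replacing $\rho$ by the empirical measure of $m$ particles $(a_j, w_j)$ gives
$$
f_m(x) = \frac{1}{m}\sum_{j=1}^{m} a_j\,\sigma(w_j^{\top}x),
$$
which is the two-layer neural network in scaled form, with the $1/m$ normalization in place of the conventional one. Gradient descent on the particles $(a_j, w_j)$ then serves as a particle discretization of the model B dynamics, the Wasserstein gradient flow $\partial_t \rho = \nabla\cdot\bigl(\rho\,\nabla\,\tfrac{\delta R}{\delta \rho}\bigr)$ for a risk functional $R(\rho)$.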
