Neural Networks for Computational Chemistry: Pitfalls and Recommendations



Grégoire Montavon¹ and Klaus-Robert Müller¹,²

1. Machine Learning Group, TU Berlin, Marchstraße 23, 10587 Berlin, Germany
2. Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 136-713, South Korea

ABSTRACT

There is a long history of using neural networks for function approximation in computational physics and chemistry. Despite their conceptual simplicity, practitioners may face difficulties when putting them to work. This small guide pinpoints some neural network pitfalls, along with corresponding solutions, for successfully realizing function approximation tasks in physics, chemistry, or other fields.

INTRODUCTION

Neural networks are powerful function approximators that hold great promise for speeding up quantum-chemical computations. The acceleration of atomistic simulations would enable the rapid screening of the chemical compound space, with numerous applications to drug discovery, material design, and beyond. Neural networks and other machine learning techniques have been applied successfully to various tasks such as modeling the potential energy surface of single systems [1-3], approximating density functionals [4], and predicting atomization energies across the chemical compound space [5,6].

In practice, however, one may face various difficulties when trying to apply neural networks out-of-the-box to a given problem. In some cases, neural networks yield disappointing performance; in other cases, the training algorithm does not work at all. This small guide is an attempt to clarify some tricky problems that are likely to occur in computational chemistry applications and that are not always well documented in neural network textbooks.

THEORY

Let f : ℝᵖ → ℝ be an unknown function that relates several real quantities of a physical system. For example, the input could be the vector of atomic charges (Z₁, Z₂, …) and Cartesian coordinates (R⃗₁, R⃗₂, …) of each atom in a molecule, and the output its associated atomization energy. The machine learning setup is as follows: assuming the existence of such an unknown function f that encompasses the calculations of the Schrödinger equation, we observe several realizations of it (by empirical testing or numerical simulation) and collect a dataset of input-output pairs {(x₁, y₁), …, (xₙ, yₙ)} (e.g. n molecules and their atomization energies). The goal is to learn from this dataset a function that is as similar to f as possible. The resulting function is generally much faster to evaluate than the original function f and is expected to generalize to data points outside the dataset (i.e. out-of-sample molecules).
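To make this setup concrete, the sketch below fits a small multilayer network to synthetic input-output pairs and checks its generalization on held-out data. The synthetic data, the choice of scikit-learn's MLPRegressor, and all variable names are illustrative assumptions, not the method of this paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic "molecules": atomic charges Z_i and Cartesian coordinates R_i,
# flattened into one input vector x per molecule (a deliberately naive choice).
n, atoms = 500, 5
Z = rng.integers(1, 9, size=(n, atoms)).astype(float)
R = rng.normal(size=(n, atoms, 3))
X = np.hstack([Z, R.reshape(n, -1)])

# Toy stand-in for the unknown function f (e.g. an atomization energy).
y = (Z * np.linalg.norm(R, axis=2)).sum(axis=1)

# Hold out some molecules to measure generalization out-of-sample.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
net.fit(X_train, y_train)
print("out-of-sample R^2:", net.score(X_test, y_test))
```

Swapping the synthetic arrays for real descriptors and reference energies leaves this workflow unchanged; only the data-generating step differs.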

We first introduce the multilayer neural network and how to train it. Then, we demonstrate the importance of centering data and hidden representations in the neural network, possibly by expanding them into a more suitable higher-dimensional representation. We finally discuss how to scale the different quantities involved in th
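As a preview of the centering and scaling discussion that follows, here is a minimal sketch of standardizing inputs and targets with training-set statistics; the arrays and names are hypothetical placeholders, not the paper's prescription.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(400, 8))  # raw input features
y_train = rng.normal(loc=-70.0, scale=10.0, size=400)    # raw targets, e.g. energies

# Center and scale the inputs using training-set statistics only;
# the same mu and sigma must later be applied to out-of-sample inputs.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_std = (X_train - mu) / sigma

# Center and scale the targets as well; predictions of the trained network
# are then mapped back via y_pred = net(x) * y_sigma + y_mu.
y_mu, y_sigma = y_train.mean(), y_train.std()
y_train_std = (y_train - y_mu) / y_sigma
```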