ORIGINAL RESEARCH
Parallel Sequential Random Embedding Bayesian Optimization

Noriko Yokoyama¹ · Masahiro Kohjima¹ · Tatsushi Matsubayashi¹ · Hiroyuki Toda¹

Received: 31 March 2020 / Accepted: 23 October 2020
© Springer Nature Singapore Pte Ltd 2020
Abstract

Bayesian optimization, which offers efficient parameter search, suffers from high computation cost when the parameters are high dimensional, because the search space expands and more trials are needed. One existing solution is an embedding method that restricts the search to a low-dimensional subspace, but this method works well only when the number of embedding dimensions closely matches the number of effective dimensions, i.e., the dimensions that actually affect the function value. In practical situations, however, the number of effective dimensions is unknown, and using a low-dimensional subspace to lower computation cost often results in a less effective search. This study proposes a Bayesian optimization method using random embedding that remains efficient even if the embedded dimension is lower than the number of effective dimensions. By searching in parallel in an initially low-dimensional space and performing multiple cycles in which the search space is incrementally improved, the optimum solution can be found efficiently. Experiments on benchmark problems confirm the effectiveness of the proposed method.

Keywords Expensive black-box optimization · Bayesian optimization · Random embedding

This article is part of the topical collection "Machine Learning in Pattern Analysis" guest edited by Reinhard Klette, Brendan McCane, Gabriella Sanniti di Baja, Palaiahnakote Shivakumara and Liang Wang.

* Noriko Yokoyama
  [email protected]

¹ NTT Service Evolution Laboratories, NTT Corporation, 1-1 Hikarinooka, Yokosuka, Kanagawa 239-0847, Japan
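The random-embedding idea the abstract builds on can be made concrete with a small sketch: candidates are drawn in a low-dimensional space and mapped into the original high-dimensional space through a fixed random matrix, so only a d-dimensional search is needed. The Python sketch below illustrates just this embedding step, with plain random search standing in for BO; the names D, d, objective, and the box bounds are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of search via random embedding: optimize over a
# low-dimensional y and evaluate the black box at x = A @ y.
# D, d, the objective, and the bounds are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

D, d = 100, 2                   # original / embedded dimensionality (assumed)
A = rng.normal(size=(D, d))     # fixed random embedding matrix

def objective(x):
    # Toy black box with two effective dimensions: only x[0] and x[1]
    # influence the value; the remaining D - 2 coordinates are irrelevant.
    return -((x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2)

best_y, best_val = None, -np.inf
for _ in range(50):                       # random search in the subspace
    y = rng.uniform(-1.0, 1.0, size=d)    # low-dimensional candidate
    x = np.clip(A @ y, -2.0, 2.0)         # embed, then clip to the box
    val = objective(x)
    if val > best_val:
        best_y, best_val = y, val

print(f"best embedded point {best_y}, value {best_val:.4f}")
```

When the embedded dimension d is at least the number of effective dimensions, such an embedding can reach points close to the optimum with high probability; the difficulty this paper addresses is the case where d is chosen smaller than that, which is the norm in practice since the effective dimensionality is unknown.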
Introduction

Black-box optimization schemes have been proposed to tackle various machine learning and artificial intelligence problems. This is because the input-output relation of the objective function, e.g., between hyperparameters and the classification performance of deep neural nets [1], or between the internal parameters of a multi-agent simulator and data assimilation accuracy [2], is hidden or too complicated to clarify; the common solution is to treat the function as a black-box function. Methods that support black-box objective functions are therefore widely used, and Bayesian optimization (hereinafter referred to as BO) is the dominant black-box optimization method [1, 3, 4].

BO is a method of finding, given observed data, the optimum value (maximum value or minimum value) of a
black-box objective function based on the posterior distribution of the function. The function is usually modeled by Gaussian processes (GP) [5], and the posterior distribution is constructed using matrix inversion. It is known that BO can find the optimum value more efficiently than random search, since it determines the next search point based on the posterior. However, standard BO suffers from low estimation accuracy when the parameters have high dimensionality.
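To make the preceding description concrete, here is a minimal one-step sketch of GP-based BO: the posterior mean and variance at candidate points are computed via the usual matrix-inversion formulas (implemented with a Cholesky factorization for numerical stability), and the next search point maximizes an acquisition function. The RBF kernel, the upper-confidence-bound acquisition, the noise level, and the candidate grid are all illustrative assumptions, not the paper's specific setup.

```python
# Minimal sketch of one BO step with a GP surrogate. Kernel choice,
# acquisition, noise level, and candidate grid are assumptions.
import numpy as np

def rbf(X1, X2, ell=0.5):
    # Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 ell^2)).
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Posterior mean/variance at test points Xs via the matrix-inversion
    # formulas, solved through a Cholesky factor L of K for stability.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    Ks = rbf(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf(Xs, Xs).diagonal() - (v**2).sum(0)
    return mu, np.maximum(var, 0.0)

# Observed data (toy 1-D example) and a grid of candidate points.
X = np.array([[0.1], [0.4], [0.9]])
y = np.sin(3.0 * X[:, 0])
Xs = np.linspace(0.0, 1.0, 200)[:, None]

mu, var = gp_posterior(X, y, Xs)
ucb = mu + 2.0 * np.sqrt(var)      # acquisition: posterior mean + 2 std
x_next = Xs[np.argmax(ucb)]        # next search point (maximization)
print("next evaluation point:", x_next)
```

Solving with a Cholesky factor rather than forming the matrix inverse explicitly is the standard numerically stable way to apply these formulas; the cubic cost of this step, together with the larger number of trials needed, is what makes high-dimensional BO expensive.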