Batch Bayesian optimization via adaptive local search
Jingfei Liu1 · Chao Jiang1 · Jing Zheng1

Corresponding author: Chao Jiang, [email protected]
Jingfei Liu, [email protected] · Jing Zheng, [email protected]

1 School of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China

© Springer Science+Business Media, LLC, part of Springer Nature 2020

This work is financially supported by the National Key R&D Program of China (2018YFB1701400).
Abstract

Bayesian optimization (BO) provides an efficient tool for solving black-box global optimization problems. In situations where multiple points can be evaluated simultaneously, batch Bayesian optimization has become a popular extension because it makes full use of the available computational and experimental resources. In this paper, an adaptive local search strategy is investigated for selecting batch points in Bayesian optimization. First, a multi-start strategy and a gradient-based optimization method are combined to maximize the acquisition function. Second, an automatic clustering approach (e.g., X-means) is applied to adaptively identify the acquisition function's local maxima from the gradient-based optimization results. Third, a Bayesian stopping criterion is utilized to guarantee that, in theory, all the local maxima can be obtained. Finally, the lower confidence bound criterion and a front-end truncation operation are employed to select the most promising local maxima as batch points. Extensive evaluations on various synthetic functions and two hyperparameter tuning problems for deep learning models are used to verify the proposed method.

Keywords Batch Bayesian optimization · Adaptive local search · Parallel search · Hyperparameter tuning
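As a rough illustration of this batch-selection pipeline, the following is a minimal Python sketch under stated assumptions: the acquisition function and its box bounds are given; scikit-learn's KMeans with a user-chosen number of clusters stands in for X-means (which chooses the cluster count automatically and is not available in scikit-learn); and the Bayesian stopping criterion and lower-confidence-bound screening are omitted. All names are illustrative, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

def select_batch(acquisition, bounds, n_starts=50, k_clusters=4, seed=0):
    """Pick one candidate per (approximate) local maximum of the acquisition.

    acquisition: callable mapping an array of shape (dim,) to a scalar.
    bounds: array of shape (dim, 2) holding [lower, upper] per dimension.
    """
    rng = np.random.default_rng(seed)
    dim = bounds.shape[0]
    # Multi-start gradient-based maximization: run L-BFGS-B on the negated
    # acquisition from uniformly sampled starting points.
    starts = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_starts, dim))
    xs, vals = [], []
    for x0 in starts:
        res = minimize(lambda x: -acquisition(x), x0,
                       method="L-BFGS-B", bounds=bounds)
        xs.append(res.x)
        vals.append(-res.fun)
    xs, vals = np.array(xs), np.array(vals)
    # Cluster the converged points; each cluster is taken as one local maximum
    # (the paper uses X-means so that k need not be fixed in advance).
    labels = KMeans(n_clusters=k_clusters, n_init=10).fit_predict(xs)
    batch = []
    for c in range(k_clusters):
        idx = np.flatnonzero(labels == c)
        if idx.size:
            batch.append(xs[idx[np.argmax(vals[idx])]])  # best point per cluster
    return np.array(batch)
```

In the full method, the local maxima returned by such a routine would additionally be screened by the lower confidence bound criterion and the front-end truncation operation before being dispatched for parallel evaluation.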
1 Introduction

BO has become a popular approach for black-box and nonconvex global optimization problems in many scientific and engineering areas, including automatic machine learning [8, 26, 64, 66], reinforcement learning [11], robotics [43, 48, 71], and information extraction [73]. It also holds promise for many other applications, e.g., the distributed optimization of cooperative control [55, 70]. The efficiency of BO stems from its application of Bayes' theorem: BO achieves an efficient search process through an elegant Bayesian updating mechanism. In BO, a prior model that encodes our beliefs about the unknown
function is first established based on the available prior information. An observation model (i.e., the acquisition function) is then maximized to determine the next evaluation point [26, 36, 50, 64, 67]. After that, the data are augmented with the new observation and the prior model is updated into a more credible posterior model. The optimization proceeds by executing these steps cyclically. Traditional BO is a sequential optimization method, meaning that only one point is evaluated at each iteration [11, 29]. This sequential selection strategy becomes less efficient when parallel computational and experimental resources are available, e.g., when tuning the hyperparameters of deep learning models in parallel.
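To make this loop concrete, here is a minimal, self-contained sketch assuming a Gaussian process prior and the expected improvement (EI) acquisition for minimization. The random-candidate maximization of EI is a deliberate simplification (the proposed method instead maximizes the acquisition with multi-start gradient-based search), and all function and parameter names are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X, gp, y_best):
    # EI for minimization: expected amount by which f(x) improves on y_best.
    mu, sigma = gp.predict(X, return_std=True)
    sigma = np.maximum(sigma, 1e-12)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(f, bounds, n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    dim = bounds.shape[0]
    # Prior information: a few uniformly sampled evaluations.
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, dim))
    y = np.array([f(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)  # Bayesian update: prior model -> posterior model
        # Crude acquisition maximization over random candidate points.
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2048, dim))
        x_next = cand[np.argmax(expected_improvement(cand, gp, y.min()))]
        X = np.vstack([X, x_next])   # augment the data with the new point
        y = np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```

For example, bayes_opt(lambda x: float(np.sum(x**2)), np.array([[-5.0, 5.0]] * 3)) searches for the minimum of a 3-D sphere function. A batch variant would replace the single argmax with a routine like select_batch above and evaluate the returned points in parallel.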