Accelerated Algorithms for Unconstrained Convex Optimization
This chapter reviews representative accelerated first-order algorithms for deterministic unconstrained convex optimization. We start by introducing accelerated methods for smooth problems with Lipschitz continuous gradients, then concentrate on
Accelerated Optimization for Machine Learning: First-Order Algorithms
Zhouchen Lin • Huan Li • Cong Fang
Zhouchen Lin, Key Lab. of Machine Perception, School of EECS, Peking University, Beijing, China
Huan Li, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, China
Cong Fang, School of Engineering and Applied Science, Princeton University, Princeton, NJ, USA
ISBN 978-981-15-2909-2
ISBN 978-981-15-2910-8 (eBook)
https://doi.org/10.1007/978-981-15-2910-8
© Springer Nature Singapore Pte Ltd. 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
To our families. Without your great support this book would not exist and even our careers would be meaningless.
Foreword by Michael I. Jordan
Optimization algorithms have been the engine that has powered the recent rise of machine learning. The needs of machine learning are different from those of other disciplines that have made use of the optimization toolbox; most notably, the parameter spaces are of high dimensionality, and the functions that are being optimized are often sums of millions of terms. In such settings, gradient-based methods are preferred over higher order methods, and given that the computation of a full gradient can be infeasible, stochastic gradient methods are the coin of the realm. Putting such specifications together with the need to solve nonconvex optimization problems, to control the variance induced by the stochastic sampling, and to develop algorithms that run on distributed platforms, one poses a new set of cha