Deep curriculum learning optimization
ORIGINAL RESEARCH
Henok Ghebrechristos¹ · Gita Alaghband¹
Received: 31 March 2020 / Accepted: 11 July 2020
© Springer Nature Singapore Pte Ltd 2020
Abstract
We describe a quantitative and practical framework for integrating curriculum learning (CL) into the deep learning training pipeline to improve feature learning in deep feed-forward networks. The framework has several distinguishing characteristics: (1) dynamicity: it proposes a set of batch-level training strategies (syllabi or curricula) that are sensitive to data complexity; (2) adaptivity: it dynamically estimates the effectiveness of a given strategy and objectively compares it with alternative strategies, making the method suitable for both practical and research purposes; and (3) a replace–retrain mechanism, which it employs when a strategy is unfit for the task at hand. In addition to these traits, the framework can combine CL with several variants of gradient descent (GD) and has been used to generate efficient batch-specific or dataset-specific strategies. Comparative studies of current state-of-the-art vision models, such as FixEfficientNet and BiT-L (ResNet), on several benchmark datasets including CIFAR10 demonstrate the effectiveness of the proposed method. We present results that show a reduction in training loss by as much as a factor of 5. Additionally, we present a set of practical curriculum strategies to improve the generalization performance of select networks on various datasets.
Keywords
Curriculum learning optimization · Convolutional neural network · Deep learning · Information theory · Syllabus · Curriculum strategy
This article is part of the topical collection “Machine Learning in Pattern Analysis” guest edited by Reinhard Klette, Brendan McCane, Gabriella Sanniti di Baja, Palaiahnakote Shivakumara and Liang Wang.
¹ Department of Computer Science, University of Colorado, Denver, CO 80014, USA
Introduction
Curriculum learning, first formalized in the context of machine learning by Bengio et al., has in recent years gained traction as a potential technique for further improving deep learning [1–5]. The general question CL attempts to answer is how to order the samples supplied to a model so that it trains effectively for a given task. Most curriculum learning techniques take their inspiration from human learning, where training is highly organized around an education system and a curriculum that usually enables learning concepts in gradually increasing levels of difficulty while considering previously learned concepts [1]. In machine learning, CL attempts to find an optimal sequence of training inputs (or training tasks, for transfer learning) to present to the learning system so as to optimize the learning process compared to no-curriculum training. In this text, we extend curriculum learning based on ranking or weighting (as defined by Bengio et al.) of individual sampl
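
The general idea of ranking samples by difficulty and presenting them in order can be illustrated with a minimal sketch. It is not the authors' framework: the difficulty score used here (Shannon entropy of a sample's value distribution) is an assumption chosen only because the keywords mention information theory, and the names `shannon_entropy` and `curriculum_order` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def shannon_entropy(sample: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the value distribution in one sample.

    Used here as an assumed stand-in for a per-sample difficulty score.
    """
    counts = Counter(sample)
    total = len(sample)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def curriculum_order(
    batch: List[Sequence[int]], easy_first: bool = True
) -> List[Sequence[int]]:
    """Return the batch sorted by the assumed difficulty score.

    easy_first=True gives an easy-to-hard syllabus; False reverses it.
    """
    return sorted(batch, key=shannon_entropy, reverse=not easy_first)


# Toy "images" flattened to lists of pixel intensities.
batch = [
    [0, 1, 2, 3],  # four distinct values, evenly spread
    [5, 5, 5, 5],  # constant sample
    [0, 0, 1, 1],  # two values, evenly split
]

# Easy-to-hard ordering of the batch under the entropy score.
ordered = curriculum_order(batch)
```

In a training loop, `ordered` would be fed to the optimizer in place of the shuffled batch; whether an easy-to-hard or hard-to-easy ordering helps is an empirical question for the task at hand.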