Learning Without Forgetting

Abstract. When building a unified vision system or gradually adding new capabilities to a system, the usual assumption is that training data for all tasks is always available. However, as the number of tasks grows, storing and retraining on such data becomes infeasible. A new problem arises where we add new capabilities to a Convolutional Neural Network (CNN), but the training data for its existing capabilities are unavailable. We propose our Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities. Our method performs favorably compared to commonly used feature extraction and fine-tuning adaptation techniques and performs similarly to multitask learning that uses original task data we assume unavailable. A more surprising observation is that Learning without Forgetting may be able to replace fine-tuning as standard practice for improved new task performance.

Keywords: Convolutional neural networks · Transfer learning · Multi-task learning · Deep learning · Visual recognition

1 Introduction

Many practical vision applications require learning new visual capabilities while maintaining performance on existing ones. For example, a robot may be delivered to someone's house with a set of default object recognition capabilities, but new site-specific object models need to be added. Or for construction safety, a system can identify whether a worker is wearing a safety vest or hard hat, but a superintendent may wish to add the ability to detect improper footwear. Ideally, the new tasks could be learned while sharing parameters from old ones, without degrading performance on old tasks or having access to the old training data. Legacy data may be unrecorded, proprietary, or simply too cumbersome to use in training a new task. Though similar in spirit to transfer, multitask, and lifelong learning, we are not aware of any work that provides a solution to the problem of continually adding new prediction tasks based on adapting shared parameters without access to training data for previously learned tasks.

In this paper, we demonstrate a simple but effective solution on a variety of image classification problems with Convolutional Neural Network (CNN) classifiers. In our setting, a CNN has a set of shared parameters θs (e.g., five convolutional layers and two fully connected layers for the AlexNet [11] architecture), task-specific parameters for previously learned tasks θo, and randomly initialized task-specific parameters for the new task θn.
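As a rough illustration of this parameter split, the following PyTorch-style sketch (our own, not code from the paper) keeps a shared backbone for θs and one output head per task for θo and θn; the module name, layer sizes, and method names are assumptions chosen for brevity.

```python
import torch.nn as nn

class MultiHeadCNN(nn.Module):
    """Shared parameters theta_s plus one task-specific output head per task."""

    def __init__(self, backbone: nn.Module, feat_dim: int, old_task_sizes):
        super().__init__()
        self.shared = backbone                        # theta_s: shared conv/fc layers
        self.old_heads = nn.ModuleList(               # theta_o: heads for previously learned tasks
            nn.Linear(feat_dim, n) for n in old_task_sizes
        )
        self.new_heads = nn.ModuleList()              # theta_n: filled in as new tasks are added
        self.feat_dim = feat_dim

    def add_task(self, num_classes: int) -> None:
        # Randomly initialized task-specific parameters for a new task.
        self.new_heads.append(nn.Linear(self.feat_dim, num_classes))

    def forward(self, x):
        feats = self.shared(x)
        return ([head(feats) for head in self.old_heads],
                [head(feats) for head in self.new_heads])
```

Feature extraction, fine-tuning, joint training, and the method proposed here all fit this structure and differ mainly in which parameter groups are updated and what data is required, as Fig. 1 below summarizes.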


                              Fine      Duplicating and  Feature     Joint     Learning without
                              Tuning    Fine Tuning      Extraction  Training  Forgetting
new task performance          good      good             X medium    best      best
original task performance     X bad     good             good        good      good
training efficiency           fast      fast             fast        X slow    fast
testing efficiency            fast      X slow           fast        fast      fast
storage requirement           medium    X large          medium      X large   medium
requires previous task data   no        no               no          X yes     no

Fig. 1. We wish to add new prediction tasks to an existing CNN vision system without access to the training data for its existing tasks. The table compares the relative merits of Learning without Forgetting against commonly used alternatives; entries marked X indicate a drawback.
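To make the distinctions in Fig. 1 concrete, here is a hedged sketch of which parameter groups each approach would update when training on new-task data, written against the hypothetical MultiHeadCNN above; the function name and the exact freezing policy are our assumptions, and the loss that preserves old-task responses in Learning without Forgetting is only indicated by a comment.

```python
import itertools
import torch.nn as nn

def trainable_parameters(model: nn.Module, approach: str):
    """Select which parameter groups are updated during new-task training."""
    if approach == "feature_extraction":
        # Shared layers are frozen; only the new task head is trained.
        return model.new_heads.parameters()
    if approach == "fine_tuning":
        # Shared layers and the new head are trained; nothing constrains the
        # old tasks, so their performance can degrade (the "X bad" cell in Fig. 1).
        return itertools.chain(model.shared.parameters(),
                               model.new_heads.parameters())
    if approach == "joint_training":
        # Trains everything, but requires the old-task data we assume unavailable.
        return model.parameters()
    if approach == "learning_without_forgetting":
        # Trains everything on new-task data only; an additional loss term keeps
        # the old heads' outputs close to responses recorded before training.
        return model.parameters()
    raise ValueError(f"unknown approach: {approach}")
```

Duplicating and fine-tuning behaves like fine-tuning on a full copy of the network kept separately for the new task, which is why Fig. 1 marks its storage requirement and testing efficiency as drawbacks.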