Comparing Incremental Learning Strategies for Convolutional Neural Networks
Abstract. In the last decade, Convolutional Neural Networks (CNNs) have been shown to perform remarkably well in many computer vision tasks such as object recognition and object detection, being able to extract meaningful high-level invariant features. However, partly because of their complex training and tricky hyper-parameter tuning, CNNs have been scarcely studied in the context of incremental learning, where data are available in consecutive batches and retraining the model from scratch is unfeasible. In this work we compare different incremental learning strategies for CNN-based architectures, targeting real-world applications.

Keywords: Deep learning · Incremental learning · Convolutional neural networks
1 Introduction

In recent years, deep learning has established itself as a real game changer in many domains such as speech recognition [1], natural language processing [2] and computer vision [3]. The real power of deep convolutional neural networks resides in their ability to directly process high-dimensional raw data and produce meaningful high-level representations regardless of the specific task at hand [4]. So far, computer vision has been one of the fields that has benefited the most, and tasks like object recognition and object detection have seen significant advances in terms of accuracy (see ImageNet LSVRC [5] and PascalVOC [6]).

However, most of the recent successes are based on benchmarks providing an enormous quantity of data with a fixed training and test set. This is not always the case for real-world applications, where training data are often only partially available at the beginning and new data keep coming while the system is already deployed and working, as in recommendation systems and anomaly detection, where learning sudden behavioral changes becomes quite critical.

One possible approach to deal with this incremental scenario is to store all the previously seen data and retrain the model from scratch as soon as a new batch of data is available (in the following we refer to this as the cumulative approach). However, this solution is often impractical for many real-world systems where memory and computational resources are subject to stiff constraints. As a matter of fact, training convolutional neural networks is expensive, and training them on large datasets can take days or weeks, even on modern GPUs [3]. A different approach to address this issue is to update the model based only on the newly available batch of data.
This is computationally cheaper, and storing past data is not necessary. Of course, nothing comes for free, and this approach may lead to a substantial loss of accuracy with respect to the cumulative strategy. Moreover, the stability-plasticity dilemma may arise, and dangerous shifts of the model are always possible because of catastrophic forgetting [7-10]. Forgetting previously learned patterns ca
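To make the two training protocols concrete, the following is a minimal sketch in Python, assuming a PyTorch-style setup; the helper names make_cnn, train_epochs and the batches iterable are hypothetical placeholders introduced here for illustration and are not code from the paper.

import torch
from torch.utils.data import ConcatDataset, DataLoader

def train_epochs(model, loader, epochs=5, lr=1e-3):
    # Plain supervised training loop used by both strategies.
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

def cumulative(batches, make_cnn):
    # Cumulative strategy: keep every past batch and retrain a freshly
    # initialized network from scratch whenever a new batch arrives.
    # Accurate, but memory- and compute-hungry.
    seen = []
    for b in batches:                 # each b is a torch Dataset
        seen.append(b)
        model = make_cnn()            # re-initialize the network
        loader = DataLoader(ConcatDataset(seen), batch_size=128, shuffle=True)
        yield train_epochs(model, loader)

def incremental(batches, make_cnn):
    # Naive incremental strategy: keep a single model and fine-tune it on
    # the new batch only. Cheap and storage-free, but exposed to
    # catastrophic forgetting of previously learned patterns.
    model = make_cnn()
    for b in batches:
        loader = DataLoader(b, batch_size=128, shuffle=True)
        yield train_epochs(model, loader)

The sketch only contrasts how data are fed to the optimizer; it deliberately leaves out any mechanism (regularization, rehearsal, etc.) that the compared strategies may add on top of this basic scheme.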