Active and Incremental Learning with Weak Supervision



TECHNICAL CONTRIBUTION

Clemens-Alexander Brust1 · Christoph Käding1 · Joachim Denzler1

Received: 24 September 2019 / Accepted: 2 January 2020
© The Author(s) 2020

Abstract

Large amounts of labeled training data are one of the main contributors to the great success that deep models have achieved in the past. Label acquisition for tasks other than benchmarks can pose a challenge due to requirements of both funding and expertise. By selecting unlabeled examples that are promising in terms of model improvement and only asking for respective labels, active learning can increase the efficiency of the labeling process in terms of time and cost. In this work, we describe combinations of an incremental learning scheme and methods of active learning. These allow for continuous exploration of newly observed unlabeled data. We describe selection criteria based on model uncertainty as well as expected model output change (EMOC). An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset. We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application where images from camera traps are analyzed. Labeling only 32 images by accepting or rejecting proposals generated by our method yields an increase in accuracy from 25.4 to 42.6%.

Keywords: Active learning · Wildlife surveillance · Weak supervision · Object detection · Incremental learning
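As a rough illustration of uncertainty-based selection, the following minimal sketch ranks unlabeled examples by how close the two most probable classes are and picks the most ambiguous ones for labeling. The margin score, the function names `margin_uncertainty` and `select_for_labeling`, and the toy probabilities are assumptions made for this example; they are not taken from the paper and do not cover EMOC.

```python
import numpy as np


def margin_uncertainty(probs: np.ndarray) -> np.ndarray:
    """Score each example by 1 - (top-1 prob - top-2 prob).

    probs has shape (n_examples, n_classes). A higher score means the
    model is less decided between its two most likely classes.
    This scoring rule is an illustrative assumption.
    """
    ordered = np.sort(probs, axis=1)  # ascending per row
    return 1.0 - (ordered[:, -1] - ordered[:, -2])


def select_for_labeling(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Return indices of the batch_size most uncertain examples."""
    scores = margin_uncertainty(probs)
    ranking = np.argsort(-scores, kind="stable")  # most uncertain first
    return ranking[:batch_size]


# Hypothetical class probabilities for four unlabeled examples.
probs = np.array([
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # ambiguous
    [0.55, 0.44, 0.01],  # fairly ambiguous
    [0.34, 0.33, 0.33],  # almost uniform
])

selected = select_for_labeling(probs, batch_size=2)
print(selected)
```

In an active learning loop, only the selected examples would be shown to the annotator, and the model would then be updated with the newly labeled data before the next selection round.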

This paper is an extended version of our previous work [4], from which certain parts of Sects. 2 to 5 (except novel YOLO-specific methods) were taken verbatim. Section 8 contains some verbatim parts from our previous work [33].

* Clemens-Alexander Brust, clemens-alexander.brust@uni-jena.de
Christoph Käding, christoph.kaeding@uni-jena.de
Joachim Denzler, joachim.denzler@uni-jena.de

1 Friedrich Schiller University Jena, Jena, Germany

1 Introduction

Deep convolutional networks (CNNs) show impressive performance in a variety of applications. Even in the challenging task of object detection, they serve as excellent models [18, 44, 45, 52, 53]. Traditionally, most research in the area of object detection builds on models trained once on reliable labeled data for a predefined application. However, in many application scenarios, new data becomes available over time or the distribution underlying the problem changes. When this happens, models are usually retrained from scratch or have to be refined via either fine-tuning [21, 45] or incremental learning [40, 51]. In any case, a human expert has to assign labels to identify objects and corresponding classes for every unlabeled example. When domain knowledge is necessary to assign reliable labels, this is the limiting factor in terms of effort or costs. For example, cancer experts have to manually annotate hundreds of images to provide accurately labeled data [50, 56]. Changing distributions can also pose a problem because constant relabeling is required. Self-driving cars for example should not b