Active neural learners for text with dual supervision
ORIGINAL ARTICLE
Chandramouli Shama Sastry¹ · Evangelos E. Milios¹
Received: 29 July 2019 / Accepted: 10 December 2019
Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
Dual supervision for text classification and information retrieval, which involves training the machine with class labels augmented with text annotations that are indicative of the class, has been shown to provide significant improvements, both in and beyond active learning (AL) settings. Annotations in the simplest form are highlighted portions of the text that are indicative of the class. In this work, we aim to identify and realize the full potential of unsupervised pretrained word embeddings for text-related tasks in AL settings by training neural nets (specifically, convolutional and recurrent neural nets) through dual supervision. We propose an architecture-independent algorithm for training neural networks with human rationales for class assignments and show how unsupervised embeddings can be better leveraged in active learning settings using this algorithm. The proposed solution uses gradient-based feature attributions to constrain the machine to follow the user annotations; further, we discuss methods for overcoming the architecture-specific challenges in the optimization. Our results on the sentiment classification task show that one annotated and labeled document can be worth up to seven labeled documents, giving accuracies of up to 70% for as few as ten labeled and annotated documents, and show promise in significantly reducing user effort for the total-recall information retrieval task in systematic literature reviews.

Keywords Active learning · Annotations · Gradient-based attributions · Convolutional neural networks · Recurrent neural networks · Dual supervision
1 Introduction
Active learning (AL) is an iterative learning process in which the machine chooses data points to present to the human, eliciting the labels that are most helpful in inducing the best classifier. Learning usually starts with a completely unlabeled dataset, and the machine accumulates labeled data through a sequence of carefully chosen queries. The goal of the learning algorithm is to induce the best classifier for a given budget, which is usually defined in terms of user effort. The unlabeled pool is usually much larger than the labeled pool, so algorithms that can make use of large amounts of unlabeled data benefit the most in the AL setting.
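The query loop described above can be illustrated with a minimal sketch of pool-based active learning. Everything in it is an assumption made for illustration and not the method of this paper: the classifier is a small logistic regression trained with gradient descent, the query strategy is uncertainty sampling (pick the pool point whose predicted probability is closest to 0.5), the budget is counted as the number of label queries, and the names `train_logreg`, `predict_proba`, `active_learning_loop` and `oracle` are hypothetical.

```python
import numpy as np


def train_logreg(X, y, steps=200, lr=0.5):
    """Fit a logistic-regression classifier with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y  # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict_proba(model, X):
    """Return P(class = 1) for every row of X."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def active_learning_loop(X_pool, oracle, seed_idx, budget):
    """Pool-based AL: repeatedly query the label of the most uncertain point.

    `oracle(i)` stands in for the human annotator (hypothetical helper).
    Uncertainty sampling is an assumed query strategy, chosen for brevity.
    """
    labeled = list(seed_idx)
    labels = [oracle(i) for i in labeled]
    for _ in range(budget):
        model = train_logreg(X_pool[labeled], np.array(labels, dtype=float))
        probs = predict_proba(model, X_pool)
        score = -np.abs(probs - 0.5)   # highest when the model is least sure
        score[labeled] = -np.inf       # never re-query a labeled point
        query = int(np.argmax(score))
        labeled.append(query)
        labels.append(oracle(query))
    model = train_logreg(X_pool[labeled], np.array(labels, dtype=float))
    return model, labeled


# Toy usage: two Gaussian blobs, with the true labels playing the annotator.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
model, labeled = active_learning_loop(
    X, lambda i: y[i], seed_idx=[0, 100], budget=10
)
probs = predict_proba(model, X)
```

The sketch retrains from scratch after every query, which keeps the loop easy to follow but would be too slow for neural models; it also ignores the unlabeled pool during training, which is the gap that pretrained embeddings are meant to fill.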
¹ Faculty of Computer Science, Dalhousie University, Halifax, Canada
Distributed word representations induced from unlabeled text data, which capture several syntactic and semantic relationships between words, have advanced the state of the art in several natural language processing tasks such as text classification, question answering, and named-entity recognition. However, convolutional