Using error decay prediction to overcome practical issues of deep active learning for named entity recognition

Haw-Shiuan Chang¹,² · Shankar Vembu² · Sunil Mohan² · Rheeya Uppaal¹ · Andrew McCallum¹

Received: 15 December 2019 / Revised: 16 June 2020 / Accepted: 11 July 2020
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020

Abstract

Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers, and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well.

Keywords: Active learning · Transparency · Robustness to labeling noise · Black-box models · Clustering · Named entity recognition
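To make the core idea of the abstract concrete, the sketch below fits an error decay curve to (training-set size, error) observations for each feature-defined subset and picks the subset whose curve predicts the largest error reduction from additional labels. The power-law form, the subset names, and the helper functions are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law_decay(n, a, b, c):
    # Assumed parametric form: error decays polynomially with the number
    # of labeled samples n and levels off at an asymptote c.
    return a * np.power(n, -b) + c

def predicted_error_reduction(sample_sizes, errors, extra_labels):
    """Fit a decay curve to one subset's (size, error) history and predict
    the error drop from labeling `extra_labels` more samples in it.
    Illustrative helper, not the paper's API."""
    params, _ = curve_fit(power_law_decay, sample_sizes, errors,
                          p0=(1.0, 0.5, 0.0), maxfev=10000)
    current_n = sample_sizes[-1]
    return (power_law_decay(current_n, *params)
            - power_law_decay(current_n + extra_labels, *params))

# Toy usage: sample from the subset with the largest predicted error decay.
subsets = {
    "capitalized_tokens": ([100, 200, 400, 800], [0.30, 0.24, 0.20, 0.18]),
    "numeric_tokens":     ([100, 200, 400, 800], [0.15, 0.14, 0.135, 0.133]),
}
gains = {name: predicted_error_reduction(np.asarray(ns, dtype=float),
                                         np.asarray(errs), extra_labels=200)
         for name, (ns, errs) in subsets.items()}
print(max(gains, key=gains.get))
```

Because the ranking only needs error estimates per subset, not the model's internal probabilities, this style of sampling can in principle be applied to black-box taggers.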

Editors: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.

* Andrew McCallum, [email protected]

Haw-Shiuan Chang, [email protected]
Shankar Vembu, [email protected]
Sunil Mohan, [email protected]
Rheeya Uppaal, [email protected]

¹ University of Massachusetts Amherst, College of Information and Computer Science, Amherst, MA, USA
² Chan Zuckerberg Initiative (CZI), Redwood City, CA, USA




1 Introduction

Deep neural networks achieve state-of-the-art results on many tasks, especially when a large amount of training data is available. Their success highlights the importance of reducing the cost of collecting labels on a large scale. Active learning can be used to select the data samples that will most benefit a predictor's training, thereby reducing the amount of labeled data needed without hurting the predictor's accuracy. The effectiveness of uncertainty- and disagreement-based active learning methods has been demonstrated on several datasets for shallow predictors (Settles and Craven 2008; Settles 2009), and more recently also for deep learning predictors (Gal et al. 2017; Shen et al. 2018; Siddhant and Lipton 2018). Nevertheless, random sampling is still the most popular method for building new datasets in several domains, including natural language processing (Tomanek and Olsson 2009). This is due to the practical issues of deploying uncertainty-based active sampling (Settles 2011; Lowell et al. 2019), including its limited applicability, robustness, and transparency.
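For context, uncertainty sampling ranks unlabeled examples by the model's predictive uncertainty and queries labels for the most uncertain ones. Below is a minimal least-confidence sketch; the `predict_proba` interface and the batch size are illustrative assumptions, not part of the paper.

```python
import numpy as np

def least_confidence_batch(predict_proba, unlabeled_pool, batch_size=32):
    """Minimal uncertainty-sampling sketch (least-confidence variant).

    `predict_proba` is assumed to map a list of examples to an
    (n_examples, n_classes) array of class probabilities.
    """
    probs = predict_proba(unlabeled_pool)   # (n_examples, n_classes)
    confidence = probs.max(axis=1)          # probability of the top class
    uncertainty = 1.0 - confidence          # least-confidence score
    # Query labels for the examples the model is least confident about.
    query_idx = np.argsort(-uncertainty)[:batch_size]
    return [unlabeled_pool[i] for i in query_idx]
```

Note that this strategy requires access to the model's class probabilities, which is precisely why it cannot be applied to black-box predictors, one of the practical issues motivating this work.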