TL-NER: A Transfer Learning Model for Chinese Named Entity Recognition

DunLu Peng¹ · YinRui Wang¹ · Cong Liu¹ · Zhang Chen¹

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract Most current research on Chinese Named Entity Recognition (NER) assumes that adequate annotated data are available. In many scenarios, however, a sufficient amount of annotated data for the Chinese NER task is difficult to obtain, which results in poor performance of machine learning methods. To address this, this paper mines the information contained in massive unlabeled raw text and uses it to improve the performance of the Chinese NER task. We propose a deep learning model combined with transfer learning. The method can be applied in domains where a large amount of unlabeled text and only a small amount of annotated data are available. The experimental results show that the proposed method performs well on datasets of different sizes and also avoids the errors that occur during word segmentation. We further evaluate the effect of transfer learning from different aspects through a series of experiments.

Keywords Transfer learning · Chinese named entity recognition · Natural language processing · Deep learning

¹ School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China

1 Introduction

Named Entity Recognition (NER) is one of the core tasks in Natural Language Processing (NLP). It aims to automatically discover information entities in data and identify their corresponding categories (Nadeau and Sekine 2007). Efficient and accurate identification of the entity information contained in text is of great significance for the computer processing of text data. In NLP research, a number of high-level tasks, such as information retrieval, knowledge graphs, sentiment analysis (Smith et al. 2018) and question answering (Agrawal et al. 2015), need NER as one of their basic components. For example, information retrieval must identify the relevant entity information in the input text to return accurate search results (Nemeskey and Kornai 2018), and a question answering system must identify entity types and the relations between them in order to answer questions well. The efficiency and accuracy of NER therefore affect all subsequent tasks, and in-depth research on NER is of great value.

Compared with other languages, we observe that the Chinese NER task is more difficult, for the following reasons: 1) there is no clear boundary between adjacent words in Chinese, so one intuitive way of performing Chinese NER is to perform word segmentation first. However, incorrectly segmented entity boundaries wi
