Fine-grained entity type classification using GRU with self-attention

ORIGINAL RESEARCH

K. Dhrisya¹ • G. Remya¹ • Anuraj Mohan¹

Received: 16 December 2019 / Accepted: 29 June 2020
© Bharati Vidyapeeth’s Institute of Computer Applications and Management 2020

Abstract Natural language processing (NLP) applies computational techniques that allow machines to process human language. One of the primary tasks of NLP is information extraction, which aims to capture important information from text. The fast-growing web contains a large amount of textual information, which calls for techniques that extract the relevant parts. Entity recognition is a type of information extraction that attempts to find and classify named entities appearing in unstructured text documents. Traditional coarse-grained entity recognition systems define a small number of named entity categories, such as person, location, organization, and date. A fine-grained entity type classification model instead classifies the target entities into fine-grained types. Most recent work relies on a bidirectional LSTM with an attention mechanism. Because of the complex structure of the bidirectional LSTM, however, these models take a long time to train. Existing attention mechanisms are also unable to capture the correlation between a new word and the previous context. The proposed system addresses these issues by using a bidirectional GRU with a self-attention mechanism. Experimental results show that the proposed approach outperforms state-of-the-art methods.

K. Dhrisya [email protected]
G. Remya [email protected]
Anuraj Mohan [email protected]

¹ N.S.S College of Engineering, Palakkad, Kerala, India

Keywords NLP · Information extraction · Classification · Named entity · GRU · Attention
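The architecture named in the abstract, a bidirectional GRU followed by self-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names (`init_gru`, `gru_step`, `run_gru`, `bigru`, `self_attention`, `classify`), the dimensions, the scaled dot-product form of self-attention, the mean pooling, and the sigmoid multi-label output are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(input_dim, hidden_dim, rng):
    """Random parameters for one GRU direction (illustrative initialisation)."""
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = rng.normal(0.0, 0.1, size=(hidden_dim, input_dim))
        params["U" + gate] = rng.normal(0.0, 0.1, size=(hidden_dim, hidden_dim))
        params["b" + gate] = np.zeros(hidden_dim)
    return params

def gru_step(p, x_t, h_prev):
    """One GRU time step: two gates plus a candidate state."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])
    h_tilde = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_tilde

def run_gru(p, xs, reverse=False):
    """Run a GRU over a (seq_len, input_dim) array; returns (seq_len, hidden_dim)."""
    hidden_dim = p["bz"].shape[0]
    h = np.zeros(hidden_dim)
    out = np.zeros((len(xs), hidden_dim))
    steps = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    for t in steps:
        h = gru_step(p, xs[t], h)
        out[t] = h
    return out

def bigru(p_fwd, p_bwd, xs):
    """Concatenate forward and backward hidden states at each position."""
    return np.concatenate(
        [run_gru(p_fwd, xs), run_gru(p_bwd, xs, reverse=True)], axis=1
    )

def self_attention(H):
    """Scaled dot-product self-attention over the hidden states (an assumed form)."""
    scores = H @ H.T / np.sqrt(H.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ H, weights

def classify(H_att, W_out, b_out):
    """Mean-pool the attended states and score each type independently."""
    pooled = H_att.mean(axis=0)
    return sigmoid(W_out @ pooled + b_out)

# Toy run with random "embeddings"; all sizes are hypothetical.
rng = np.random.default_rng(0)
seq_len, emb_dim, hid_dim, n_types = 6, 8, 5, 4
xs = rng.normal(size=(seq_len, emb_dim))
p_fwd = init_gru(emb_dim, hid_dim, rng)
p_bwd = init_gru(emb_dim, hid_dim, rng)
H = bigru(p_fwd, p_bwd, xs)
H_att, attn = self_attention(H)
W_out = rng.normal(0.0, 0.1, size=(n_types, 2 * hid_dim))
probs = classify(H_att, W_out, np.zeros(n_types))
print(H.shape, attn.shape, probs.shape)
```

In this sketch each GRU direction holds three weight blocks (update gate, reset gate, candidate state), whereas a standard LSTM cell holds four, so for the same input and hidden sizes the GRU has three quarters of the recurrent parameters. That is consistent with the abstract's point about training cost, although the actual speed-up depends on the implementation. The self-attention step lets every position weigh every other position in the sentence, which is one way to relate a word to its surrounding context.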

1 Introduction

The internet today holds a large volume of text data in unstructured form. Natural language processing (NLP) [1] is a form of artificial intelligence that enables machines to process text documents written in human language. Filtering core information out of text data that grows at an astonishing speed is very difficult. Machine learning techniques address this problem by giving systems the capability to automatically learn and improve from experience without explicit instructions. Named entity recognition [2] is one such technology: it supports many NLP applications by processing unstructured text and automatically extracting structured data. Entity recognition is a subtask of information extraction [3] that aims to identify known entity names in a document by employing existing knowledge of the domain. For instance, online news publishing houses generate a huge amount of content each day, and managing it properly is key to retrieving the most useful articles. Named Entity Recognition (NER) system automat