Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks

Detecting paraphrases in Indian languages require critical analysis on the lexical, syntactic and semantic features. Since the structure of Indian languages differ from the other languages like English, the usage of lexico-syntactic features vary between

PDF / 618,067 Bytes
13 Pages / 439.37 x 666.142 pts Page_size
48 Downloads / 221 Views

DOWNLOAD

REPORT

Abstract. Detecting paraphrases in Indian languages require critical analysis on the lexical, syntactic and semantic features. Since the structure of Indian languages diﬀer from the other languages like English, the usage of lexico-syntactic features vary between the Indian languages and plays a critical role in determining the performance of the system. Instead of using various lexico-syntactic similarity features, we aim to apply a complete end-to-end system using deep learning networks with no lexicosyntactic features. In this paper we exploited the encoder-decoder model of deep neural network to analyze the paraphrase sentences in Tamil language and to classify. In this encoder-decoder model, LSTM, GRU units and gNMT are used as layers along with attention mechanism. Using this end-to-end model, there is an increase in f1-measure by 0.5% for the subtask-1 when compared to the state-of-the-art systems. The system was trained and evaluated on DPIL@FIRE2016 Shared Task dataset. To our knowledge, ours is the ﬁrst deep learning model which validates the training instances of both the subtask-1 and subtask-2 dataset of DPIL shared task. Keywords: Tamil paraphrase detection · Deep learning · Encoder-decoder model · Sequence-to-sequence (Seq-2-Seq)

1

Introduction

Recent advances in deep neural network architecture has motivated many researchers in natural language processing to apply for diﬀerent tasks such as PoS, language modeling, NER, relation extraction, paraphrase detection, semantic role labeling etc., Paraphrase detection is one of primary and important task for many NLP applications. The two sentences which convey the same meaning in a language are said to be semantically correct or paraphrases. Paraphrases can be detected, extracted and can be generated. Paraphrase detection is an important task in paraphrase generation and extraction system. In paraphrase generation, the paraphrase detection improves the quality by picking up the best semantic equivalent paraphrase from the list of generated sentences. In paraphrase extraction, the detection of paraphrase plays a vital role in validating c IFIP International Federation for Information Processing 2020 Published by Springer Nature Switzerland AG 2020 A. Chandrabose et al. (Eds.): ICCIDS 2020, IFIP AICT 578, pp. 30–42, 2020. https://doi.org/10.1007/978-3-030-63467-4_3

Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks

31

the extracted sentences. Apart from this, paraphrase detection is also utilized in document summarization, plagiarism detection, question-answering system etc. Paraphrases in sentences are detected in earlier systems using diﬀerent traditional machine learning techniques. Recently paraphrase detection was explored in some regional Indian languages. One of the shared task which was ﬁrst of its kind for Indian languages – Detecting Paraphrases in Indian languages – DPIL@FIRE2016 [9,10] highlighted the performance of diﬀerent systems using traditional machine learning techniques. The issues associated with these systems are:

Data Loading...

Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks

Recommend Documents

Paraphrase detection using LSTM networks and handcrafted features

Water Leakage Detection Using Neural Networks

Bearing Fault Detection Using Artificial Neural Networks and Genetic Algorithm

Detection of SQL Injection Attack Using Neural Networks

Sarcasm Detection of Media Text Using Deep Neural Networks

Aphids Detection on Lemons Leaf Image Using Convolutional Neural Networks

A Comparative Study: Glaucoma Detection Using Deep Neural Networks

Real-Time Plume Detection and Segmentation Using Neural Networks

Fingerprint Spoof Detection Using Contrast Enhancement and Convolutional Neural Networks

Drone-Based Cattle Detection Using Deep Neural Networks

Handwriting and Hand-Sketched Graphics Detection Using Convolutional Neural Networks

Object Detection with Convolutional Neural Networks