Parts-of-Speech tagging for Malayalam using deep learning techniques


ORIGINAL RESEARCH

K. K. Akhil¹ • R. Rajimol¹ • V. S. Anoop¹

Received: 4 January 2020 / Accepted: 1 June 2020
© Bharati Vidyapeeth's Institute of Computer Applications and Management 2020

Abstract Parts-of-speech tagging is the process of tagging each word in a sentence with its corresponding part of speech. It is a pre-processing step for many natural language processing tasks. Early approaches were based on simple heuristics; later methods reported in the literature incorporated machine learning techniques such as artificial neural networks. More recently, deep learning-based approaches have made parts-of-speech tagging more accurate, and a reasonable number of taggers are now available for high-resource languages such as English. Low-resource languages such as Malayalam, however, still lack computationally efficient and accurate methods for parts-of-speech tagging. To address this gap, this work proposes a deep learning-based approach to parts-of-speech tagging for the Malayalam language. Experiments conducted on real datasets show that the proposed method outperforms some of the existing methods in terms of precision and accuracy.

Keywords Parts-of-Speech tagging · Natural language processing · Malayalam language · Deep learning

V. S. Anoop [email protected]

¹ Indian Institute of Information Technology and Management-Kerala (IIITM-K), Technopark Campus, Thiruvananthapuram, Kerala 695581, India

1 Introduction

Parts-of-Speech (POS) tagging is the process of labeling each word in a sentence with a tag that indicates how that word is used in the sentence. A basic POS tagger usually classifies words as noun, verb, adjective, and so on, but more advanced taggers can assign additional labels such as number and gender. A vast array of language processing systems use POS tagging as a pre-processing step, which improves the precision and recall of such systems to a great extent. POS-tagged (annotated) language corpora find applications in speech recognition and analysis, information retrieval, and other NLP tasks. Although many approaches to POS tagging have been reported in the literature, they fall mainly into two classes: rule-based and machine learning-based. Rule-based approaches use pre-defined rules handcrafted by humans, but assigning tags to words through such a manual process is tedious and time-consuming. Machine learning-based approaches, on the other hand, use various stochastic and probabilistic techniques for labeling the POS. All these approaches achieve reasonable accuracy for resource-rich languages such as English but do not perform satisfactorily for low-resource languages, including most of the Indic languages. Malayalam belongs to the Dra
