Sanskrit to universal networking language EnConverter system based on deep learning and context-free grammar
SPECIAL ISSUE PAPER
Sitender¹,² · Seema Bawa¹
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
Machine translation is the transformation of text from one language to another with the help of computer technology. In 2018, the authors developed a machine translation system, named SANSUNL, that translates Sanskrit text to Universal Networking Language (UNL) expressions. The work presented in this paper extends the SANSUNL system by enhancing POS tagging, Sanskrit language processing and parsing. A Sanskrit stemmer with 23 prefixes, 774 suffixes and grammar rules is used to stem the Sanskrit sentence in the proposed system. Bidirectional long short-term memory (Bi-LSTM) and stacked LSTM deep neural network models are used for part-of-speech tagging of the input Sanskrit text. A tagged Sanskrit dataset of around 400k entries has been used for training and testing the neural network models. The proposed Sanskrit context-free grammar is used with a CYK parser to parse the input sentence. The Sanskrit-Universal Word dictionary has been enlarged from 15,000 to 25,000 entries. Approximately 1500 UNL generation rules are used to resolve the 46 UNL relations. Four datasets have been used for validating the system: UC-A1, UC-A2, the Spanish server gold standard dataset, and 500 Sanskrit sentences taken from the general domain. The proposed system is evaluated on the BLEU and fluency score metrics and has reported an efficiency of 95.375%.

Keywords Artificial intelligence · CFG · Deep neural network · LSTM · Machine translation · Sanskrit · UNL

Abbreviations
AI      Artificial intelligence
BiLSTM  Bi-directional long short-term memory
SLSTM   Stacked long short-term memory
BLEU    Bilingual evaluation understudy
CFG     Context-free grammar
CNF     Chomsky normal form
CYK     Cocke–Younger–Kasami
MT      Machine translation
MTS     Machine translation system
POS     Part of speech
TLGR    Target language generation rule
UNL     Universal Networking Language
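The abstract pairs a context-free grammar with a CYK parser, and the abbreviation list mentions Chomsky normal form (CNF), which CYK requires. As a rough illustration of that parsing step only, the following Python sketch runs a CYK recognizer over a sequence of POS tags. The grammar, the tag names (`noun`, `verb`) and the function name `cyk_recognize` are hypothetical and chosen for this example; they are not the Sanskrit grammar or tag set proposed in the paper.

```python
# Minimal CYK recognizer sketch. The toy CNF grammar below is an assumption
# made for illustration; it is NOT the paper's Sanskrit context-free grammar.

# Lexical rules A -> a, indexed by terminal (here: hypothetical POS tags).
LEXICAL = {
    "noun": {"NP"},
    "verb": {"V"},
}

# Binary rules A -> B C, indexed by the right-hand side (B, C).
BINARY = {
    ("NP", "V"): {"VP", "S"},   # VP -> NP V  and  S -> NP V
    ("NP", "VP"): {"S"},        # S  -> NP VP
}


def cyk_recognize(tokens, lexical=LEXICAL, binary=BINARY, start="S"):
    """Return True if the CNF grammar derives `tokens` from `start`."""
    n = len(tokens)
    if n == 0:
        return False
    # table[i][k] holds the nonterminals deriving tokens[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(lexical.get(tok, ()))
    for span in range(2, n + 1):              # length of the substring
        for i in range(n - span + 1):         # start of the substring
            for split in range(1, span):      # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for b in left:
                    for c in right:
                        table[i][span - 1] |= binary.get((b, c), set())
    return start in table[0][n - 1]


if __name__ == "__main__":
    print(cyk_recognize(["noun", "noun", "verb"]))
    print(cyk_recognize(["verb", "noun"]))
```

The table is filled bottom-up by substring length, so the running time is cubic in the sentence length. A full parser would additionally store back-pointers in each cell to recover the parse tree.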
* Sitender
Seema Bawa
1 Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India
2 MSIT, New Delhi 110058, India
1 Introduction
Machine translation (MT) is the conversion of text from one natural language to another with the help of computer systems. To date, no MT system that is both 100% accurate and domain-independent has been developed. MT is a sub-domain of natural language processing, which in turn is a sub-domain of artificial intelligence. This article presents an MT system for translating Sanskrit text to Universal Networking Language expressions. The SANSUNL system was the first MT system to translate Sanskrit text to Universal Networking Language expressions [1]. Sanskrit is one of the oldest languages in the world and is more suitable for computer programming due to its systematic