Predictive approaches for the UNIX command line: curating and exploiting domain knowledge in semantics deficit data
Thoudam Doren Singh¹ · Apoorva Vikram Singh² · Abdullah Faiz Ur Rahman Khilji¹ · Divyansha¹ · Surmila Thokchom³ · Sivaji Bandyopadhyay¹
Received: 29 April 2020 / Revised: 1 September 2020 / Accepted: 19 October 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract The command line has long been the most efficient way to interact with UNIX-flavored systems, offering the flexibility and efficiency that professionals prefer. Such a system relies on manually entered commands to instruct the machine to carry out the desired tasks. This human-computer interface is tedious, especially for beginners, and the command line has therefore not been well received by new users. To improve user-friendliness and take a step towards a more intuitive command line, we propose two predictive approaches that integrate into the command line interface and can benefit all kinds of users, especially novices. Both rely on deep learning based prediction. The first is a sequence-to-sequence (Seq2seq) model with joint learning that leverages continuous representations of a self-curated, exhaustive knowledge base (KB) of command descriptions to enhance the embeddings used in the model. The second is based on the attention-based transformer architecture and employs a pretrained model. This allows the model to evolve dynamically over time, adapting to different circumstances by learning as the system is used. To support our idea, we experimented with our models on three major publicly available UNIX command line datasets and achieved benchmark results using GloVe and Word2Vec embeddings. We find that the transformer-based framework performs better on two of the three datasets in a semantics-deficit scenario such as UNIX command line prediction. However, the Seq2seq-based model outperforms the bidirectional encoder representations from transformers (BERT) based model on a larger dataset.
Keywords UNIX Command Line Prediction · Knowledge Base · LSTM · GloVe · Joint Learning · BERT
Thoudam Doren Singh
[email protected]
Extended author information available on the last page of the article.
Multimedia Tools and Applications
1 Introduction
This work aims to address the long-standing problem of users' unfamiliarity with the command line interface in UNIX-based systems. Doing so will not only improve the user's efficiency but also ease the learning curve. We treat UNIX command line prediction as a sequence prediction problem rather than taking the traditionally adopted recommendation-system approach. Due to the absence of semantics between commands, we employ an enhanced model learned jointly from a textual corpus and the KB to create