Basic Arithmetic Coding Based Approach to Compress a Character String



Abstract Data compression plays an important role in storing and transmitting text and multimedia information. This paper presents a lossless data compression algorithm, developed in C, that compresses a character string using basic arithmetic coding. At the preliminary stage, the algorithm was tested on a character array consisting of vowels only, with an arbitrarily assumed probability distribution. The results are encouraging, with a compression ratio well above unity. Although the algorithm was tested on vowels only, the work can be extended to any character array by using a probability distribution obtained from a survey of a few randomly selected articles.



Keywords Data compression technique · Basic arithmetic coding · Probability distribution · Encoding–Decoding · Compression ratio

1 Introduction

In the present age of digitization, data compression has become extremely important for reducing the bit size of data. Fewer bits mean a smaller memory requirement, which eases the memory constraints of the system. In data communication, fewer bits mean a lower energy requirement, which improves energy efficiency. Besides reducing data size, compression also has an inherent encryption effect, since the encoded data cannot be read without the matching decoder, which adds a measure of data security. A typical data compression algorithm can be represented by the block diagram given in Fig. 1 [1–3].

I. Mondal (✉)
Department of CSE, Techno India Batanagar, Kolkata, India
e-mail: [email protected]

S.J. Sarkar
Department of EE, Techno India Batanagar, Kolkata, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2017
S.C. Satapathy et al. (eds.), Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, Advances in Intelligent Systems and Computing 515, DOI 10.1007/978-981-10-3153-3_4


Fig. 1 Block diagram of the proposed system (input character string → encoding algorithm → primary or secondary memory for storing the encrypted string → decoding algorithm → decoded character string)
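The encode, store, decode pipeline of Fig. 1 can be illustrated with a minimal C sketch of basic arithmetic coding over the five vowels. This is not the authors' implementation: the function names `vowel_encode` and `vowel_decode` are hypothetical, and the symbol probabilities (a 0.2, e 0.3, i 0.1, o 0.2, u 0.2) are assumed purely for illustration. The sketch also uses a single `double` as the stored code, which only works for short strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Alphabet and cumulative probabilities. The values are ASSUMED for
   illustration: P(a)=0.2, P(e)=0.3, P(i)=0.1, P(o)=0.2, P(u)=0.2. */
static const char SYMBOLS[] = "aeiou";
static const double CUM[] = {0.0, 0.2, 0.5, 0.6, 0.8, 1.0};
#define NSYM ((size_t)5)

/* Encode a string of lowercase vowels into one number (the "tag") that
   lies inside the final interval. Returns a negative value if the input
   contains a character outside the alphabet. */
double vowel_encode(const char *text)
{
    double low = 0.0, high = 1.0;
    for (size_t k = 0; text[k] != '\0'; k++) {
        const char *p = strchr(SYMBOLS, text[k]);
        if (p == NULL)
            return -1.0;
        size_t s = (size_t)(p - SYMBOLS);
        double range = high - low;
        /* Narrow [low, high) to the slice that belongs to symbol s. */
        high = low + range * CUM[s + 1];
        low = low + range * CUM[s];
    }
    return (low + high) / 2.0; /* midpoint of the final interval */
}

/* Decode n symbols from the tag by repeating the encoder's interval
   updates. The buffer out must hold at least n + 1 characters. */
void vowel_decode(double tag, size_t n, char *out)
{
    assert(out != NULL);
    double low = 0.0, high = 1.0;
    for (size_t k = 0; k < n; k++) {
        double range = high - low;
        size_t s = 0;
        /* Advance until the tag falls below the upper bound of slice s. */
        while (s < NSYM - 1 && tag >= low + range * CUM[s + 1])
            s++;
        out[k] = SYMBOLS[s];
        high = low + range * CUM[s + 1];
        low = low + range * CUM[s];
    }
    out[n] = '\0';
}
```

Calling `vowel_encode("eaiou")` narrows the interval step by step to [0.23456, 0.2348) and returns its midpoint; passing that value and the length 5 to `vowel_decode` recovers the original string. Because the interval shrinks with every symbol, a `double` runs out of precision after a few dozen symbols, so practical coders use integer arithmetic with renormalization instead.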

There are numerous methods of data compression. Broadly, compression can be classified as lossy or lossless. Lossy compression discards some of the less important data values in the file. Examples include transform coding, Karhunen–Loeve Transform (KLT) coding, and wavelet-based coding. These algorithms are typically applied to multimedia files such as audio, video, and images [1]. In contrast, lossless techniques such as the Shannon–Fano algorithm, the Huffman algorithm, and arithmetic coding lose no information [1, 4]. Lossless compression is preferred for text documents and for images of high importance, such as images of cancerous tissue [4]. The application of the work done in [1] was confined to the compression of data string for power syst