On the Complexity of the Smallest Grammar Problem over Fixed Alphabets

  • PDF / 1,264,295 Bytes
  • 66 Pages / 439.642 x 666.49 pts Page_size
  • 17 Downloads / 182 Views

DOWNLOAD

REPORT


On the Complexity of the Smallest Grammar Problem over Fixed Alphabets Katrin Casel1 · Henning Fernau2 · Serge Gaspers3 · Benjamin Gras4 · Markus L. Schmid5 Accepted: 9 October 2020 © The Author(s) 2020

Abstract In the smallest grammar problem, we are given a word w and we want to compute a preferably small context-free grammar G for the singleton language {w} (where the size of a grammar is the sum of the sizes of its rules, and the size of a rule is measured by the length of its right side). It is known that, for unbounded alphabets, the decision variant of this problem is NP-hard and the optimisation variant does not allow a polynomial-time approximation scheme, unless P = NP. We settle the long-standing

This work represents an extended version of the paper “On the Complexity of Grammar-Based Compression over Fixed Alphabets” presented at the 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016) and published in LIPIcs - Leibniz International Proceedings in Informatics (https://doi.org/10.4230/LIPIcs.ICALP.2016.122)  Markus L. Schmid

[email protected] Katrin Casel [email protected] Henning Fernau [email protected] Serge Gaspers [email protected] Benjamin Gras [email protected] 1

Hasso Plattner Institute, University of Potsdam, Potsdam, Germany

2

Fachbereich IV – Abteilung Informatikwissenschaften, Trier University, Trier, 54296, Germany

3

UNSW Australia, Data61 (formerly: NICTA), CSIRO, Sydney, Australia

4

´ Ecole Normale Superieure de Lyon, D´epartement Informatique, Lyon, France

5

Humboldt-Universit¨at zu Berlin, Unter den Linden 6, 10099, Berlin, Germany

Theory of Computing Systems

open problem whether these hardness results also hold for the more realistic case of a constant-size alphabet. More precisely, it is shown that the smallest grammar problem remains NP-complete (and its optimisation version is APX-hard), even if the alphabet is fixed and has size of at least 17. The corresponding reduction is robust in the sense that it also works for an alternative size-measure of grammars that is commonly used in the literature (i. e., a size measure also taking the number of rules into account), and it also allows to conclude that even computing the number of rules required by a smallest grammar is a hard problem. On the other hand, if the number of nonterminals (or, equivalently, the number of rules) is bounded by a constant, then the smallest grammar problem can be solved in polynomial time, which is shown by encoding it as a problem on graphs with interval structure. However, treating the number of rules as a parameter (in terms of parameterised complexity) yields W[1]-hardness. Furthermore, we present an O(3|w| ) exact exponential-time algorithm, based on dynamic programming. These three main questions are also investigated for 1-level grammars, i. e., grammars for which only the start rule contains nonterminals on the right side; thus, investigating the impact of the “hierarchical depth” of grammars on the complexity of the smallest g