Joe Loves Lea: Transformational Analysis of Direct Transitive Sentences



ELLIADD, Université de Franche-Comté, Besançon, France

Abstract. NooJ is capable of both parsing and producing any sentence that matches a given syntactic grammar. We use this functionality to describe direct transitive sentences, and we show that this simple sentence structure accounts for millions of potential sentences.

Keywords: NooJ · Syntactic analysis · Transformational analysis · Transformational grammar

1 Introduction

NooJ allows linguists to formalize various types of linguistic description: orthography and spelling, lexicons for simple words, multiword units and frozen expressions, inflectional and derivational morphology, and local, structural and transformational syntax. One important characteristic of NooJ is that all the linguistic descriptions are reversible, i.e. they can be used both by a parser (to recognize sentences) and by a generator (to produce sentences).

(Silberztein 2011) and (Silberztein 2016) show how, by combining a parser and a generator and applying them to a syntactic grammar, we can build a system that takes one sentence as its input and produces all the sentences that share the same lexical material with the original sentence. Here are two simple transformations¹:

– [Pron-0] Joe loves Lea = He loves Lea
– [Passive] Joe loves Lea = Lea is loved by Joe

The second one can be implemented in NooJ via the grammar shown in Fig. 1. This graph uses three variables: $N0, $V and $N1. When parsing the sentence Joe loves Lea, the variable $N0 stores the word Joe, $V stores the word loves and $N1 stores Lea. The grammar’s output “$N1 is $V_V+PP by $N0” produces the string Lea is loved by Joe. Note that morphological operations such as “$V_V+PP” operate on NooJ’s Atomic Linguistic Units (ALUs) rather than on plain strings; in other words, NooJ knows that the word form loves is an instance of the verb to love and it can produce all the conjugated

¹ I am using the term transformation as in (Harris 1968): an operator that links sentences that share common semantic material, as opposed to (Chomsky 1957), whose transformations link deep and surface structures.

© Springer International Publishing Switzerland 2016 T. Okrut et al. (Eds.): NooJ 2015, CCIS 607, pp. 55–65, 2016. DOI: 10.1007/978-3-319-42471-2_5

M. Silberztein

Fig. 1. The [Passive] transformation
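
The behaviour of the [Passive] grammar in Fig. 1 can be imitated outside NooJ with a small Python sketch. This is only an illustration under strong assumptions: the lexicon ENTRIES, its feature codes and the helper names lemmatize, inflect and passive are hypothetical and are not NooJ's dictionary format or API, and the "parser" simply splits a three-word sentence into $N0, $V and $N1.

```python
# Toy lexicon: (word form, lemma, features). Hypothetical data for illustration,
# not NooJ's dictionary format; feature codes other than "PP" are made up.
ENTRIES = [
    ("love", "love", {"V", "INF"}),
    ("loves", "love", {"V", "PR", "3s"}),
    ("loved", "love", {"V", "PT"}),
    ("loved", "love", {"V", "PP"}),
    ("loving", "love", {"V", "G"}),
    ("sees", "see", {"V", "PR", "3s"}),
    ("saw", "see", {"V", "PT"}),
    ("seen", "see", {"V", "PP"}),
]


def lemmatize(form):
    """Return the lemma of a verb form (loves -> love)."""
    for entry_form, lemma, features in ENTRIES:
        if entry_form == form and "V" in features:
            return lemma
    raise KeyError(f"unknown verb form: {form}")


def inflect(form, feature):
    """Imitate $V_V+PP: lemmatize the form, list all forms of that lemma,
    and keep only those carrying the requested feature."""
    lemma = lemmatize(form)
    return [
        entry_form
        for entry_form, entry_lemma, features in ENTRIES
        if entry_lemma == lemma and feature in features
    ]


def passive(sentence):
    """Imitate the output "$N1 is $V_V+PP by $N0" for a sentence 'N0 V N1'."""
    n0, v, n1 = sentence.split()      # crude stand-in for the syntactic parse
    (past_participle,) = inflect(v, "PP")  # expect exactly one +PP form
    return f"{n1} is {past_participle} by {n0}"


if __name__ == "__main__":
    print(passive("Joe loves Lea"))
```

The point of the sketch is that the rewriting works on lexical entries rather than on strings: the participle is found by going from the word form to its lemma and back to a form with the property +PP, so an irregular verb such as see needs no special rule.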

and derived word forms from this ALU (e.g. “loving”, “lovers”). Here, $V_V+PP takes the value of variable $V (loves), lemmatizes it (love), produces all its verb forms and selects the ones that have the property +PP (i.e. Past Participle) to get the result loved.

One application of this rewriting system is Machine Translation, where one grammar recognizes sentences in one input language and produces the corresponding “rewritten” sentences in another language; see for instance (Barreiro 2008) for Portuguese-English translation, (Fehri et al. 2010) for Arabic-French translation and (Ben et al. 2015) for Arabic-English translation. As (Silberztein 2016) has shown, any serious attempt at describing a significan