A Platform for the Conceptualization of Arabic Texts Dedicated to the Design of the UML Class Diagram


1 LRIIR, Department of Computer Science, University of Oran 1, Ahmed Ben Bella, Oran, Algeria
2 National Institute of Telecommunication and Information and Communication Technology of Oran (INTTIC), Oran, Algeria
3 MIRACL, Faculty of Economics and Management of Sfax (FSEGS), Department of Computer Science, University of Sfax, Sfax, Tunisia

Abstract. In many fields that use information systems (IS), knowledge is often represented by UML models, in particular by class diagrams. This formalism has the advantage of being understood by a large community, which makes it a well-suited means of exchange. An automated tool that formalizes the expert's intellectual process, starting from IS specification texts, is therefore of interest. This paper presents a new strategy for moving from an informal representation to a semi-formal one, namely the UML class diagram. The problem is not new and has attracted great interest for a long time. The originality of our work is that the specification texts are in Arabic.

K.Z. Bousmaha et al. In: E. Métais et al. (Eds.): NLDB 2016, LNCS 9612, pp. 447–452, 2016. DOI: 10.1007/978-3-319-41754-7_47. © Springer International Publishing Switzerland 2016

1 Introduction

The specification of an IS is a joint effort between the client, who is the only one who really knows the problem, and the designers, who need help to express it clearly. The requirements have to be simple and specified in a language that all parties to the project can understand. Developing semi-formal models such as UML and its derivatives from a requirements specification can be long and tedious. Automating this development raises many challenges in several scientific fields, such as requirements engineering, knowledge representation, natural language processing, information extraction and knowledge engineering. Several works have addressed these questions, both in the past [1, 2] and more recently [3–5]. Many studies have also focused on automating or semi-automating the extraction of a UML model from natural language text [4–7]. The proposed approaches mostly rely on NLP (Natural Language Processing) techniques.

The lack of formal semantics that hampers the UML language can lead to serious modeling problems that generate contradictions in the developed models [3]. This has motivated further work on the transition from UML to formal languages such as B [8], VDM [1], VDM++ [9], Z [10] and Maude [11].

In this paper, we propose a platform for the conceptualization of the specification texts of an information system, dedicated to the design of the UML class diagram. It is based on a hybrid approach that combines linguistic and statistical methods. The original contribution of our work is that the specification texts are in Arabic. The Arabic language is especially challenging because of its complex linguistic structures. It has rich and complex morphological, grammatical, and se