Named Entity Recognition from Arabic-French Herbalism Parallel Corpora


1 MIRACL Laboratory, Higher Management Institute of Gabes, University of Gabes, Gabes, Tunisia
2 MIRACL Laboratory, Faculty of Science of Sfax, University of Sfax, Sfax, Tunisia

Abstract. With the adverse health effects of chemical drugs and antibiotics, herbal medicine has seen a resurgence of interest in recent years. The use of medicinal plants is thus widely regarded as an effective and lucrative treatment, especially in Asia and Africa. The objective of this work is to build a system that identifies medicinal plant names in French-Arabic parallel corpora. The corpora consist of texts collected from the multilingual encyclopedia Wikipedia. Named Entities are identified by several types of patterns, which are represented by a set of transducers. The prototype is implemented in the NooJ linguistic platform using a set of morphological and syntactic grammars, and is evaluated on a French-Arabic parallel corpus collected from Wikipedia. The evaluation measures obtained are promising.

Keywords: Named entity recognition · Herbalism · Morpho-syntactic analysis · NooJ platform
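As a rough illustration of what a recognition pattern represented as a transducer does, the following Python sketch reads a token sequence and writes it back with plant names wrapped in an annotation. It is not NooJ code and does not reproduce the authors' grammars: the mini-lexicon, the `<PLANT>` tag and the function name `annotate` are assumptions made for this example only.

```python
from typing import List

# Hypothetical mini-lexicon of French plant names (illustrative only,
# not taken from the paper). Each entry is a tuple of lowercase tokens.
PLANT_LEXICON = {
    ("menthe", "poivrée"),
    ("camomille", "romaine"),
    ("camomille",),
    ("thym",),
}
MAX_LEN = max(len(entry) for entry in PLANT_LEXICON)


def annotate(tokens: List[str]) -> List[str]:
    """Copy tokens to the output, wrapping the longest lexicon match
    starting at each position in <PLANT>...</PLANT> (assumed tag name)."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate first so multi-word names win.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in PLANT_LEXICON:
                match_len = n
                break
        if match_len:
            out.append("<PLANT>" + " ".join(tokens[i:i + match_len]) + "</PLANT>")
            i += match_len
        else:
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    print(annotate("La camomille romaine et le thym".split()))
```

Trying the longest candidate first means that a multi-word name such as "camomille romaine" is annotated as one entity rather than as the shorter entry "camomille".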

1 Introduction

Named Entities (NEs) have been a very active field of research for many years. The concept of NEs covers not only proper names but also more complex entities such as multi-word expressions. NEs are usually typed by taxonomies that are more or less extensive and strongly dependent on the domain or the needs considered. They typically cover names designating persons, places or organizations, but can also refer to more technical concepts such as diseases. The ability to identify NEs in a text has been established as an important task for several natural language processing areas, including information retrieval, machine translation, information extraction and language understanding [4].

At one time, herbalism was an honorable profession that laid the foundations of modern medicine, botany, pharmacy, perfumery, and chemistry [10]. Medical herbalism, also called simply herbalism, herbology or phytotherapy, is defined by [2] as “the study of herbs and their medicinal uses”. In recent years, interest in herbal medicine has skyrocketed, leading to greater scientific interest in the medicinal use of plants.

© Springer International Publishing Switzerland 2016. T. Okrut et al. (Eds.): NooJ 2015, CCIS 607, pp. 191–201, 2016. DOI: 10.1007/978-3-319-42471-2_17


M.A.F. Seideh et al.

Many international studies have shown that plants are capable of treating disease and improving health, often without any significant side effects. In 2009, the World Health Organization (WHO) estimated that 80% of the world's population uses herbal medicines as part of their primary health care [14]. Herbalism terminologies are a necessary resource for phytotherapists and free users of medicinal plants. This renewed interest in natural treatment makes herbalism NE recognition an interesting field of study. In herbalism, pl