Question Answering
Question Answering (QA) is a task that aims at finding a precise answer to a specific user question. This task is significantly challenging because both the question and the answer are formulated in natural language. For this reason, in order to build an
Yassine Benajiba, Paolo Rosso, Lahsen Abouenour, Omar Trigui, Karim Bouzoubaa, and Lamia Belguith

Y. Benajiba: Thomson Reuters, 3 Times Square, New York, NY, USA
P. Rosso: Pattern Recognition and Human Language Technology (PRHLT) Research Center, Universitat Politècnica de València, Valencia, Spain
L. Abouenour • K. Bouzoubaa: Mohamed V-Agdal University, Rabat, Morocco
O. Trigui • L. Belguith: ANLP Research Group-MIRACL Laboratory, University of Sfax, Sfax, Tunisia

I. Zitouni (ed.), Natural Language Processing of Semitic Languages, Theory and Applications of Natural Language Processing, DOI 10.1007/978-3-642-45358-8_11, © Springer-Verlag Berlin Heidelberg 2014

11.1 Introduction
The question answering (QA) task was created to satisfy the information need of users who are looking for the answer to a specific question. The final goal is to automatically parse the available data and to extract and validate potential answers, regardless of whether the question is simple, such as “To what family of languages does Hebrew belong?”, or requires a deeper analysis of the data, such as “What political events succeeded the Tunisian revolution?” One of the most important features of QA systems is their ability to extract the necessary information from natural language documents. This feature lowers the cost significantly, because creating and updating databases that encompass all knowledge in a structured fashion has proved to be practically impossible. However, the use of natural language processing (NLP) techniques is not free of cost, because they:
• Come with an intrinsic error penalty, as they almost always rely on statistical models and rule-based modules that never perform at 100 %;
• Require a significant amount of training data to build the underlying statistical models; and
• Tend to resort to language-dependent resources and modules, which need to be built from scratch for each different language.

In this chapter, we are concerned with the performance of such a system when dealing with a Semitic language. Such a language, as introduced in Chap. 1, exhibits a set of morphological and syntactic properties that need to be taken into consideration and which we will discuss in this chapter. The interest of the NLP research community in QA for Semitic languages is very recent. In CLEF 2012, the QA4MRE task was the first competition to include a Semitic language, namely Arabic. Consequently, the literature on QA for Semitic languages per se is limited, and we will borrow some insights from other NLP tasks that are known to help significantly in understanding the intricacies of QA for Semitic languages. Thereafter, we will look into QA research conducted on Arabic to further understand what is required to build a success