Predicting Quality-Assured Consensual Answers in Community-Based Question Answering Systems


Abstract Although several community-based question answering (CQA) systems have succeeded in encouraging vast numbers of users to ask and answer questions, leading to an unanticipated explosion of community knowledge, the user-generated content often suffers from poor quality and untrustworthiness, and the CQA community must deal with conflicts that occur during the question answering process. To tackle these problems, this paper presents a novel approach that predicts the best answer as a quality-assured consensual answer by simultaneously considering content quality and group preference. A set of important features of the answer and its interrelated components, as well as of social interaction, is identified and used to model the prediction function by applying binary logistic regression. In contrast to voting-based CQA systems, the proposed model evaluates group preference based on content quality and community agreement. Results from training the proposed model on the defined features show that the proposed approach is efficient and outperforms the voting method.



Keywords Community-based Question Answering systems · Community deliberation · User-generated content · Binary logistic regression
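To make the modelling idea in the abstract concrete, the following is a minimal sketch of binary logistic regression used to rank candidate answers. It is not the paper's implementation: the two features (`content_quality`, `agreement_ratio`), the toy data, the learning rate, and all function names are assumptions made for illustration, since the actual feature set and training procedure are not given in this part of the text.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def best_answer(w, candidates):
    """Index of the candidate with the highest predicted probability."""
    return max(range(len(candidates)),
               key=lambda i: predict_proba(w, candidates[i]))

# Invented toy data: [content_quality, agreement_ratio], both in [0, 1].
# Label 1 marks an answer accepted as best, 0 otherwise (hypothetical).
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7],
     [0.2, 0.3], [0.3, 0.1], [0.1, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w = train_logistic(X, y)

# Rank three hypothetical candidate answers to one question.
candidates = [[0.2, 0.2], [0.85, 0.9], [0.5, 0.4]]
print("predicted best answer index:", best_answer(w, candidates))
```

In this sketch the predicted probability plays the role of a score for each candidate, and the highest-scoring candidate is reported as the best answer. A voting-based system would instead rank candidates by raw vote counts.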





K. Maleewong (✉)
School of Information Technology, Shinawatra University, Sam Khok, Pathum Thani, Thailand
e-mail: [email protected]

© Springer International Publishing Switzerland 2016
P. Meesad et al. (eds.), Recent Advances in Information and Communication Technology 2016, Advances in Intelligent Systems and Computing 463, DOI 10.1007/978-3-319-40415-8_12

1 Introduction

Community-based Question Answering (CQA) systems have attracted a large number of people who participate in knowledge creation and sharing. Many CQA websites, such as Yahoo! Answers (http://answers.yahoo.com) and Stack Overflow (http://stackoverflow.com), allow their users to ask questions and propose answers to them. In addition, several CQA websites, such as the Apple discussion forum,3 the WordReference forum,4 and the Java forum,5 provide a mechanism to support community deliberation by motivating members to submit ideas or arguments that support or oppose a particular answer based on their own opinions. This deliberation results in a massive volume of user-generated content (UGC) [1]. However, the quality and reliability of user-generated content in CQA systems vary drastically because users differ in skills and expertise [2]. As a result, CQA websites contain poor-quality and untrustworthy information. Moreover, many CQA websites have no explicit mechanism for discovering the best answer among the candidate answers, while some websites (e.g., Yahoo! Answers and Stack Overflow) apply a voting system without community deliberation to identify the best answer. Unfortunately, not all votes are reliable because of the dramatic increase in vote spam [3, 4], and the voting system obstructs a user who wants to further develop a proposed answer or search f