Ontology Based Web Mining for Information Gathering




Abstract. There exists a gap between Web mining and the effectiveness of using Web data. The main reason is that we cannot simply utilize and maintain the discovered knowledge using traditional knowledge-based techniques, because of the huge number of discovered patterns, the large amount of noise in those patterns, and the uncertainty attached to even some useful patterns. In this chapter we discuss ontology-based problem solving approaches for building a bridge between Web mining and the effectiveness of using Web data, which aim to automatically construct and maintain ontologies for the representation, application and updating of discovered knowledge. We mainly discuss two models: the pattern taxonomy model and the ontology mining model. The former uses up-to-date techniques of association mining, and the latter uses granule mining, which directly discovers granules rather than patterns.

1 Introduction

We have witnessed an explosive growth of the information available on the Web over the last decade. However, there are two fundamental issues regarding the effectiveness of Web information gathering: mismatch and overload. Mismatch means that some useful and interesting data has been missed, whereas overload means that some of the gathered data is not what users want. Although information retrieval (IR) based techniques have touched on these fundamental issues [3,15], IR-based systems neither explicitly describe how the systems act like users nor discover interesting and useful knowledge from very large datasets to answer what users really want. This issue has challenged the artificial intelligence (AI) community to address “what has information gathering to do with AI” [20,25]. For a while, many intelligent information agents were presented in response to this challenge. Unfortunately, information agents can only show us the architectures of Web information gathering [22,23,24,27]; they have not made more significant contributions to finding interesting and useful knowledge from Web data. Web Intelligence (WI) is an alternative that can provide a new way of thinking about this problem [67,68,66,44,72,74]. Currently, there are three main directions for the effectiveness of using Web data in WI: Web mining, adaptive Web systems and information foraging agents [71].

N. Zhong et al. (Eds.): WImBI 2006, LNAI 4845, pp. 406–427, 2007. © Springer-Verlag Berlin Heidelberg 2007


The application of data mining techniques to Web data, called Web mining, is used to discover knowledge (patterns) from Web data. Currently, a Web mining system can be viewed as the use of data mining techniques to automatically retrieve, extract, generalize, and analyze information on the Web [7,53]. Web mining can be classified into four categories: Web usage mining, Web structure mining, Web user profile mining, and Web content mining [14,47,28,59]. An adaptive Web system [56] is able to identify the interrelationships among distributed electronic information on the Web based on the discoveries of Web mining [5]. The