Learning Web Request Patterns


Summary. Most requests on the Web are made on behalf of human users, and like other human-computer interactions, the actions of the user can be characterized by identifiable regularities. Many of these patterns of activity, both within a user and between users, can be identified and exploited by intelligent mechanisms for learning Web request patterns. Our focus is on Markov-based probabilistic techniques, both for their predictive power and for their popularity in Web modeling and other domains. Although history-based mechanisms can provide strong performance in predicting future requests, performance can be improved by including predictions from additional sources. In this chapter we review the common approaches to learning and predicting Web request patterns. We provide a consistent description of various algorithms (often independently proposed) and compare the performance of those techniques on the same data sets. We also discuss concerns for accurate and realistic evaluation of these techniques.
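To make the idea of a Markov-based, history-based predictor concrete, the following is a minimal sketch of a first-order model that predicts the next request from the current one. It is an illustration under stated assumptions, not one of the specific algorithms reviewed in this chapter: the function names `train_first_order` and `predict_next`, the session format (a list of URL strings per user session), and the sample paths are all hypothetical.

```python
from collections import Counter, defaultdict


def train_first_order(sessions):
    """Count how often each page is immediately followed by each other page.

    `sessions` is assumed to be an iterable of request sequences,
    each a list of page identifiers in the order they were requested.
    """
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, current_page, top_n=1):
    """Return up to `top_n` (page, estimated probability) pairs, most likely first."""
    followers = transitions.get(current_page)
    if not followers:
        return []  # page never seen as a predecessor: no prediction
    total = sum(followers.values())
    return [(page, count / total) for page, count in followers.most_common(top_n)]


# Hypothetical sessions, invented purely for illustration.
sessions = [
    ["/index", "/news", "/sports"],
    ["/index", "/news", "/weather"],
    ["/index", "/about"],
    ["/index", "/news", "/sports"],
]
model = train_first_order(sessions)
print(predict_next(model, "/index"))
print(predict_next(model, "/news", top_n=2))
```

In this toy data, "/news" follows "/index" in three of four sessions, so it is the top prediction after "/index". Returning several candidates through `top_n` reflects the case where a system can act on more than one prediction, for example by preloading more than one page.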

1 Introduction

Modeling user activities on the Web has value both for content providers and consumers. Consumers may appreciate better responsiveness as a result of precalculating and preloading content into a local cache in advance of their requests. A user requesting content that can be served by the cache is able to avoid the delays inherent in the Web, such as congested networks and slow servers. Additionally, consumers may find adaptive and personalized Web sites that can make suggestions and improve navigation to be useful. Likewise, the content provider will appreciate the insights that modeling can provide and the financial benefits of a happier consumer who gets the desired information even faster. Most requests on the Web are made on behalf of human users, and like other human-computer interactions, the actions of the user can be characterized as having identifiable regularities. Many of these patterns of activity, both within a user and between users, can be identified and exploited by intelligent mechanisms for learning Web request patterns. Prediction here is different from what data mining approaches do with Web logs. We wish to build a (relatively) concise model of the user so as to be able to dynamically

M. Levene et al., Web Dynamics © Springer-Verlag Berlin Heidelberg 2004


B.D. Davison

predict the next action(s) that the user will take. Data mining of Web logs, in contrast, is typically concerned with characterizing the user, finding common attributes of classes of users, and predicting future actions (such as purchases) without the concern for interactivity or immediate benefit (e.g., see the KDD Cup 2000 competition [8]). Therefore we might consider the application of machine learning techniques [44] to the problem of Web request sequence prediction. In particular, we wish to be able to predict the next Web page that a user will select. This chapter will demonstrate the use of machine learning models on real-world traces with predictive accuracies of 12-50% or better, depending on the trac