User-System Interaction for Redundancy-Free Knowledge Discovery in Data

A classical limitation of association rules, from the decision maker's point of view, lies in their combinatorial nature, which results in numerous rules. As the overall quality of an association rule set can be considered as insight of the studied domain


The amount of collected data grows continuously. Decision tasks must take this growth into account when dealing with prediction, action evaluation or validation, across a large variety of application fields such as management, profit optimization or analysis. The field of Knowledge Discovery in Databases (KDD) covers this range of applications, with the goal of providing automated tools and adapted data representations that help an expert user find the evidence needed for decision tasks. This assumes a human-centered KDD process. As a human-centered process involving automated procedures, it needs targeted problem representations that are both realistic from the user's point of view and computable from the machine's point of view.

R. Lehn et al.: User-System Interaction for Redundancy-Free Knowledge Discovery in Data, Studies in Computational Intelligence (SCI) 127, 463–479 (2008) www.springerlink.com © Springer-Verlag Berlin Heidelberg 2008


Among KDD techniques, association rules [2] capture and represent implicative patterns that tolerate a small set of counterexamples (e.g. birds that cannot fly, or sports cars that are not red). Association rules can be enhanced with statistical evaluations and filters such as the Intensity of Implication family of indices. Association rule discovery is motivated by the exploitation of operational databases to discover new knowledge that was unknown before the discovery and that is potentially exploitable in a decision-making process [19]. Many efficient algorithms have been published to optimize the search for association rules [8, 16], but they mainly focus on algorithmic optimization rather than on knowledge usability. One of the fundamental hypotheses of association rule discovery is that the user does not specify the goal of the search. Because of the intrinsically combinatorial nature of the search and the lack of goals, the classical use of these algorithms (chaining data selection, data formatting, frequent set induction, rule calculation and rule presentation to the user) generally outputs large quantities of rules without order of any kind, which contradicts the principle of knowledge readability and usability for a decision process. Experiments using a direct application of association rule algorithms such as Apriori resulted in thousands of rules. We can then seriously contest the quality of the view of the studied domain that the association rules provide if the user has to explore thousands of rules. We can also contest the quality of the induction itself if the effort the user has to spend interpreting the association rules is nearly the same as the effort needed to reach the same domain understanding by browsing the database directly.
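
The frequent set induction and rule calculation steps mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified levelwise search in the spirit of Apriori, and it assumes the standard support and confidence measures, which the text does not name explicitly. The toy transactions, the function names and the threshold values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Levelwise search: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    current = [frozenset([item]) for item in items]
    while current:
        level = {}
        for candidate in current:
            # Support = fraction of transactions containing the candidate.
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Build (k+1)-candidates as unions of two frequent k-itemsets.
        keys = list(level)
        current = list(
            {a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1}
        )
    return frequent


def association_rules(frequent, min_confidence):
    """Derive every rule antecedent -> consequent from the frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already stored in `frequent`.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    consequent = itemset - antecedent
                    rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical toy database: five transactions over three items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

freq = frequent_itemsets(transactions, min_support=0.3)
all_rules = association_rules(freq, min_confidence=0.0)
print(len(freq), "frequent itemsets,", len(all_rules), "rules")
```

Even on this tiny example, the number of rules exceeds the number of transactions, and each additional frequent item multiplies the candidate antecedent/consequent splits. Raising `min_confidence` shortens the list, but it only filters rules one by one and imposes no ordering or structure on what remains.
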
A classical answer to this problem is to set high thresholds on quality indices that evaluate individual rules, to eliminate the least pertinent rules as measured by these indices. But there are cases where this strategy cannot be
