DKDD_C: A Clustering-Based Approach for Distributed Knowledge Discovery



Abstract. In this paper, we address the problem of knowledge discovery. Several approaches have been proposed in this field. However, existing approaches generate a huge number of association rules that are difficult to exploit and assimilate. Moreover, they have not proven themselves in a distributed context. As a contribution, we propose DKDD_C, a new Distributed Knowledge Discovery approach. Exploiting KDD based on data classification, we give the user the choice either to generate Meta-Rules (rules between classes arising from a preliminary data classification) or to generate classical Rules between distributed data. DKDD_C takes place in both local and global processes. We prove that our solution minimizes the number of generated distributed association rules and thus offers a better interpretation of the data and an optimized execution time. This approach has been validated by the implementation of a user-friendly platform, as an extension of the Weka platform, for the support of Distributed KDD.

Keywords: Distributed knowledge discovery · Mining association rules · Distributed database · Clustering · Weka platform extension

M. Bouraoui et al.
© Springer International Publishing Switzerland 2016. Y. Tan et al. (Eds.): ICSI 2016, Part II, LNCS 9713, pp. 187–197, 2016. DOI: 10.1007/978-3-319-41009-8_20

1 Introduction

Nowadays, our ability to collect and store data of any type exceeds our capacity for analysis, synthesis and Knowledge Discovery in Data (KDD). Moreover, the performance of conventional centralized approaches degrades, in terms of execution time and memory space, as the size of the processed data increases; hence the emergence of Distributed Knowledge Discovery (DKDD). Several approaches and tools have been proposed in this context. Through our study, we found that these theoretical and practical approaches have different limits:

• Theoretically, DKDD algorithms generate a huge number of association rules that are difficult to exploit and assimilate.
• Practically, existing tools (1) support only some KDD algorithms, which generate a large number of association rules that are difficult to assimilate; (2) have not proven themselves in a distributed context; and (3) are applied to only one restricted type of data.

We propose, in this paper, DKDD_C, a distributed knowledge discovery approach based on classification, which minimizes the number of generated distributed association rules and thus offers a better interpretation of the data and optimizes both memory space and execution time. Exploiting KDD based on data classification, we give the user the choice either to generate Meta-Rules (rules between classes arising from a preliminary data classification) or to generate Rules between distributed data without preliminary classification. This approach has been validated by the implementation of a user-friendly platform, as an extension of the Weka platform, for the support of DKDD.

This paper is organized as follows: S
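To make the distinction between classical Rules and Meta-Rules concrete, the following Python sketch may help. It is not the DKDD_C algorithm itself, which is not given in this excerpt; it is a minimal illustration under stated assumptions. Each site counts items (or class labels) locally, and a global step merges the counts and derives one-to-one association rules. The function names, the toy transactions, the thresholds, and the fixed `item_class` mapping are all hypothetical; in particular, the mapping stands in for the preliminary classification step, which the paper performs by clustering.

```python
from collections import Counter
from itertools import combinations


def local_pair_counts(transactions, label=lambda item: item):
    """Local step (one site): count single labels and label pairs.

    `label` maps an item to the unit being mined: the item itself for
    classical rules, or its class for meta-rules.
    """
    singles, pairs = Counter(), Counter()
    for transaction in transactions:
        labels = sorted({label(item) for item in transaction})
        singles.update(labels)
        pairs.update(combinations(labels, 2))
    return singles, pairs, len(transactions)


def global_rules(site_results, min_support, min_confidence):
    """Global step: merge the site counts, then keep rules lhs -> rhs
    whose global support and confidence reach the thresholds."""
    singles, pairs, total = Counter(), Counter(), 0
    for site_singles, site_pairs, size in site_results:
        singles.update(site_singles)
        pairs.update(site_pairs)
        total += size
    rules = []
    for (a, b), count in pairs.items():
        support = count / total
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / singles[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical toy data: two sites, three transactions each.
sites = [
    [{"milk", "bread"}, {"butter", "bread"}, {"milk", "beer"}],
    [{"cheese", "bread"}, {"milk", "wine"}, {"butter", "wine"}],
]

# Assumption: a fixed item-to-class mapping replaces the clustering step.
item_class = {
    "milk": "dairy", "butter": "dairy", "cheese": "dairy",
    "bread": "bakery", "beer": "drinks", "wine": "drinks",
}

classical = global_rules([local_pair_counts(s) for s in sites], 0.1, 0.3)
meta = global_rules(
    [local_pair_counts(s, item_class.get) for s in sites], 0.1, 0.3
)

print(len(classical), "classical rules vs", len(meta), "meta-rules")
```

On this toy data, mining at the class level yields fewer rules than mining at the item level, which is the effect the approach aims for. The size of the reduction depends entirely on the data and on how items are grouped into classes.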

