Pattern Structures and Concept Lattices for Data Mining and Knowledge Processing

This article aims at presenting recent advances in Formal Concept Analysis (2010-2015), especially when the question is dealing with complex data (numbers, graphs, sequences, etc.) in domains such as databases (functional dependencies), data-mining (local

  • PDF / 123,126 Bytes
  • 5 Pages / 439.37 x 666.142 pts Page_size
  • 19 Downloads / 145 Views

DOWNLOAD

REPORT


Universit´e de Lyon, CNRS, INSA-Lyon, LIRIS, UMR5205, 69621 Lyon, France [email protected] 2 LORIA (CNRS - Inria Nancy Grand Est - Universit´e de Lorraine), B.P. 239, 54506 Vandœuvre-l`es-Nancy, France 3 Universitat Polit`ecnica de Catalunya, 08032 Barcelona, Catalonia, Spain 4 National Research University Higher School of Economics (HSE), Kochnovski pr.3, Moscow 125319, Russia

Abstract. This article aims at presenting recent advances in Formal Concept Analysis (2010-2015), especially when the question is dealing with complex data (numbers, graphs, sequences, etc.) in domains such as databases (functional dependencies), data-mining (local pattern discovery), information retrieval and information fusion. As these advances are mainly published in artificial intelligence and FCA dedicated venues, a dissemination towards data mining and machine learning is worthwhile.

1

Pattern Structures in Formal Concept Analysis

Formal Concept Analysis (FCA) is a branch of applied lattice theory that appeared in the 1980’s [11]. Starting from a binary relation between a set of objects and a set of attributes, formal concepts are built as maximal sets of objects in relation with maximal sets of attributes, by means of derivation operators forming a Galois connection. Concepts form a partially ordered set that represents the initial data as a hierarchy, called the concept lattice. This conceptual structure has proved to be useful in many fields, e.g. artificial intelligence, knowledge management, data-mining and machine learning, morphological mathematics, etc. In particular, several results and algorithms from itemset and association rule mining and rule-based classifiers were already characterized in terms of FCA [17,20]. For example, the set of frequent closed itemsets is an order ideal of a concept lattice; association rules and functional dependencies can be characterized with the derivation operators; jumping patterns were defined as hypotheses, etc, not to mention efficient polynomial-delay algorithms for building all closed itemsets such as CloseByOne [18]. The goal of this communication is to present our recent advances in FCA over the period 2010–2015, especially when the question is dealing with complex data, thanks to the rich formalism of pattern structures [10]. This general approach translates FCA to any partially ordered data descriptions to deal elegantly with c Springer International Publishing Switzerland 2015  A. Bifet et al. (Eds.): ECML PKDD 2015, Part III, LNAI 9286, pp. 227–231, 2015. DOI: 10.1007/978-3-319-23461-8 19

228

M. Kaytoue et al.

non binary, say complex, heterogeneous and structured data. Pattern structures also allow new ways of solving problems in several applications (see next section). The key idea relies on defining so-called similarity operators which induce a semi-lattice on data descriptions. Several alternative attempts were made for defining such semi-lattices on sets of graphs and logical formulas (see, e.g., the works of Chaudron&Maille, Ferr´e&Ridoux, Polaillon&Brito cited in [1