DeepECT: The Deep Embedded Cluster Tree



Dominik Mautz¹ · Claudia Plant² · Christian Böhm³

Received: 2 April 2020 / Revised: 26 June 2020 / Accepted: 26 June 2020
© The Author(s) 2020

Abstract
The idea of combining the high representational power of deep learning techniques with clustering methods has gained much attention in recent years. Optimizing a clustering objective and the dataset representation simultaneously has been shown to be advantageous over optimizing them separately. So far, however, all proposed methods have used a flat clustering strategy, with the actual number of clusters known a priori. In this paper, we propose the Deep Embedded Cluster Tree (DeepECT), the first divisive hierarchical embedded clustering method. The cluster tree does not need to know the actual number of clusters during optimization. Instead, the level of detail to be analyzed can be chosen afterward and for each sub-tree separately. An optional data-augmentation-based extension allows DeepECT to ignore prior-known invariances of the dataset, such as affine transformations in image data. We evaluate and show the advantages of DeepECT in extensive experiments.

Keywords: Embedded clustering · Hierarchical clustering · Autoencoder · Deep learning

Abbreviations
ACC	Clustering accuracy
AE	Autoencoder
AE + Complete	AE combined with agglomerative clustering with complete linkage
AE + Single	AE combined with agglomerative clustering with single linkage
DEC	Deep Embedded Cluster algorithm [3]
IDEC	Improved Deep Embedded Cluster algorithm [5]
DeepECT	Deep Embedded Cluster Tree
DeepECT + Aug	DeepECT with the optional augmentation extension
DP	Dendrogram purity
Eq.	Equation
LP	Leaf purity
NMI	Normalized mutual information
ReLU	Rectified linear unit
URL	Uniform resource locator

* Corresponding author: Dominik Mautz, [email protected]
Claudia Plant, [email protected]
Christian Böhm, [email protected]

1 LMU München, Munich, Germany
2 Faculty of Computer Science, ds:UniVie, University of Vienna, Vienna, Austria
3 MCML, LMU München, Munich, Germany

1 Introduction

Clustering algorithms are a fundamental tool for data mining tasks. However, of similar importance is the representation of the data to be clustered, and this, in turn, depends on the data domain. In the last decade, deep learning techniques have achieved breakthroughs in areas that were previously very challenging for machine learning and data mining methods. These areas include images, graph structures, text, video, and audio. Many of these success stories have been made in the context of supervised learning. Furthermore, neural-network-based, unsupervised representation learning has made it possible to embed these challenging domains into spaces more accessible to classical data mining methods. In recent years, the idea of simultaneously optimizing a clustering objective and the dataset representation has gained more traction. In this work, we call these methods either embedded clustering or deep clustering. The combined optimization holds the promise of improved results
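The two-stage pipeline sketched above, embedding the data with an autoencoder and then running a classical clustering method on the embedded points, corresponds to the "AE + Complete" baseline named in the abbreviation list. The following is a minimal sketch under simplifying assumptions: the toy dataset, the deliberately tiny linear autoencoder, and all hyperparameters are illustrative choices, not the paper's architecture or experimental setup.

```python
# Sketch of the "AE + Complete" baseline: first embed the data with an
# autoencoder, then run classical agglomerative clustering with complete
# linkage on the embedding. The autoencoder is a tiny linear one trained
# with plain gradient descent (illustrative only, not the paper's setup).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Toy dataset: two well-separated Gaussian blobs in 10 dimensions.
X = np.vstack([rng.normal(0.0, 0.2, (50, 10)),
               rng.normal(4.0, 0.2, (50, 10))])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize features

d_in, d_emb = X.shape[1], 2
W_enc = rng.normal(0, 0.1, (d_in, d_emb))  # encoder weights
W_dec = rng.normal(0, 0.1, (d_emb, d_in))  # decoder weights

lr = 0.01
for _ in range(300):
    Z = X @ W_enc                  # embed the data
    err = Z @ W_dec - X            # reconstruction error
    # Gradient steps on the mean squared reconstruction loss.
    g_dec = Z.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

# Classical hierarchical clustering on the learned embedding.
tree = linkage(X @ W_enc, method="complete")
labels = fcluster(tree, t=2, criterion="maxclust")
```

Note that in this two-stage scheme the representation is fixed before clustering begins, so the clustering objective cannot influence the embedding; embedded clustering methods such as DeepECT instead optimize both simultaneously, and the resulting cluster tree can be cut at the desired level of detail afterward.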