A multi-agent-based algorithm for data clustering



REGULAR PAPER

Lutiele M. Godois1 · Diana F. Adamatti1 · Leonardo R. Emmendorfer1

Received: 23 October 2019 / Accepted: 28 July 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Clustering algorithms aim to detect groups based on similarity, from a given set of objects. Many clustering techniques have been proposed, most requiring the user to set critical parameters, such as the number of groups. This work presents the implementation and evaluation of a clustering algorithm based on a multi-agent system, which automatically detects the number of groups and the group labels for a given dataset. Groups formed during the clustering process emerge as patterns from the interaction among agents. The proposed algorithm is experimentally validated over benchmark datasets from the literature. The quality of clustering results is computed using seven internal indexes and one external index. Under this methodology, the proposed algorithm is compared to K-means and DBSCAN (density-based spatial clustering of applications with noise).

Keywords Data clustering · Multi-agent systems · Cluster validation

Diana F. Adamatti [email protected]
Lutiele M. Godois [email protected]
Leonardo R. Emmendorfer [email protected]

1 Computer Science Center, Federal University of Rio Grande, Rio Grande, RS 96203-900, Brazil

1 Introduction

The intuitive idea behind the notion of data clustering is the search for groups of objects which share some kind of similarity. The task of clustering is relevant in the context of machine learning due to its ability to reveal useful patterns from datasets. The topic has gained importance over recent years, mostly as a result of successful applications in a wide range of fields, such as biology, medicine, psychology, and image processing.

Clustering algorithms partition data objects (entities, instances, observations, units) into a number of clusters (groups, subsets, or categories) [32]. There is a large number of clustering algorithms, which tend to specialize in specific characterizations of data [8,19]. Note that the diversity of clustering techniques leads to diverse possible clustering solutions for a single given dataset. This is mainly due to the wide variety of procedures and criteria adopted by the available algorithms. The ad hoc definition of input parameters, such as the desired number of groups, also strongly affects the clustering results for most algorithms. Robustness to the input parameters is therefore a design goal for some novel clustering algorithms, so that the clustering results are influenced as little as possible by the input values.

In this work, we propose the implementation and evaluation of a clustering algorithm which attempts to address those concerns by adopting a multi-agent system which self-organizes when searching for clustering solutions. Multi-agent systems are composed of multiple interacting computational elements, known as agents. Each agent has the ability to act auto