The Similarity Between Dissimilarities




1 Pattern Recognition Laboratory, Delft University of Technology, Delft, The Netherlands [email protected]
2 Biomedical Imaging Group Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
3 Transparency Lab, Amsterdam, The Netherlands transparencylab.com

Abstract. When characterizing teams of people, molecules, or general graphs, it is difficult to encode all information in a single feature vector. For such objects, dissimilarity matrices that capture the interaction or similarity between the sub-elements (people, atoms, nodes) can be used instead. This paper compares several representations of dissimilarity matrices that encode the cluster characteristics, latent dimensionality, or outliers of these matrices. It appears that both the simple eigenvalue spectrum and the histogram of distances are already quite effective, and are able to reach high classification performance in multiple instance learning (MIL) problems. Finally, an analysis of teams of people is given, illustrating the potential use of dissimilarity matrix characterization for business consultancy.
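As a rough illustration of the two simplest representations named in the abstract, the sketch below turns a distance matrix of any size into a fixed-length vector. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the number of bins, the `max_dist` range, the number of eigenvalues `k`, and the choice to take eigenvalues of the double-centered squared-distance matrix (as in classical scaling) are all assumptions made for this example.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X (N samples x d features)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_histogram(D, bins=10, max_dist=1.0):
    """Normalized histogram of the off-diagonal distances (upper triangle only).

    Distances larger than max_dist (an assumed, user-chosen range) are ignored.
    """
    iu = np.triu_indices_from(D, k=1)
    counts, _ = np.histogram(D[iu], bins=bins, range=(0.0, max_dist))
    return counts / max(counts.sum(), 1)

def eigenvalue_spectrum(D, k=5):
    """k largest eigenvalues of the double-centered squared-distance matrix.

    Matrices with fewer than k samples are zero-padded to length k.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of classical scaling
    eig = np.sort(np.linalg.eigvalsh(B))[::-1][:k]
    return np.pad(eig, (0, k - eig.size))

# Two hypothetical teams: different sizes AND different feature spaces.
rng = np.random.default_rng(0)
team_a = rng.random((6, 4))   # 6 employees, 4 questions
team_b = rng.random((9, 7))   # 9 employees, 7 questions

features = []
for team in (team_a, team_b):
    D = pairwise_distances(team)
    features.append(np.concatenate([
        distance_histogram(D, bins=10, max_dist=3.0),
        eigenvalue_spectrum(D, k=5),
    ]))

# Both teams are now described by vectors of the same length (10 + 5).
print(features[0].shape, features[1].shape)
```

Because both descriptors depend only on the distances within a set, the two teams can be compared even though they differ in size and were measured with different questions.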

1 Introduction

Consider the problem of evaluating and improving the performance of teams in organizations based on employee responses to questionnaires. The teams differ in size, and the roles of employees may be different in every organization. A key question for an organization's top management is how to support the autonomy of these teams while still keeping an eye on the overall process and the coherence of the teams' performance. Assuming a span of control of 10–15 direct reports for an average manager, a middle-sized organization may easily comprise hundreds of teams. So, pattern recognition in organizational development may supply fundamentally important information on how similar, or dissimilar, teams are [1,15,20]. A possible solution is to focus on the diversity within a team: is there a large group of people who are all doing a similar job, or are there some isolated groups of people who are doing something very different from the rest? Identifying such groups (clusters of employees) would help to compare the organizational structures on a higher level.

© Springer International Publishing AG 2016. A. Robles-Kelly et al. (Eds.): S+SSPR 2016, LNCS 10029, pp. 84–94, 2016. DOI: 10.1007/978-3-319-49055-7_8

More formally, in this paper we focus on comparing sets (teams) of different samples (employees), residing in different feature spaces (evaluation questions). Comparing the team structures would be equivalent to comparing similarity matrices, with each similarity matrix originating from a single team. Comparing similarities alleviates the problem of different feature spaces, yet is still not trivial because the sets can be of different sizes, and there are no natural correspondences between the samples. Comparing distance matrices has links with comparing graph structures: a distance matrix between N samples can be seen as a fully connected graph with N nodes, where the nodes are unlabeled and the edg