Fitting Aggregation Functions to Data: Part II - Idempotization

1 Faculty of Mathematics and Information Science, Warsaw University of Technology, ul. Koszykowa 75, 00-662 Warsaw, Poland
  [email protected]
2 School of Information Technology, Deakin University, 221 Burwood Hwy, Burwood, VIC 3125, Australia
  {gleb,sjames}@deakin.edu.au
3 Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
  [email protected]

Abstract. The use of supervised learning techniques for fitting weights and/or generator functions of weighted quasi-arithmetic means – a special class of idempotent and nondecreasing aggregation functions – to empirical data has already been considered in a number of papers. Nevertheless, there are still some important issues that have not been discussed in the literature yet. In the second part of this two-part contribution we deal with a quite common situation in which we have inputs coming from different sources, describing a similar phenomenon, but which have not been properly normalized. In such a case, idempotent and nondecreasing functions cannot be used to aggregate them unless proper preprocessing is performed. The proposed idempotization method, based on the notion of B-splines, allows for an automatic calibration of independent variables. The introduced technique is applied in an R source code plagiarism detection system.

Keywords: Aggregation functions · Weighted quasi-arithmetic means · Least squares fitting · Idempotence

1 Introduction

Idempotent aggregation functions – mappings F : [0, 1]^n → [0, 1] that are nondecreasing in each variable and fulfil F(x, …, x) = x for all x ∈ [0, 1] – have numerous applications in areas such as decision making, pattern recognition, and data analysis; compare, e.g., [8,11]. For a fixed n ≥ 2, let w ∈ [0, 1]^n be a weighting vector, i.e., one with ∑_{i=1}^n w_i = 1.

In the first part [1] of this two-part contribution we dealt with two important practical issues concerning supervised learning of weights of weighted quasi-arithmetic means with a known continuous and strictly monotone generator ϕ : [0, 1] → R̄, that is, idempotent aggregation functions given for arbitrary x ∈ [0, 1]^n by the formula:

  WQAMean_{ϕ,w}(x) = ϕ^{-1}( ∑_{i=1}^n w_i ϕ(x_i) ).
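To make the definition concrete, here is a minimal Python sketch (ours, not taken from the paper; the function name wqa_mean and the choice of the power generator ϕ(t) = t² are illustrative assumptions) that evaluates a weighted quasi-arithmetic mean for a user-supplied generator and numerically checks idempotence:

```python
import numpy as np

def wqa_mean(x, w, phi, phi_inv):
    """Weighted quasi-arithmetic mean: phi^{-1}(sum_i w_i * phi(x_i)).

    x, w    -- arrays of equal length, x in [0, 1], w nonnegative with sum(w) == 1
    phi     -- continuous, strictly monotone generator on [0, 1]
    phi_inv -- its inverse
    """
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    return phi_inv(np.sum(w * phi(x)))

# Illustrative generator: phi(t) = t**2 gives a weighted power (quadratic) mean.
w = np.array([0.2, 0.3, 0.5])
x = np.array([0.4, 0.7, 0.9])
y = wqa_mean(x, w, phi=lambda t: t**2, phi_inv=np.sqrt)

# Idempotence: aggregating a constant vector returns that constant.
assert np.isclose(wqa_mean([0.6, 0.6, 0.6], w, lambda t: t**2, np.sqrt), 0.6)
```

With ϕ equal to the identity this reduces to the ordinary weighted arithmetic mean.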

First of all, we observed that researchers most often consider an approximate version of the weight learning task that relies on a linearization of the input variables; compare, e.g., [7]. Therefore, we discussed possible implementations of the exact fitting procedure and identified cases where linearization leads to solutions of significantly worse quality in terms of the squared error between the desired and generated outputs. Secondly, we noted that the computed models may overfit a training data set and perform poorly on test and validation samples. Thus, some regularization methods were proposed to overcome this issue.
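As a rough illustration of the difference between the exact and linearized fitting criteria mentioned above, the following sketch (our own, not the authors' implementation; it assumes a fixed generator ϕ(t) = t² and uses SciPy's SLSQP solver) fits the weights under the simplex constraint, minimizing the squared error either on the original outputs (exact) or on the ϕ-transformed outputs (linearized):

```python
import numpy as np
from scipy.optimize import minimize

phi, phi_inv = (lambda t: t**2), np.sqrt  # assumed, fixed generator

def fit_weights(X, y, exact=True):
    """Fit weights of a weighted quasi-arithmetic mean by least squares.

    X -- (m, n) matrix of inputs in [0, 1]; y -- (m,) desired outputs.
    exact=True  minimizes sum_j (phi_inv(sum_i w_i phi(x_ji)) - y_j)**2;
    exact=False minimizes the linearized criterion on phi-transformed data.
    """
    m, n = X.shape
    PX, py = phi(X), phi(y)

    def sse(w):
        pred = PX @ w  # sum_i w_i phi(x_ji) for each observation j
        return (np.sum((phi_inv(pred) - y) ** 2) if exact
                else np.sum((pred - py) ** 2))

    w0 = np.full(n, 1.0 / n)  # start from equal weights
    res = minimize(sse, w0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}])
    return res.x

# Toy usage on synthetic data (illustration only):
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 3))
y = np.sqrt(X**2 @ np.array([0.2, 0.3, 0.5]))  # outputs generated by known weights
print(fit_weights(X, y, exact=True))
```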