Sign Constrained Rectifier Networks with Applications to Pattern Decompositions



1 School of Computer Science and Software Engineering, The University of Western Australia, Crawley, WA 6009, Australia
{senjian.an,mohammed.bennamoun,ferdous.sohel}@uwa.edu.au, [email protected]
2 School of Electrical, Electronic and Computer Engineering, The University of Western Australia, Crawley, WA 6009, Australia
[email protected]

Abstract. In this paper we introduce sign constrained rectifier networks (SCRN), demonstrate their universal classification power and illustrate their applications to pattern decompositions. We prove that the proposed two-hidden-layer SCRN, with sign constraints on the weights of the output layer and on those of the top hidden layer, are capable of separating any two disjoint pattern sets. Furthermore, a two-hidden-layer SCRN of a pair of disjoint pattern sets can be used to decompose one of the pattern sets into several subsets so that each subset is convexly separable from the entire other pattern set; and a single-hidden-layer SCRN of a pair of convexly separable pattern sets can be used to decompose one of the pattern sets into several subsets so that each subset is linearly separable from the entire other pattern set. SCRN can thus be used to learn the pattern structures from the decomposed subsets of patterns, and to analyse the discriminant factors of different patterns from the linear classifiers of the linearly separable subsets in the decompositions. With such pattern decompositions exhibiting convex separability or linear separability, users can also analyse the complexity of the classification problem, and remove the outliers and the non-crucial points so as to improve the training of traditional unconstrained rectifier networks in terms of both performance and efficiency.

Keywords: Rectifier neural network · Pattern decomposition
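The architecture described above, a two-hidden-layer rectifier network whose top-hidden-layer and output weights carry sign constraints, can be illustrated with a minimal NumPy forward pass. The particular choice of signs in this sketch (elementwise nonnegative top-hidden-layer weights, nonpositive output weights, enforced by projection) is an illustrative assumption, not necessarily the exact construction used in the paper.

```python
import numpy as np

def relu(z):
    """Rectifier activation max(0, z), applied elementwise."""
    return np.maximum(0.0, z)

def scrn_forward(x, W1, b1, W2, b2, w3, b3):
    """Forward pass of a two-hidden-layer rectifier network with
    sign constraints on the top hidden layer and the output layer.

    The constraints are enforced here by projection (clipping):
    W2 is kept elementwise nonnegative and w3 elementwise
    nonpositive.  This sign pattern is an assumption made for
    illustration only.
    """
    W2 = np.maximum(W2, 0.0)   # sign constraint on top-hidden-layer weights
    w3 = np.minimum(w3, 0.0)   # sign constraint on output-layer weights
    h1 = relu(W1 @ x + b1)     # first hidden layer (unconstrained weights)
    h2 = relu(W2 @ h1 + b2)    # second (top) hidden layer
    return w3 @ h2 + b3        # scalar decision score for binary classification

# Tiny example: 2-D input, hidden widths 4 and 3.
rng = np.random.default_rng(0)
x = rng.normal(size=2)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
W2, b2 = rng.normal(size=(3, 4)), rng.normal(size=3)
w3, b3 = rng.normal(size=3), 0.5
score = scrn_forward(x, W1, b1, W2, b2, w3, b3)
```

A pattern would then be assigned to one class or the other according to the sign of `score`; the dimensions and initialisation here are arbitrary placeholders.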

1 Introduction

Deep rectifier networks have achieved great success in object recognition [4,8,10,18], face verification [14,15], speech recognition [3,6,12] and handwritten digit recognition [2]. However, the lack of understanding of the roles of the hidden layers makes deep networks difficult to interpret for tasks of discriminant factor analysis and pattern structure analysis. Towards a clearer understanding of the success of deep rectifier networks, a recent work [1] provides a constructive proof of the universal classification power of two-hidden-layer rectifier networks. For binary classification, the proof uses the first hidden layer to make the pattern sets convexly separable. The second hidden layer is then used to achieve linear separability, and finally a linear classifier is used to separate the patterns. Although this strategy can be used in constructive proofs, it cannot be used to analyse a learnt rectifier network, since it might not be verified in the empirical learning from data. Fortunately,

© Springer International Publishing Switzerland 2015. A. Appice et al. (Eds.): ECML PKDD 2015, Part I, LNAI 9284, pp. 546–559, 2015. DOI: 10.1007/978-3-319-23528-8_34