Hyper Autoencoders



Derya Soydaner

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

We introduce the hyper autoencoder architecture, in which a secondary hypernetwork generates the weights of the encoder and decoder layers of the primary autoencoder. The hyper autoencoder uses a one-layer linear hypernetwork to predict all the weights of an autoencoder, taking only one embedding vector as input. The hypernetwork is smaller and as such acts as a regularizer. Like the vanilla autoencoder, the hyper autoencoder can be used for unsupervised or semi-supervised learning. In this study, we also present a semi-supervised model that combines convolutional neural networks and autoencoders with the hypernetwork. Our experiments on five image datasets, namely MNIST, Fashion MNIST, LFW, STL-10 and CelebA, show that the hyper autoencoder performs well on both unsupervised and semi-supervised learning problems.

Keywords Autoencoder · Hypernetwork · Image processing · Deep learning

1 Introduction

In recent years, deep neural networks have led to significant breakthroughs in a variety of application areas, from computer vision to language translation to game playing. Such networks are typically composed of many layers of processing units, and these units, arranged at different levels of abstraction, automatically learn the best features for the task. With the availability of very large datasets and high computing power through parallel architectures, such networks are now used successfully in an ever-widening range of applications. Still, training a deep neural network has its problems: when the network is big, there are more free parameters, and one has to use regularization to make sure that the network does not overfit. This has led to methods such as dropout, weight decay, early stopping, and so on. A related approach is the convolutional architecture, where a unit is connected to only a small subset of units in the preceding layer; together with weight sharing, this leads to a significant decrease in the total number of free parameters. The autoencoder is one of the earliest neural network architectures, originally named the autoassociator by [8], and it has since been successfully used in a variety of applications, for unsupervised or semi-supervised tasks, or as a preprocessing stage for supervised tasks.


Derya Soydaner, [email protected]
Department of Statistics, Mimar Sinan Fine Arts University, İstanbul, Turkey



It is composed of an encoder and a decoder, and when these each contain many layers, the autoencoder may suffer from the same problems as any deep network. Recently, the hypernetwork, in which a simpler hyper neural network is trained to generate the weights of the larger main network, was introduced [12]. In this paper, we propose the hyper autoencoder architecture, in which the weights of the encoder and decoder layers are predicted by a hypernetwork. The conventional hypernetwork is defined as a two-layer network that takes different embedding vectors as input.
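To make the mechanism concrete, the following is a minimal sketch, written in PyTorch (an assumption; the paper does not fix a framework), of an autoencoder whose weights are generated by a one-layer linear hypernetwork from a single learnable embedding vector. All class names, layer sizes, and the embedding dimension are illustrative choices, not the paper's exact configuration, and in this naive form the hypernetwork need not actually be smaller than the autoencoder it generates.

    import torch
    import torch.nn as nn

    class HyperAutoencoder(nn.Module):
        """A one-layer linear hypernetwork maps one embedding vector z
        to all weights of a one-hidden-layer autoencoder (sketch)."""

        def __init__(self, input_dim=784, hidden_dim=64, embed_dim=32):
            super().__init__()
            self.input_dim, self.hidden_dim = input_dim, hidden_dim
            # Learnable embedding vector fed to the hypernetwork.
            self.z = nn.Parameter(torch.randn(embed_dim))
            # Total number of weights to generate: encoder + decoder matrices.
            n_weights = 2 * input_dim * hidden_dim
            # One linear layer predicts all autoencoder weights at once.
            self.hyper = nn.Linear(embed_dim, n_weights)
            # Biases of the main autoencoder stay ordinary parameters here.
            self.b_enc = nn.Parameter(torch.zeros(hidden_dim))
            self.b_dec = nn.Parameter(torch.zeros(input_dim))

        def forward(self, x):
            # Generate the weight vector, then reshape it into the two matrices.
            w = self.hyper(self.z)
            split = self.input_dim * self.hidden_dim
            w_enc = w[:split].view(self.hidden_dim, self.input_dim)
            w_dec = w[split:].view(self.input_dim, self.hidden_dim)
            h = torch.relu(nn.functional.linear(x, w_enc, self.b_enc))        # encoder
            return torch.sigmoid(nn.functional.linear(h, w_dec, self.b_dec))  # decoder

    model = HyperAutoencoder()
    x = torch.rand(16, 784)                      # e.g. a batch of MNIST-sized images
    loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss
    loss.backward()                              # gradients flow into the hypernetwork

Because the generated weights are a differentiable function of z and of the hypernetwork's parameters, the usual reconstruction loss trains the hypernetwork and the embedding end to end.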