Music generation with variational recurrent autoencoder supported by history
Ivan P. Yamshchikov¹ · Alexey Tikhonov²

Received: 3 February 2020 / Accepted: 14 October 2020
© The Author(s) 2020. Open Access.
Abstract

A new artificial neural network architecture that helps generate longer melodic patterns is introduced, alongside methods for post-generation filtering. The proposed approach, called variational autoencoder supported by history, is based on a recurrent highway gated network combined with a variational autoencoder. The combination of this architecture with filtering heuristics allows the generation of pseudo-live, acoustically pleasing, melodically diverse music.

Keywords Music generation · Discrete sequences generation · Artificial intelligence

Mathematics Subject Classification 68T50 · 68T99
1 Introduction

The rapid progress of artificial neural networks is gradually erasing the border between the arts and the sciences. A significant number of results demonstrate how areas previously regarded as entirely human due to their creative or intuitive nature are now being opened up to algorithmic approaches [24]. Music is one of these areas. Indeed, there were a number of attempts to automate the process of music composition long before the era of artificial neural networks. The well-developed theory of music inspired a number of heuristic approaches to automated composition. The earliest idea that we know of dates as far back as the nineteenth century, see [15]. In the middle of the twentieth century, a Markov-chain approach to music composition was developed in [8]. Despite these advances, Lin and Tegmark [14] demonstrated that music, like some other types of human-generated discrete time series, tends to exhibit long-distance dependencies that cannot be captured by models based on Markov chains. Recurrent neural networks (RNNs), on the other hand, are better able
to process data series with longer internal dependencies [21], such as sequences of notes in a tune [1]. Indeed, a variety of recurrent architectures, such as hierarchical RNNs, gated RNNs, long short-term memory (LSTM) networks, and recurrent highway networks, have been used successfully for music generation in [4–6, 10, 20, 28] or [23]. Yang et al. [27] use generative adversarial networks for the same task. For a broad overview of generative models for music, we refer the reader to [3].

The similarity between the problem setup for note-by-note music generation and the setup used in word-by-word text generation makes it reasonable to review some of the methods that have proved useful in generative natural language processing tasks. We would like to focus on the variational autoencoder (VAE) proposed in [2, 18]. A VAE makes assumptions about the distribution of the latent variables and applies a variational approach to latent representation learning. This yields an additional loss component and a specific training algorithm called Stochastic Gradient Variational Bayes (SGVB), see [16] as well as [11]. Thus, a generat
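To make the "additional loss component" of the VAE concrete, the following is a minimal NumPy sketch of the standard VAE objective: a reconstruction term plus the KL divergence between a diagonal Gaussian approximate posterior and a standard normal prior. This is a generic illustration under common assumptions (squared-error reconstruction, Gaussian latents), not the specific model of this paper, and all function names are illustrative.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ),
    # summed over the latent dimensions.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def vae_loss(x, x_reconstructed, mu, log_var):
    # Negative evidence lower bound (ELBO) for one example:
    # squared-error reconstruction term plus the KL regularizer
    # that the variational approach introduces.
    reconstruction = np.sum((x - x_reconstructed) ** 2)
    return reconstruction + kl_to_standard_normal(mu, log_var)

x = np.array([0.5, -0.2, 0.1])       # toy "input" vector
x_hat = np.array([0.4, -0.1, 0.0])   # toy "reconstruction"
mu = np.zeros(2)                     # posterior mean for 2 latent dims
log_var = np.zeros(2)                # posterior log-variance

# With mu = 0 and log_var = 0 the KL term vanishes, so the loss
# reduces to the reconstruction error alone.
print(vae_loss(x, x_hat, mu, log_var))
```

In SGVB training, gradients of this objective flow through a stochastic sample of the latent variable via the reparameterization trick; the KL term is what distinguishes the VAE loss from a plain autoencoder's reconstruction loss.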