Latent Timbre Synthesis
S.I.: Neural Networks in Art, Sound and Design
Audio-based variational autoencoders for music composition and sound design applications

Kıvanç Tatar¹ · Daniel Bisig² · Philippe Pasquier¹

¹ Simon Fraser University, Vancouver, BC, Canada
² Zurich University of the Arts, Zurich, Switzerland

Received: 26 June 2020 / Accepted: 5 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
We present Latent Timbre Synthesis, a new audio synthesis method using deep learning. The method allows composers and sound designers to interpolate and extrapolate between the timbres of multiple sounds using the latent space of audio frames. We detail two variational autoencoder architectures for Latent Timbre Synthesis and compare their advantages and drawbacks. The implementation includes a fully working application with a graphical user interface, called interpolate_two, which enables practitioners to generate timbres between two audio excerpts of their choice using interpolation and extrapolation in the latent space of audio frames. Our implementation is open source, and we aim to improve the accessibility of this technology by providing a guide for users of any technical background. Our study includes a qualitative analysis in which nine composers evaluated Latent Timbre Synthesis and the interpolate_two application within their practices.

Keywords: Audio synthesis · Neural networks · Signal processing · Computer-assisted music composition
1 Introduction

Modern sound synthesizers come loaded with many parameters, forming very large, nonlinear, non-modal search spaces. This richness comes at the expense of searchability: one cannot easily or efficiently find a particular sound or sonic texture, or generate a transition between two textures. Consequently, sound designers and musicians most often rely on audio samples (of instruments or sound effects) and their manipulation, rather than the more flexible approach of synthesizing these sounds and their sonic surroundings. In previous work on synthesizer preset generation [35], we demonstrated how, given a target sample, PresetGen can find the preset that generates the sound closest to that sample. In this work, we investigate a new method based on deep learning (DL) in which a synthesizer model is trained on selected audio textures, allowing musicians and sound designers to achieve their synthesis goals by exploring a sonic space through interpolation and extrapolation between sonic textures.

The rise in popularity of DL architectures has led to promising new research applying DL to musical applications, audio transformation, and sound synthesis [2]. The demand for sound synthesizers is projected to grow at an accelerated rate over the next five years [39], and in parallel there is increasing interest in flexible, versatile, yet controllable sound synthesis methods.
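To make the core idea concrete, the following is a minimal sketch, not the paper's actual implementation, of frame-wise interpolation and extrapolation between two sequences of latent vectors. The function name latent_mix, the array shapes, and the encoder/decoder pipeline referenced in the comments are illustrative assumptions:

    import numpy as np

    def latent_mix(z_a, z_b, alpha):
        """Frame-wise linear mix of two latent sequences.

        alpha in [0, 1] interpolates between excerpt A and excerpt B;
        values outside that range extrapolate past either excerpt.
        """
        return (1.0 - alpha) * z_a + alpha * z_b

    # Toy stand-ins for encoded audio frames; in an actual pipeline these
    # would come from the VAE encoder, and the mixed sequence would be fed
    # through the decoder and a phase-reconstruction step to produce audio.
    rng = np.random.default_rng(0)
    z_a = rng.normal(size=(128, 8))   # (num_frames, latent_dim), assumed shapes
    z_b = rng.normal(size=(128, 8))

    z_half = latent_mix(z_a, z_b, 0.5)    # interpolation: a halfway timbre
    z_beyond = latent_mix(z_a, z_b, 1.5)  # extrapolation: beyond excerpt B

A single scalar alpha thus gives practitioners one intuitive control over where the generated timbre sits relative to the two source excerpts.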