Dynamics of a Bayesian Hyperparameter in a Markov Chain




Abstract. The free energy principle which underlies active inference attempts to explain the emergence of Bayesian inference in stochastic processes under the assumption of (non-equilibrium) steady state distributions. We contribute a study of the dynamics of an exact Bayesian inference hyperparameter embedded in a Markov chain that infers the dynamics of an observed process. This system does not have a steady state but still contains exact Bayesian inference. Our study may contribute to future generalizations of the free energy principle to non-steady state systems. Our treatment uses well-known constructions in Bayesian inference. The main contribution is that we take a different perspective than that of standard treatments. We are interested in how the dynamics of Bayesian inference look from the outside.

Keywords: Free energy principle · Active inference · Markov blankets · Bayesian inference

1 Introduction

One of the most fundamental components of the free energy principle is the approximate Bayesian inference lemma [1]. It claims to provide a sufficient condition for (possibly approximate) Bayesian inference to occur within an ergodic multivariate Markov process. The condition is that there is a partitioning of the variables into internal, active, sensory, and external variables such that the steady-state distribution factorizes in a particular way. If we write μ for internal, a for active, s for sensory, η for external variables, and p∗ for the steady-state density, then the required factorization is the conditional independence relation

p∗(μ, η | s, a) = p∗(μ | s, a) p∗(η | s, a).  (1)

This means that (S, A) form a Markov blanket for μ and also for η. However, Bayesian inference can also happen inside processes that don’t have steady-state densities. We will illustrate this with two examples below. This explicitly shows that ergodicity and the corresponding Markov blanket condition are only sufficient for Bayesian inference and not necessary.

© Springer Nature Switzerland AG 2020. T. Verbelen et al. (Eds.): IWAI 2020, CCIS 1326, pp. 35–41, 2020. https://doi.org/10.1007/978-3-030-64919-7_5
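As a concrete illustration (ours, not part of the paper), the factorization in Eq. (1) can be checked numerically on a toy discrete distribution that satisfies it by construction; all variable names and dimensions here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete state space: two values each for mu, eta, s, a.
n_mu, n_eta, n_s, n_a = 2, 2, 2, 2

# Build a joint p(mu, eta, s, a) satisfying the blanket condition by
# construction: mu and eta are drawn independently given (s, a).
p_sa = rng.random((n_s, n_a))
p_sa /= p_sa.sum()
p_mu_given_sa = rng.random((n_mu, n_s, n_a))
p_mu_given_sa /= p_mu_given_sa.sum(axis=0, keepdims=True)
p_eta_given_sa = rng.random((n_eta, n_s, n_a))
p_eta_given_sa /= p_eta_given_sa.sum(axis=0, keepdims=True)

# joint[mu, eta, s, a] = p(mu|s,a) * p(eta|s,a) * p(s,a)
joint = np.einsum('isa,jsa,sa->ijsa', p_mu_given_sa, p_eta_given_sa, p_sa)

# Recover the conditionals from the joint and verify Eq. (1).
p_sa_marg = joint.sum(axis=(0, 1))
cond_joint = joint / p_sa_marg                     # p(mu, eta | s, a)
cond_mu = cond_joint.sum(axis=1, keepdims=True)    # p(mu | s, a)
cond_eta = cond_joint.sum(axis=0, keepdims=True)   # p(eta | s, a)
assert np.allclose(cond_joint, cond_mu * cond_eta)
```

If the last assertion holds, (s, a) screens off μ from η in this toy joint, i.e. (S, A) is a Markov blanket in the sense used above.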

M. Biehl and R. Kanai

Often, the dynamics of the hyperparameters1 of Bayesian inference are relegated to the background and the focus is on how to compute posteriors for a given hyperparameter or prior. Embedding both the observed process and the hyperparameter into a Markov chain converts standard results into a setting very similar to that of the free energy principle in [1]. The differences are that we have a discrete, countably infinite state space instead of a continuous one, discrete instead of continuous time, and, in the current version, no actions. We will include actions in our setting in future work. Methods for transitioning to continuous systems are well studied, so we are optimistic that insights from the discrete setting can eventually be carried over to the continuous domain. In general we think that the method of embedding Bayesian inference and possibly also approximate Bayesian inference
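A minimal sketch of this kind of embedding (our illustration, not necessarily the authors' exact construction) is the Beta–Bernoulli conjugate pair: the hyperparameter (α, β) updates deterministically with each observation, so it is itself a Markov chain on the countable state space of count pairs. Because α + β grows by one at every step, the chain never revisits a state and has no steady-state distribution, yet every state encodes an exact posterior:

```python
import random

random.seed(1)

# Hidden coin bias being inferred; not available to the inference itself.
TRUE_THETA = 0.7

# Hyperparameter state: Beta(alpha, beta) counts, starting from the
# uniform prior Beta(1, 1). Each observation moves the state
# deterministically, so (alpha, beta) is a Markov chain on N^2.
alpha, beta = 1, 1
for _ in range(10_000):
    x = random.random() < TRUE_THETA   # Bernoulli observation
    alpha, beta = alpha + x, beta + (1 - x)

# No steady state exists (alpha + beta increases every step), but the
# current state always encodes the exact posterior, e.g. its mean:
posterior_mean = alpha / (alpha + beta)
print(posterior_mean)  # concentrates near TRUE_THETA as data accumulates
```

This is exact Bayesian inference living inside a non-ergodic Markov chain, which is the kind of counterexample to the necessity of the steady-state condition that the text announces.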