Sampling hierarchies of discrete random structures


Antonio Lijoi¹ · Igor Prünster¹ · Tommaso Rigon²

Received: 21 March 2019 / Accepted: 29 June 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Hierarchical normalized discrete random measures identify a general class of priors that is suited to flexibly learn how the distribution of a response variable changes across groups of observations. A special case widely used in practice is the hierarchical Dirichlet process. Although current theory on hierarchies of nonparametric priors yields all relevant tools for drawing posterior inference, their implementation comes at a high computational cost. We fill this gap by proposing an approximation for a general class of hierarchical processes, which leads to an efficient conditional Gibbs sampling algorithm. The key idea consists of a deterministic truncation of the underlying random probability measures leading to a finite dimensional approximation of the original prior law. We provide both empirical and theoretical support for such a procedure.

Keywords Bayesian nonparametrics · Discrete random structures · Hierarchical Dirichlet process · Normalized random measures · Pitman–Yor process

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s11222-020-09961-7) contains supplementary material, which is available to authorized users.

Antonio Lijoi [email protected]
Igor Prünster [email protected]
Tommaso Rigon [email protected]

¹ Department of Decision Sciences and Bocconi Institute of Data Science and Analytics, Bocconi University, Milan, Italy
² Department of Statistical Sciences, Duke University, Durham, NC, USA

1 Introduction

When investigating covariate-dependent observations $\{(X_{li})_{i \ge 1} : l \in L\}$ in a Bayesian framework, the standard assumption of exchangeability is not appropriate, since it amounts to treating the data as homogeneous. The covariate $l \in L$ is a source of heterogeneity that has to be taken into account, so a different symmetry condition should be specified for the data. Here we focus on the case where the covariate space is finite, i.e. $L = \{1, \ldots, d\}$, and identifies data that are recorded under $d$ different, though related, experimental conditions. In view of this, a natural dependence structure is implied by partial exchangeability, according to which exchangeability holds true within each of the $d$ separate groups of observations, but not across them.

More formally, let $\mathbb{X}$ be the sample space and let $\mathcal{X}$ denote its Borel σ-algebra. For the sake of generality, the space $\mathbb{X}$ is assumed to be Polish, although in practice one typically has $\mathbb{X} \subseteq \mathbb{R}^p$. Moreover, $P_{\mathbb{X}}$ stands for the space of probability measures on $\mathbb{X}$. The array of $\mathbb{X}$-valued random elements $\{(X_{li})_{i \ge 1} : l = 1, \ldots, d\}$ is partially exchangeable if and only if, for any $i = 1, \ldots, n^{(l)}$ and any $l = 1, \ldots, d$,

$$
(X_{li} \mid \tilde{p}_l) \overset{\text{ind}}{\sim} \tilde{p}_l, \qquad (\tilde{p}_1, \ldots, \tilde{p}_d) \sim Q_d, \tag{1}
$$

for some probability measure $Q_d$ on the product spa