Branching Processes and Their Applications in the Analysis of Tree Structures and Tree Algorithms

We give a partial overview of some results from the rich theory of branching processes and illustrate their use in the probabilistic analysis of algorithms and data structures. The branching processes we discuss include the Galton-Watson process, the bran

  • PDF / 5,994,828 Bytes
  • 66 Pages / 439.37 x 666.142 pts Page_size
  • 32 Downloads / 195 Views

DOWNLOAD

REPORT


Summary. We give a partial overview of some results from the rich theory of branching processes and illustrate their use in the probabilistic analysis of algorithms and data structures. The branching processes we discuss include the GaltonWatson process, the branching random walk, the Crump-Mode-Jagers process, and conditional branching processes. The applications include the analysis of the height of random binary search trees, random m-ary search trees, quadtrees, union-find trees, uniform random recursive trees and plane-oriented recursive trees. All these trees have heights that grow logarithmically in the size of the tree. A different behavior is observed for the combinatorial models of trees, where one considers the uniform distribution over all trees in a certain family of trees. In many cases, such trees are distributed like trees in a Galton-Watson process conditioned on the tree size. This fact allows us to review Cayley trees (random labeled free trees), random binary trees, random unary-binary trees, random oriented plane trees, and indeed many other species of uniform trees. We also review a combinatorial optimization problem first suggested by Karp and Pearl. The analysis there is particularly beautiful and shows the flexibility of even the simplest branching processes.

1. Branching Processes 1.1 Branching Processes Around 1873, Galton and Watson came up with a model for explaining the disappearance of certain family names in England (see the historical survey by Kendall, 1966). Their model, now known as the Galton-Watson process, is extremely simple: in a population, we begin with one pater familias, or root. The root has Zl children, where Zl has a fixed distribution (the reproduction distribution): it is convenient to let Z denote a prototypical random variable with this distribution, and to set Pi

= Pr( Z = i)

, i~0.

Each child in turn reproduces independently according to the same distribution, and so forth. This leads to a random tree, the Galton-Watson tree, and a random process, the Galton-Watson process. Let Zi denote the number of M. Habib et al. (eds.), Probabilistic Methods for Algorithmic Discrete Mathematics © Springer-Verlag Berlin Heidelberg 1998

250

Luc Devroye

particles in the i-th generation, with Zo = 1. Only one of two possible situations can occur: either the population survives forever (Zi > 0 for all i), or it becomes extinct after a finite time. To analyze the Galton-Watson process it is convenient to use the RGF (the reproduction generating function), or simply generating function 00

f(s)

= LPkSk = E(SZl)

, s E [0,1] .

k=O

This is a function of s that contains exactly the same information as the vector (PO,Pl. .. .). It is strictly convex (if PI i= 1) and increases from Po at s = 0 to 1 at s = 1. Different RGF'S define different Galton-Watson branching processes. Intuitively, it should be clear that a population explodes if the expected number of children per particle is greater than one, and that it is bound to shrink if it is less than one. An important paramet