Random Graphs as Null Models

In the last chapter, a qualitative comparison of various real-world structures with classic random graph models revealed that complex networks are non-random in many aspects. This chapter focuses on the question of how to quantify the statistical signific

  • PDF / 899,691 Bytes
  • 32 Pages / 439.37 x 666.142 pts Page_size
  • 62 Downloads / 261 Views

DOWNLOAD

REPORT


Random Graphs as Null Models

Abstract In the last chapter, a qualitative comparison of various real-world structures with classic random graph models revealed that complex networks are nonrandom in many aspects. This chapter focuses on the question of how to quantify the statistical significance of an observed network structure with respect to a given random graph model. The chapter starts with a discussion of the statistical significance of a given percentage of reciprocal edges in a directed graph and then introduces a new random graph model in which the degree sequence(s) are maintained. Finally, the notion of network motifs is introduced.

7.1 Introduction This chapter is concerned with random graph models as null models to evaluate the statistical significance of structural features. It starts with two examples of how to evaluate the statistical significance, namely that of reciprocity and co-occurrence of two nodes, in Sect. 7.2. It becomes evident, that a new type of random graph model is needed, which (at least approximately) maintains a given degree sequence (Sect. 7.3). Then I explain the general motivation behind comparing a real-world network with a random graph model, which is based on ideas in statistical physics (Sect. 7.4). Finally, some applications of the general approach are discussed in Sect. 7.5, especially the idea of identifying network motifs. The chapter concludes with a summary (Sect. 7.6), further reading (Sect. 7.7), and some exercises (Sect. 7.8).

7.2 Assessing the Significance of a Structural Feature The small-world phenomenon as introduced in Chap. 6 was based on the finding that most real-world networks show an average distance which is comparable to that of the corresponding random graph1 while their average clustering coefficient 1 As

defined earlier, a corresponding random graph is one with the same or expectedly the same number of nodes and edges as a given graph G.

© Springer-Verlag GmbH Austria 2016 K.A. Zweig, Network Analysis Literacy, Lecture Notes in Social Networks, DOI 10.1007/978-3-7091-0741-6_7

183

184

7 Random Graphs as Null Models

is much higher than that of the corresponding random graph (Sect. 6.4). It is clear that the idea can be easily generalized to any kind of measure: given the value of a structural measure in a real-world network it can be compared with the expected value of that measure in a graph in which the network’s structure is randomized. I.e., given a structural measure M : S → R which is defined for all graphs of a given set S, and a random graph model G, the expected value of E[M|G] over all graphs in G is compared with M(G), G ∈ S. Such an approach is part of an exploratory data analysis in which the research question is not necessarily stated beforehand, but in which any structure is tested whether it deviates from the structure of some pre-defined model. The basic motivation for this comparison is well captured in the following quote by Persi Diaconis, speaking about exploratory data analysis (EDA) in general: At one extreme, we can view