Identifying Important Nodes in Bio-Molecular Networks



Abstract

Many issues in bio-molecular networks come down to identifying important nodes, or gene prioritization. Various measures have been proposed to characterize the importance of nodes in complex networks, such as degree, betweenness, k-shell, clustering coefficient, closeness, semi-local centrality, PageRank, and LeaderRank. Different measures capture different aspects of a network. In this chapter, based on network motifs and principal component analysis, we introduce a new measure to characterize node importance in directed biological networks. Investigations on five real-world biological networks indicate that the proposed method can robustly identify the truly important nodes in different networks. Further, using principal component analysis to integrate several existing centrality measures, we introduce a new integrative measure to find the structurally dominant proteins in protein interaction networks. Finally, we introduce the recently proposed SpectralRank and weighted SpectralRank, which can be applied to various kinds of networks.
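To make concrete the point that different measures can single out different nodes, the following is a minimal sketch rather than any method from this chapter. It computes degree and closeness on a small undirected toy network. The edge list, the node labels, and the helper names (build_adjacency, degree, closeness) are all hypothetical and chosen only for illustration; closeness is taken here as (n - 1) divided by the sum of shortest-path lengths, which assumes a connected network.

```python
from collections import deque

# Hypothetical toy network: two hubs ("h" and "x") joined by a bridge node "m".
EDGES = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "m"),
         ("m", "x"), ("x", "y"), ("x", "z"), ("x", "w")]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree(adjacency):
    """Degree of a node = number of neighbours."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def closeness(adjacency):
    """Closeness = (n - 1) / sum of shortest-path lengths (connected graph assumed)."""
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        scores[source] = (len(adjacency) - 1) / sum(dist.values())
    return scores


adjacency = build_adjacency(EDGES)
deg = degree(adjacency)
clo = closeness(adjacency)

# Nodes with maximal degree, and the single node with maximal closeness.
top_by_degree = sorted(n for n in deg if deg[n] == max(deg.values()))
top_by_closeness = max(clo, key=clo.get)
print("top by degree:", top_by_degree)
print("top by closeness:", top_by_closeness)
```

In this toy network the two hubs have the largest degree, while the bridge node, with only two neighbours, has the largest closeness because it sits between the two clusters. Disagreements of this kind are one motivation for combining several measures.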

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 J. Lü, P. Wang, Modeling and Analysis of Bio-molecular Networks, https://doi.org/10.1007/978-981-15-9144-0_7

7.1 Backgrounds

Complex network theory and its applications have been popular topics in recent years [1–8]. Many real-world systems, such as social and biological systems, can be described as complex networks and investigated through complex network theory. GRNs, signal transduction networks, neural networks, PPI networks, and metabolic networks are typical biological networks that have been extensively investigated [9–15]. A complex network consists of nodes and edges. An edge denotes the interaction between two nodes and can be directed or undirected. Many biological networks are directed. For example, in GRNs, nodes represent genes or TFs, and edges represent the interactions between TFs and the genes they regulate, or between TFs.

Over the last decades, the identification of important nodes in complex networks has been an intriguing topic [16–33]. For example, in social networks, if one knows which nodes are the most important, one can give priority to controlling these nodes to prevent the spread of infectious diseases [16]. However, determining which nodes are important in a complex network remains a challenge. Traditionally, degree is frequently used to characterize the importance of a node [1, 2, 7, 8, 16, 34, 35]. Other indices include betweenness [19], closeness [1], k-shell [7], principal component centrality [17] based on the adjacency matrix of the network, semi-local centrality [20], motif centrality [25, 27–31], PageRank (PR) [21], and others [36]. For undirected networks, some researchers believe that the most connected nodes are the most influ