Multiagent Q-Learning for Aloha-Like Spectrum Access in Cognitive Radio Systems


Research Article

Multiagent Q-Learning for Aloha-Like Spectrum Access in Cognitive Radio Systems

Husheng Li

Department of Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, TN 37996, USA

Correspondence should be addressed to Husheng Li, [email protected]

Received 31 December 2009; Accepted 18 April 2010

Academic Editor: Vincent Lau

Copyright © 2010 Husheng Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

An Aloha-like spectrum access scheme without negotiation is considered for multiuser and multichannel cognitive radio systems. To avoid collisions incurred by the lack of coordination, each secondary user learns how to select channels according to its experience. Multiagent reinforcement learning (MARL) is applied for the secondary users to learn good strategies of channel selection. Specifically, the framework of Q-learning is extended from the single-user case to the multiagent case by considering other secondary users as a part of the environment. The dynamics of the Q-learning are illustrated using a Metrick-Polak plot, which shows the traces of Q-values in the two-user case. For both complete and partial observation cases, rigorous proofs of the convergence of multiagent Q-learning without communications, under certain conditions, are provided using the Robbins-Monro algorithm and contraction mapping, respectively. The learning performance (speed and gain in utility) is evaluated by numerical simulations.
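The learning scheme summarized in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function names `boltzmann_choice` and `simulate`, the 0/1 reward for a collision-free slot, the Boltzmann (softmax) channel selection, the constant step size, and the default parameter values. Each secondary user keeps one Q-value per channel and never models the other users explicitly; their presence is felt only through collisions, that is, through the reward.

```python
import math
import random


def boltzmann_choice(q_values, temperature, rng):
    """Sample a channel index with probability proportional to exp(Q / temperature)."""
    peak = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - peak) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def simulate(num_users=2, num_channels=2, steps=2000,
             step_size=0.1, temperature=0.1, seed=0):
    """Run independent stateless Q-learning; return (Q-table, successes per slot)."""
    rng = random.Random(seed)
    # q[i][j]: user i's running estimate of the reward for transmitting on channel j.
    q = [[rng.random() for _ in range(num_channels)] for _ in range(num_users)]
    successes = []
    for _ in range(steps):
        # Every user picks a channel on its own, without negotiation.
        picks = [boltzmann_choice(q[i], temperature, rng) for i in range(num_users)]
        slot_successes = 0
        for i, channel in enumerate(picks):
            # Assumed Aloha-like rule: success only if no other user chose this channel.
            reward = 1.0 if picks.count(channel) == 1 else 0.0
            # Only the Q-value of the channel actually used is updated.
            q[i][channel] += step_size * (reward - q[i][channel])
            slot_successes += int(reward)
        successes.append(slot_successes)
    return q, successes


if __name__ == "__main__":
    q_table, successes = simulate()
    print("final Q-values:", q_table)
    print("successful transmissions in the last 100 slots:", sum(successes[-100:]))
```

Running the script prints the final Q-values of both users and the number of collision-free transmissions in the last 100 slots. Whether and how fast the users settle on different channels depends on the assumed step size and temperature, so the sketch says nothing about the convergence conditions proved in the article.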

1. Introduction

In recent years, cognitive radio has attracted intensive study in the wireless communications community. It allows unlicensed users (called secondary users) to access licensed frequency bands when licensed users (called primary users) are not present. Therefore, the cognitive radio technique can substantially alleviate the problem of underutilization of the frequency spectrum [1, 2]. The following two problems are key to cognitive radio systems.

(i) Resource mining, that is, how to detect the available resource (the frequency bands that are not being used by primary users); this is usually done by spectrum sensing.

(ii) Resource allocation, that is, how to allocate the available resource to different secondary users.

Substantial work has been done on the resource mining problem. Many signal processing techniques have been applied to sense the frequency spectrum [3], for example, cyclostationary feature detection [4], quickest change detection [5], and collaborative spectrum sensing [6]. Meanwhile, a significant amount of research has been conducted on resource allocation in cognitive radio systems [7, 8]. Typically, it is assumed that secondary users exchange information about available spectrum resources and then negotiate the resource allocation according to their own traffic requirements (since the same resource cannot be shared by different secondary users if orthogonal transmission is assume