Sup-Sums Principles for F-Divergence and a New Definition for t-Entropy


V. I. Bakhtin¹,² · A. V. Lebedev²,³

Received: 29 February 2020 / Revised: 14 August 2020 © The Author(s) 2020

Abstract The article presents new sup-sums principles for integral F-divergence for arbitrary convex functions F on the whole real axis and arbitrary (not necessarily positive and normalized) measures. Among applications of these results, we work out a new ‘integral’ definition for t-entropy explicitly establishing its relation to Kullback–Leibler divergence.

Keywords F-divergence · Kullback–Leibler divergence · Sup-sums principle · Partition of unity · t-entropy

Mathematics Subject Classification (2020) 26D15 · 37A35 · 47B37 · 62H20 · 94A17

¹ John Paul II Catholic University of Lublin, Lublin, Poland
² Belarusian State University, Minsk, Belarus
³ University of Bialystok, Białystok, Poland

Journal of Theoretical Probability

1 Introduction

The notion of F-divergence was introduced and originally studied in the analysis of probability distributions in [2,14,19]. It is defined as follows. Let Q and P be two probability distributions over a space Ω such that Q is absolutely continuous with respect to P. Then, for a convex function $F\colon \mathbb{R}_+ \to \mathbb{R}$ such that F(1) = 0, the F-divergence $D_F(Q\|P)$ of Q from P is defined as

\[
D_F(Q\|P) := \int_{\Omega} F\!\left(\frac{dQ}{dP}\right) dP, \tag{1}
\]

where dQ/dP is the Radon–Nikodym derivative of Q with respect to P. Since its introduction, the F-divergence has been intensively studied, because suitable choices of the function F yield many important divergences, such as the Kullback–Leibler divergence, the Hellinger distance, the Pearson $\chi^2$-divergence, etc. A comprehensive analysis of F-divergence was carried out by Liese and Vajda in [16], where a sup-sums principle for space partitions was also established [16, Theorem 16].

In fact, formula (1) can be extended to arbitrary real-valued measures Q. Moreover, for such measures Q, the value $D_F(Q\|P)$ has a substantial statistical meaning. In [13,20,23], it was shown that the value $e^{-n D_F(Q\|P)}$ determines the asymptotics for conditional probabilities of large deviations for a certain family of weighted empirical measures that are close to Q, where F is the rate function for large deviations for the sequence of random weights. In [12], the F-divergences for real-valued measures Q were applied to parametric estimation and testing.

The object of the present article is a general F-divergence associated with an arbitrary convex function F, defined on the whole real axis and possibly taking infinite values, and with arbitrary real-valued (not necessarily positive, normalized, or absolutely continuous) measures. For this F-divergence, we derive a number of new sup-sums principles exploiting both measurable and continuous partitions of unity (Theorems 10, 12, and 14 of the article). In particular, they disclose the passage procedure from the F-divergence on a finite pha