Lattice-based methods for regression and density estimation on complicated multidimensional regions



Ronald P. Barry¹ · Julie McIntyre¹

Received: 14 October 2019 / Revised: 17 May 2020 / Accepted: 15 July 2020 / Published online: 11 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract This paper illustrates the use of diffusion kernels to estimate smooth density and regression functions defined on highly complex domains. We generalize the two-dimensional lattice-based estimators of Barry and McIntyre (2011) and McIntyre and Barry (2018) to estimate any function defined on a domain that may be embedded in R^d, d ≥ 1. Examples include function estimation on the surface of a sphere, a sphere with boundaries and holes, a sphere over multiple time periods, a linear network, the surface of a cylinder, a three-dimensional volume with boundaries, and a union of one- and two-dimensional subregions.

Keywords Diffusion · Kernel estimation · Nonparametric smoothing

1 Introduction

The use of diffusion kernels for nonparametric estimation of spatial functions is an active area of research (Botev et al. 2010; Barry and McIntyre 2011; McSwiggan et al. 2017; McIntyre and Barry 2018). The approach has been studied primarily in the context of estimating two-dimensional spatial functions defined on domains with irregular boundaries and holes. Standard spatial estimators that base similarity on Euclidean distance ignore these domain characteristics. These estimators tend to

Handling Editor: Pierre Dutilleul.

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s10651-020-00459-z) contains supplementary material, which is available to authorized users.


Ronald P. Barry [email protected] Julie McIntyre [email protected]

¹ Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK 99775, USA


Environmental and Ecological Statistics (2020) 27:571–589

smooth inappropriately over boundaries representing important geographical features like lakes or peninsulas, which may imply abrupt changes or discontinuities in the underlying function of interest. Biases in such estimators are well documented (e.g., Ramsay 2002; Wood et al. 2008). Recent papers have presented two-dimensional density (Barry and McIntyre 2011) and regression (McIntyre and Barry 2018) estimators using kernels based on random walks on a lattice of points contained in the region of interest. We refer to these estimators as lattice-based estimators, and code for their implementation in the R programming language (R Core Team 2018) is available in the package latticeDensity (Barry 2012). In this paper we illustrate, through a number of real and simulated examples, that these estimators can be applied to much more general situations. Indeed, the technique generalizes straightforwardly to estimation of spatial functions defined on networks, manifolds and solids, including those with irregular boundaries and holes. Moreover, the domains of interest need not be spa