Two directional Laplacian pyramids with application to data imputation



Neta Rabin1 · Dalia Fishelov1

Received: 26 September 2018 / Accepted: 3 April 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Modeling and analyzing high-dimensional data has become a common task in various fields and applications. Often, it is of interest to learn a function that is defined on the data and then to extend its values to newly arrived data points. The Laplacian pyramids approach invokes kernels of decreasing widths to learn a given dataset, and a function defined over it, in a multi-scale manner. Extension of the function to new values may then be easily performed. In this work, we extend the Laplacian pyramids technique to model the data by considering two-directional connections. In practice, kernels of decreasing widths are constructed on the row space and on the column space of the given dataset, and in each step of the algorithm the data is approximated by considering the connections in both directions. Moreover, the method does not require solving a minimization problem, as other common imputation techniques do, and thus avoids the risk of a non-converging process. The method presented in this paper is general and may be adapted to imputation tasks. The numerical results demonstrate the ability of the algorithm to deal with a large number of missing data values. In addition, in most cases, the proposed method generates lower errors than existing imputation methods applied to benchmark datasets.

Keywords Laplacian pyramids · RNA sequencing data · Two-sided LP scheme · Imputation

Mathematics Subject Classification (2010) 68T30

Communicated by: Pavel Solin

Neta Rabin
[email protected]

Dalia Fishelov
[email protected]

1 Afeka - Tel Aviv Academic College of Engineering, 38 Bnei Efraim St., Tel Aviv, Israel


1 Introduction

Modeling and analyzing high-dimensional data has become a common task in various fields and applications. Kernel-based machine learning methods are capable of generating compact models that capture underlying important features of complex datasets. Typically, given a dataset X of size M × N, a kernel is constructed based on the rows of X. This kernel captures the pairwise distances between the rows of X. In classification algorithms, such as SVM (Support Vector Machines), kernels are used for finding non-linear separations between data classes. In addition, non-linear dimensionality reduction algorithms, such as diffusion maps [6], utilize the spectral decomposition of normalized kernels for embedding high-dimensional data. Recent work [7] proposed dual-geometry approaches that embed the dataset X in a low-dimensional space using non-linear dimensionality reduction techniques applied to the rows and to the columns of X. Another method that utilizes kernels, which is extended in this paper, is Laplacian pyramids [11, 28]. Laplacian pyramids is a multi-scale algorithm for learning functions over s
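To illustrate the kernel construction described above, the following is a minimal sketch of a pairwise Gaussian kernel built on the rows of a dataset X of size M × N. The Gaussian form and the bandwidth parameter sigma are illustrative assumptions here, not the specific scales used in the paper's algorithm:

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Pairwise Gaussian kernel on the rows of an M x N dataset X.

    Entry (i, j) depends on the Euclidean distance between rows i and j,
    so the result is a symmetric M x M matrix.
    """
    sq_norms = (X ** 2).sum(axis=1)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off values
    return np.exp(-d2 / (sigma ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # toy dataset: M = 5 rows, N = 3 columns
K = gaussian_kernel(X, sigma=1.0)
print(K.shape)                    # (5, 5): one entry per pair of rows
print(np.allclose(K, K.T))        # True: the kernel is symmetric
```

A kernel on the column space, as used by the two-directional scheme, would be built the same way on X.T, yielding an N × N matrix; decreasing sigma produces the progressively narrower kernels of the multi-scale construction.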