Convex Factorization Machines



Abstract. Factorization machines are a generic framework that makes it possible to mimic many factorization models simply by feature engineering. In this way, they combine the high predictive accuracy of factorization models with the flexibility of feature engineering. Unfortunately, factorization machines involve a non-convex optimization problem and are thus subject to bad local minima. In this paper, we propose a convex formulation of factorization machines based on the nuclear norm. Our formulation imposes fewer restrictions on the learned model and is thus more general than the original formulation. To solve the corresponding optimization problem, we present an efficient globally-convergent two-block coordinate descent algorithm. Empirically, we demonstrate that our approach achieves comparable or better predictive accuracy than the original factorization machines on four recommendation tasks and scales to datasets with 10 million samples.

Keywords: Factorization machines · Feature interactions · Recommender systems · Nuclear norm

1 Introduction

Factorization machines [12][13] are a generic framework that makes it possible to mimic many factorization models simply by feature engineering. Similarly to linear models, factorization machines learn a feature weight vector w ∈ R^d, where d is the number of features. However, factorization machines also learn a pairwise feature interaction weight matrix Z ∈ R^{d×d}. Given a feature vector x ∈ R^d, factorization machines use w and Z to predict a target y ∈ R. The main advantage of factorization machines is that they learn the feature interaction weight matrix in factorized form, Z = V V^T, where V ∈ R^{d×k} and k ≪ d is a rank hyper-parameter. This reduces overfitting, since the number of parameters to estimate is reduced from d² to kd, and allows predictions to be computed efficiently.

Although they can be used for any supervised learning task, such as classification and regression, factorization machines are especially useful for recommender systems. As shown in [12][13], factorization machines can mimic many existing factorization models just by choosing an appropriate feature representation for x. Examples include standard matrix factorization, SVD++ [8], timeSVD++ [9] and PITF (pairwise interaction tensor factorization) [16]. Moreover, it is easy to incorporate auxiliary features such as user and item attributes, contextual information [15] and cross-domain feedback [10]. In [14], it was shown that factorization machines achieve predictive accuracy as good as the best specialized models on the Netflix and KDD Cup 2012 challenges. In short, factorization machines are a generic framework which combines the high predictive accuracy of factorization models with the flexibility of feature engineering.

Unfortunately, factorization machines have two main drawbacks. First, they involve a non-convex optimization problem

M. Blondel et al. © Springer International Publishing Switzerland 2015. A. Appice et al. (Eds.): ECML PKDD 2015, Part II, LNAI 9285, pp. 19–35, 2015. DOI: 10.1007/978-3-319-23525-7_2
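The efficient prediction mentioned above comes from the standard factorization-machine identity: the pairwise term Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j equals ½(‖V^T x‖² − Σ_i ‖v_i‖² x_i²), which costs O(kd) instead of O(d²). A minimal NumPy sketch of this computation (the function name `fm_predict` is ours, for illustration, not from the paper):

```python
import numpy as np

def fm_predict(x, w, V, b=0.0):
    """Factorization machine prediction:

        y(x) = b + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j,

    where V_i is the i-th row of V. The pairwise term is evaluated
    in O(kd) time via
        sum_{i<j} <V_i, V_j> x_i x_j
            = 0.5 * (||V^T x||^2 - sum_i ||V_i||^2 x_i^2).
    """
    linear = w @ x
    Vx = V.T @ x                                   # shape (k,)
    pairwise = 0.5 * (Vx @ Vx - np.sum((V ** 2).T @ (x ** 2)))
    return b + linear + pairwise
```

For sparse x, the same identity lets the cost drop further to O(k · nnz(x)), which is what makes factorization machines practical on high-dimensional one-hot encodings.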
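To make the mimicry of matrix factorization concrete: if x is the concatenation of a one-hot user indicator and a one-hot item indicator, only two features are nonzero, so the factorization-machine score collapses to w_u + w_i + ⟨v_u, v_i⟩, i.e., biased matrix factorization. A small illustrative sketch under that encoding (the helper name `mf_as_fm` is ours, not the paper's):

```python
import numpy as np

def mf_as_fm(user, item, w, V, n_users):
    """FM score on the one-hot (user, item) encoding.

    With x = [one-hot(user); one-hot(item)] of length n_users + n_items,
    the only nonzero pairwise product is x_user * x_{n_users + item},
    so b + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j reduces to
    w[user] + w[n_users + item] + <V[user], V[n_users + item]>:
    matrix factorization with user/item biases.
    """
    return w[user] + w[n_users + item] + V[user] @ V[n_users + item]
```

Richer encodings work the same way: appending attribute or context features to x simply adds more interaction terms, which is how SVD++-style models are recovered.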