Deep learning feature selection to unhide demographic recommender systems factors


ORIGINAL ARTICLE

J. Bobadilla¹, Á. González-Prieto¹, F. Ortega¹, R. Lara-Cabrera¹

Received: 5 June 2020 / Accepted: 27 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
Extracting demographic features from hidden factors is an innovative concept with multiple relevant applications. The matrix factorization model generates factors that do not incorporate semantic knowledge. Extracting the existing nonlinear relations between hidden factors and demographic information is a challenging task that cannot be adequately addressed by statistical methods or simple machine learning algorithms. This paper provides a deep learning-based method, DeepUnHide, that is able to extract demographic information from the user and item factors in collaborative filtering recommender systems. The core of the proposed method is the gradient-based localization used in the image processing literature to highlight the representative areas of each classification class. Validation experiments make use of two public datasets and current baselines. The results show the superiority of DeepUnHide in feature selection and demographic classification compared to state-of-the-art feature selection methods. Relevant and direct applications include explanation of recommendations, fairness in collaborative filtering and recommendation to groups of users.

Keywords: Feature selection · Collaborative filtering · Demographic information · Matrix factorization · Gradient-based localization · Deep learning

¹ Dpto. Sistemas Informáticos, ETSI Sistemas Informáticos, Universidad Politécnica de Madrid, Madrid, Spain

1 Introduction

Recommender systems (RS) [37, 50] play an important role in our society: they provide useful information to users by recommending highly demanded products and services. Notable examples of RS include Amazon, Netflix, TripAdvisor and Spotify. RS are implemented by means of several filtering strategies, mainly collaborative [37, 50], content-based [59], demographic [1], context-aware [53] and social [47] filtering. Most commercial RS are based on hybrid models that combine collaborative filtering (CF) with other filtering approaches. In the early days of RS research, CF was implemented using the K-nearest neighbours (KNN) algorithm [10], which is easy to understand, implement and analyse, since it can be considered a white-box method. This approach has been updated and improved in recent years with promising extensions such as hybrid methods [3] and information-theoretic quality measures [27]. Nevertheless, the main drawbacks of KNN are its lack of scalability and its poor accuracy. Due to these drawbacks, this memory-based algorithm has been replac