Unsupervised Qualitative Scoring for Binary Item Features
Koji Ichikawa¹ · Hiroshi Tamano¹
Received: 14 March 2020 / Revised: 14 May 2020 / Accepted: 25 May 2020
© The Author(s) 2020
Abstract
Binary features, such as categories, keywords, or tags, are widely used to describe product properties. However, these features are incomplete in that they omit several aspects of numerical information. One such aspect, the qualitative score of a tag, is widely used to describe which product is better in terms of a given property. For example, on a restaurant navigation site, properties such as mood, dishes, and location are given as numerical values that represent the goodness of each aspect. In this paper, we propose a novel approach to estimate the qualitative score from the binary features of products. Based on the natural assumption that an item with a better property is more popular among users who prefer that property (in short, “experts know best”), we introduce both discriminative and generative models with which user preferences and item qualitative scores are inferred from user-item interactions. We constrain the space of the item qualitative score by the item binary features so that the score of each item and tag can be nonzero only when the item has the corresponding tag. This approach helps resolve the following difficulties: (1) no supervised data for the score estimation, (2) implicit user purpose, and (3) irrelevant tag contamination. We evaluate our models by using two artificial datasets and two real-world datasets of movie and book ratings. With the two artificial datasets, we evaluate the performance of our models under sparse transaction and noisy tag settings. We also evaluate our models’ resolution for irrelevant tags using the real-world dataset of movie ratings and observe that our models outperform a baseline model. Finally, tag rankings obtained from the real-world datasets are compared with those of a baseline model.
Keywords Label enhancement · Unsupervised learning · Collaborative filtering · Topic model
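The constraint described in the abstract, that an item's score for a tag can be nonzero only when the item carries that tag, can be illustrated with a minimal sketch. This is not the authors' discriminative or generative model: the toy matrices, the function names `mask_scores` and `affinity`, and the dot-product affinity are all assumptions made purely for illustration.

```python
# Hypothetical toy data: 3 items x 2 tags (1 = the item carries the tag).
item_tags = [[1, 0],
             [1, 1],
             [0, 1]]

# Hypothetical unconstrained scores, e.g. produced by some learner.
raw_scores = [[0.9, 0.4],
              [0.2, 0.7],
              [0.5, 0.1]]


def mask_scores(raw, tags):
    """Zero out every (item, tag) score where the item lacks the tag."""
    return [[r * t for r, t in zip(raw_row, tag_row)]
            for raw_row, tag_row in zip(raw, tags)]


def affinity(scores, user_pref):
    """Per-item dot product of tag scores with a user's tag preferences
    (an assumed stand-in for a user-item interaction model)."""
    return [sum(s * p for s, p in zip(row, user_pref)) for row in scores]


scores = mask_scores(raw_scores, item_tags)

# A hypothetical user who cares only about tag 0.
user_pref = [1.0, 0.0]
item_affinity = affinity(scores, user_pref)

# Index of the item this user is most drawn to under the toy model.
best_item = max(range(len(item_affinity)), key=item_affinity.__getitem__)
```

In this toy setting, the user who prefers tag 0 is drawn to the item with the highest masked score for tag 0, which loosely mirrors the "experts know best" assumption; items lacking the tag contribute nothing for that tag regardless of their raw score.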
* Koji Ichikawa [email protected]
Hiroshi Tamano h‑[email protected]
¹ NEC Corporation, Tokyo, Japan
1 Introduction
Keywords, tags, and categories are widely used to describe product properties. Most e-commerce services, such as Amazon, Alibaba, or eBay, use categories and tags for item filtering. In social bookmarking and recommendation services such as Delicious, Last.fm, and MovieLens, tags are sometimes annotated by users. Even without explicit tags, we unconsciously infer item tags from side information, i.e., product names, explanation texts, reviews, and package designs. However, such binary expression is incomplete because it does not contain three aspects of numerical information: quantity, relevance, and quality. Quantity represents the strength of a tag with a certain unit. For example, tags, such as low-calorie, light, and hot, lose quantitative information such as 50 kcal, 100 g, and four degrees out of five. With relevance, on the other hand, one