Comparison with Recommendation Algorithm Based on Random Forest Model


Key Laboratory of Information System Security of Ministry of Education, TNLIST, School of Software, Tsinghua University, Beijing 100084, China 2 College of Computer Science and Technology, Jilin University, Changchun 130012, China [email protected]

Abstract. Product recommendation based on user behavior is a hot research topic in the Internet era. On the same data set, features for which the results of the various classifications differ greatly were handled with a random forest model. This paper compares the mainstream classification algorithms C4.5 and CART and analyzes 578,906,480 user behavior records against the results of actual transactions on Alibaba. The results show that the CART decision tree algorithm is more suitable for large-scale e-commerce data mining.

Keywords: User behavior · CART · Random forest model · Decision tree · C4.5

1 Introduction

Mining users' implicit demands from the mass of information on user behavior is essential for service providers. Recommender systems [1] have already seen preliminary application in business, but how to construct a highly efficient and intelligent recommendation algorithm is still a hot topic. The random forests model, a classification and prediction model proposed by Leo Breiman [2], has many advantages, such as fast learning, few parameters, and fault tolerance, and it has been applied in many fields since it was proposed. Guo Yingjie et al. used random forest classification to identify plant resistance genes [3]; Li Jiangeng et al. analyzed gene pathways in cancer microarray data based on random forests [4]; and Fang Kuangnan predicted the direction of fund yields using a random forests model [5]. In this paper, the data set is a massive collection of user behavior records from real transactions on the Alibaba website. We define a user behavior attribute set and compare the classification algorithms C4.5 and CART under the random forest model to provide evidence for better user recommendation.

© Springer Nature Singapore Pte Ltd. 2017 J.J. (Jong Hyuk) Park et al. (eds.), Advances in Computer Science and Ubiquitous Computing, Lecture Notes in Electrical Engineering 421, DOI 10.1007/978-981-10-3023-9_72


Y. Jiang et al.

2 Basic Theory

2.1 Random Forests Model

A random forest is a classifier made up of multiple independent decision trees [6, 7]. The generation of a single decision tree is generally controlled by attribute splitting and pruning, but when there is a large number of features, over-fitting may occur. Random forests use the bootstrap resampling method [8, 9] to draw multiple samples from the original data set, construct a decision tree for each sample, and combine the outputs of the multiple decision trees to produce the final prediction (Fig. 1).

Fig. 1. Random forests model
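
The resample, train, and combine procedure can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. It assumes bootstrap resampling (sampling with replacement), replaces each full decision tree with a one-split "stump" on a randomly chosen feature to keep the example short, and combines the trees by majority vote. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement from the original data set."""
    return [rng.choice(rows) for _ in rows]

def train_stump(rows, feature):
    """Fit a one-split tree (a stand-in for a full decision tree) on one feature."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        left = [y for x, y in rows if x[feature] <= threshold]
        right = [y for x, y in rows if x[feature] > threshold]
        if not right:
            continue  # this threshold does not split the sample
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, l_lab, r_lab)
    if best is None:  # every value is identical: predict a constant
        label = majority([y for _, y in rows])
        return (feature, float("inf"), label, label)
    _, threshold, l_lab, r_lab = best
    return (feature, threshold, l_lab, r_lab)

def predict_stump(stump, x):
    feature, threshold, l_lab, r_lab = stump
    return l_lab if x[feature] <= threshold else r_lab

def train_forest(rows, n_trees, seed=0):
    """Train one stump per bootstrap sample, each on a randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature = rng.randrange(n_features)
        forest.append(train_stump(sample, feature))
    return forest

def predict_forest(forest, x):
    """Combine the individual predictions by majority vote."""
    return majority([predict_stump(stump, x) for stump in forest])

# Hypothetical toy data: (feature vector, class label).
data = [((1, 1), 0), ((2, 1), 0), ((1, 2), 0), ((2, 2), 0),
        ((8, 9), 1), ((9, 8), 1), ((9, 9), 1), ((8, 8), 1)]
forest = train_forest(data, n_trees=25)
```

Calling `predict_forest(forest, (1, 1))` and `predict_forest(forest, (9, 9))` then classifies one point from each cluster. An odd number of trees avoids tied votes in the two-class case.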

2.2 C4.5 Algorithm

The C4.5 algorithm [10] starts from the root node, to which the best attribute is assigned. Each value of that attribute generates a corresponding branch, and a new node is generated on each branch. The criterion for selecting the best attribute is the information gain ratio, defined in terms of information entropy, which is used to select the test attributes.
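
As an illustration of the gain-ratio criterion, the following Python sketch computes the standard quantities for categorical attributes: the entropy of the class labels, the information gain of a split, and the split information used to normalize it. It is a sketch under these assumptions and not the paper's code. The attribute names `carted` and `device`, the target `bought`, and the toy records are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by split information."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

def best_attribute(rows, attributes, target):
    """Pick the attribute with the highest gain ratio as the test attribute."""
    return max(attributes, key=lambda a: gain_ratio(rows, a, target))

# Hypothetical user behavior records.
records = [
    {"carted": "yes", "device": "pc", "bought": "yes"},
    {"carted": "yes", "device": "mobile", "bought": "yes"},
    {"carted": "no", "device": "pc", "bought": "no"},
    {"carted": "no", "device": "mobile", "bought": "no"},
]
chosen = best_attribute(records, ["carted", "device"], "bought")
```

In this toy data `carted` separates the two classes perfectly while `device` carries no information about the label, so `carted` is selected as the test attribute.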
