Rankboost+: an improvement to Rankboost
Harold Connamacher¹ · Nikil Pancha¹ · Rui Liu¹ · Soumya Ray¹

Received: 28 January 2018 / Revised: 20 June 2019 / Accepted: 5 July 2019
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019
Abstract
Rankboost is a well-known algorithm that iteratively creates and aggregates a collection of "weak rankers" to build an effective ranking procedure. Initial work on Rankboost proposed two variants. One variant, which we call Rb-d, is designed for the scenario in which all weak rankers have the binary range {0, 1}; it has good theoretical properties but does not perform well in practice. The other, which we call Rb-c, has good empirical behavior and is the recommended variant for this binary weak ranker scenario, but it lacks a theoretical grounding. In this paper, we rectify this situation by proposing an improved Rankboost algorithm for the binary weak ranker scenario that we call Rankboost+. We prove that this approach is theoretically sound and also show empirically that it outperforms both Rankboost variants in practice. Further, the theory behind Rankboost+ helps us to explain why Rb-d may not perform well in practice and why Rb-c is better behaved in the binary weak ranker scenario, as has been observed in prior work.

Keywords Ranking · Boosting · Ensemble methods · Rankboost
Editors: Jesse Davis, Elisa Fromont, Derek Greene, and Björn Bringmann.

Harold Connamacher [email protected] · Nikil Pancha [email protected] · Rui Liu [email protected] · Soumya Ray [email protected]

1 Department of Electrical Engineering and Computer Science, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 44106, USA

Machine Learning

1 Introduction

In a ranking task, a learner is given a set of preferences over instances, often organized pairwise. For example, in a movie recommendation task, the learner might be told that Alice likes "2001: A Space Odyssey" better than "Interstellar," along with similar facts. The learner must then produce a function that correctly ranks novel instances. In this example, this function
would be expected to rank new movies in Alice's order of preference. Ranking has many applications in areas such as drug design (Agarwal et al. 2010), information retrieval (Nuray and Can 2006) and, of course, recommendation systems (Freund et al. 2003).

An elegant way to solve the ranking task is through a "boosting" algorithm. In classification, algorithms such as Adaboost (Freund and Schapire 1995) iteratively learn and aggregate a collection of "base learners" into a highly accurate solution. Base learners are only required to satisfy a "weak learning" criterion, meaning they need only be slightly better than chance on the learning task. This is attractive for at least two reasons. First, for many domains, finding a classifier that satisfies weak learning can be easier than finding one that is very accurate. Second, boosting algorithms for classification are known to possess des
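To make the boosting loop described above concrete for the ranking setting, the following is a minimal sketch in the style of the original Rankboost of Freund et al. (2003) with binary weak rankers: a distribution over preference pairs is maintained, each round selects the weak ranker with the largest weighted pair agreement, and misordered pairs gain weight for the next round. The one-dimensional threshold rankers, the toy data, and the function names (`weak_rankers`, `rankboost`, `score`) are illustrative assumptions, not the authors' implementation.

```python
import math

def weak_rankers(xs):
    """Enumerate binary threshold rankers h(x) = 1 if x > theta else 0."""
    return [lambda x, t=t: 1 if x > t else 0 for t in sorted(set(xs))]

def rankboost(pairs, xs, rounds=10):
    """Pairs (a, b) mean instance b should be ranked above instance a.
    Returns a list of (alpha, h) whose weighted sum scores instances."""
    D = {p: 1.0 / len(pairs) for p in pairs}  # distribution over pairs
    ensemble = []
    for _ in range(rounds):
        # pick the weak ranker with the largest weighted agreement r
        best_h, best_r = None, 0.0
        for h in weak_rankers(xs):
            r = sum(D[(a, b)] * (h(b) - h(a)) for (a, b) in pairs)
            if abs(r) > abs(best_r):
                best_h, best_r = h, r
        # stop if no ranker helps, or a perfect one makes alpha unbounded
        if best_h is None or abs(best_r) >= 1.0:
            break
        alpha = 0.5 * math.log((1 + best_r) / (1 - best_r))
        ensemble.append((alpha, best_h))
        # reweight: misordered pairs gain weight, correct ones lose it
        for (a, b) in pairs:
            D[(a, b)] *= math.exp(alpha * (best_h(a) - best_h(b)))
        z = sum(D.values())
        for p in pairs:
            D[p] /= z
    return ensemble

def score(ensemble, x):
    """Final ranking function: instances with higher scores rank higher."""
    return sum(alpha * h(x) for alpha, h in ensemble)
```

On a toy set such as `xs = [1.0, 2.0, 3.0]` with `pairs = [(1.0, 2.0), (2.0, 3.0)]`, the learned `score` orders the instances consistently with the given preferences.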