Benchmarking
Benchmarking: A Methodology for Ensuring the Relative Quality of Recommendation Systems in Software Engineering
Alan Said, Domonkos Tikk, and Paolo Cremonesi
Abstract
This chapter describes the concepts involved in benchmarking recommendation systems. Benchmarking is used to ensure the quality of a research or production system in comparison to other systems, whether algorithmically, infrastructurally, or according to any other sought-after quality. Specifically, the chapter presents the evaluation of recommendation systems in the context of a multi-dimensional benchmarking and evaluation model that combines any number of qualities into a final comparable metric. The focus is on quality measures related to recommendation accuracy, technical constraints, and business values. The chapter first introduces concepts related to evaluation and benchmarking of recommendation systems, continues with an overview of the current state of the art, and then presents the multi-dimensional approach in detail. It concludes with a brief discussion of the introduced concepts and a summary.
A. Said: Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
D. Tikk: Gravity R&D, Budapest, Hungary; Óbuda University, Budapest, Hungary
P. Cremonesi: Politecnico di Milano, Milano, Italy
M.P. Robillard et al. (eds.), Recommendation Systems in Software Engineering, DOI 10.1007/978-3-642-45135-5_11, © Springer-Verlag Berlin Heidelberg 2014
11.1 Introduction
Benchmarking is a structural approach to quality engineering and management [7]; essentially, it is a comparison process aimed at finding the best practice for a given well-specified problem. The concept of benchmarking originated in the optimization of business processes: investigating and analyzing industry standards, comparing them to those applied in the investigator’s own organization, and creating an implementation plan with predefined goals and objectives to improve the quality and performance of the evaluated process. In the last few decades, benchmarking has also become very popular in scientific research and software engineering, driven by the need to identify best-in-class approaches or algorithms for scientific problems, and to facilitate various stages of the software development lifecycle, including automated code testing [18].
The process of traditional benchmarking consists of the following steps: (1) design and target specification, (2) data collection, (3) evaluation and analysis, and (4) implementation of improvements. Scientific benchmarking, on the other hand, mainly focuses on providing a means for comparison and exploration of novel ideas on a dataset collected for the given purpose, and puts less emphasis on the implementation of the improvements in an industrial environment. Due to their or