Sample size calculations for the experimental comparison of multiple algorithms on multiple problem instances


Felipe Campelo1,2 · Elizabeth F. Wanner1,3

Received: 5 August 2019 / Revised: 13 May 2020 / Accepted: 23 July 2020 © The Author(s) 2020

Abstract  This work presents a statistically principled method for estimating the required number of instances in the experimental comparison of multiple algorithms on a given problem class of interest. This approach generalises earlier results by allowing researchers to design experiments based on the desired best, worst, mean or median-case statistical power to detect differences between algorithms larger than a certain threshold. Holm's step-down procedure is used to maintain the overall significance level controlled at desired levels, without resulting in overly conservative experiments. This paper also presents an approach for sampling each algorithm on each instance, based on optimal sample size ratios that minimise the total required number of runs subject to a desired accuracy in the estimation of paired differences. A case study investigating the effect of 21 variants of a custom-tailored Simulated Annealing for a class of scheduling problems is used to illustrate the application of the proposed methods for sample size calculations in the experimental comparison of algorithms.

Keywords  Experimental comparison of algorithms · Statistical methods · Sample size estimation · Iterative sampling · Multiple hypotheses testing
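As a rough illustration of the two ingredients named in the abstract, the sketch below shows (i) a standard power-based calculation of the number of instances for a paired comparison and (ii) Holm's step-down correction applied to a set of p-values. This is a minimal approximation using off-the-shelf statsmodels routines, not the authors' method; the effect size, significance level, power and p-values are assumed purely for the example.

```python
# Minimal sketch (assumed example values, not taken from the paper):
# (i) instance count from a fixed-power paired-test calculation,
# (ii) Holm's step-down procedure for family-wise error control.
import numpy as np
from statsmodels.stats.power import TTestPower
from statsmodels.stats.multitest import multipletests

# (i) Number of instances so that a paired (one-sample on differences) t-test
# has power 0.8 to detect standardised mean differences of at least d* = 0.5.
n_instances = TTestPower().solve_power(effect_size=0.5,   # minimally relevant effect (assumed)
                                       alpha=0.05,        # significance level (assumed)
                                       power=0.8,         # desired power (assumed)
                                       alternative='two-sided')
print(f"instances required: {int(np.ceil(n_instances))}")

# (ii) Holm's step-down procedure across several algorithm comparisons,
# keeping the family-wise error rate at alpha = 0.05.
pvals = [0.001, 0.020, 0.045, 0.300]                      # hypothetical raw p-values
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method='holm')
print(reject, p_adj.round(3))
```

The paper goes beyond this fixed-nominal-values calculation by allowing the power to be specified for the best, worst, mean or median case and by sampling runs iteratively on each instance, rather than assuming a single known effect size in advance.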

F. Campelo worked under grants from Brazilian agencies FAPEMIG (APQ-01099-16) and CNPq (404988/2016-4). E. F. Wanner has been funded by The Leverhulme Trust through Research Fellowship RF-2018-527/9.

Felipe Campelo [email protected]
Elizabeth F. Wanner [email protected]

1 College of Engineering and Physical Sciences, Aston University, Birmingham B4 7ET, UK
2 Department of Electrical Engineering, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
3 Department of Computer Engineering, CEFET-MG, Belo Horizonte, Brazil


1 Introduction

Experimental comparison of algorithms has long been recognised as an essential aspect of research on meta-heuristics (Barr et al. 1995; Hooker 1996; McGeoch 1996; Coffin and Saltzman 2000). Despite well-earned criticisms of some disconnect between theory and experimentation (Hooker 1994; Chimani and Klein 2010) and a few methodological issues that still require adequate attention (Eiben and Jelasity 2002; Bartz-Beielstein 2015), experimental evaluation remains a central aspect and a major component of the development and understanding of meta-heuristics in general.

Experimental comparison of algorithms involves the use of performance indicators, and the choice of indicator is closely related to the problem domain as well as to the question of interest being investigated by the experimenter. For optimisation algorithms, performance can be measured in terms of, e.g., final objective function value, best-of-iteration fitness, first hitting time, time-to-converge