Difficult first strategy GP: an inexpensive sampling technique to improve the performance of genetic programming

PDF / 1,392,387 Bytes
13 Pages / 595.276 x 790.866 pts Page_size
34 Downloads / 184 Views

RESEARCH PAPER

Difficult first strategy GP: an inexpensive sampling technique to improve the performance of genetic programming Muhammad Quamber Ali1 · Hammad Majeed1 Received: 21 January 2019 / Revised: 16 December 2019 / Accepted: 25 January 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Genetic programming (GP) is a top performer in solving classification and clustering problems, in general and symbolic regression problems, in particular. GP has produced impressive results and has outperformed human generated results for 76 different problems taken from 22 different fields. There remain a number of significant open issues despite its impressive results. Among them are high computational cost, premature convergence and high error rate. These issues must be addressed for GP to realise its full potential. In this paper a simple and cost effective technique called Difficult First Strategy-GP (DFSGP) is proposed to address the aforementioned problems. The proposed technique involves pre-processing and sampling steps. In the pre-processing step, difficult to evolve data points by GP from the given data set are marked and in the sampling step, they are introduced in the evolutionary run by using two newly defined sampling techniques, called difficult points first and difficulty proportionate selection. These techniques are biased towards selecting difficult data points during the initial stage of a run and of easy points in the latter stage of a run. This ensures that GP does not ignore difficult-to-evolve data points during a run. Experiments have shown that GP coupled with DFS avoids premature convergence and attained higher fitness than standard GP using same fitness evaluations. Performance of the proposed technique was evaluated on three commonly known metrics, which are convergence speed, fitness and variance in the best results. Our results have shown that the proposed setups had achieved 10–15% better fitness values than Standard GP. Furthermore, the proposed setups had consistently generated better quality solutions on all the problems and utilized 30–50% less computations to match the best performance of Standard GP. Keywords Difficult first strategy · Genetic programming · Pre-processing · Sampling techniques · Dataset sampling · Machine learning

1 Introduction Genetic programming (GP) is an evolutionary algorithm that is used to evolve expressions by exploring and picking the best relation between the functions and the terminals. The exploration is a supervised learning process and is guided by the difference between the expected output and the output of the evolved expressions (error). Generally, this error is computed on complete input data set. Some of the data points are easy to evolve by GP, such as points on a straight line as * Hammad Majeed [email protected] Muhammad Quamber Ali [email protected] 1

Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan

compared to data points on a curved line. In this wor

Data Loading...

Difficult first strategy GP: an inexpensive sampling technique to improve the performance of genetic programming

Recommend Documents

Hierarchical data replication strategy to improve performance in cloud computing

Recapture Technique of Sampling

Predicting discharge coefficient of compound broad-crested weir by using genetic programming (GP) and artificial neural

An advanced modelling to improve the prediction of thermal distribution in friction stir welding (FSW) for difficult to

Genetic Programming

Exploratory Landscape Analysis is Strongly Sensitive to the Sampling Strategy

Groundwater sampling in karst terranes: passive sampling in comparison to event-driven sampling strategy

A novel technique to self-adapt parameters in parallel/distributed genetic programming

Statistical Genetic Programming: The Role of Diversity

International Strategy and Environment: An Assessment of the Performance Relationship

Handbook of Genetic Programming Applications

GP