Genetic Programming and Evolvable Machines manuscript No. (will be inserted by the editor)

Evolutionary Hyper-Heuristics for Tackling Bi-Objective 2D Bin Packing Problems

Juan Carlos Gomez · Hugo Terashima-Marín

Received: date / Accepted: date

Abstract In this article, a multi-objective evolutionary framework to build selection hyper-heuristics for solving instances of the 2D bin packing problem is presented. The approach consists of a multi-objective evolutionary learning process, using specifically tailored genetic operators, to produce sets of variable-length rules representing hyper-heuristics. Each hyper-heuristic builds a solution to a given problem instance by sensing the state of the instance and deciding which single heuristic to apply at each decision point. The hyper-heuristics consider the minimization of two conflicting objectives when building a solution: the number of bins used to accommodate the pieces and the total time required to do the job. The proposed framework integrates three well-studied multi-objective evolutionary algorithms to produce sets of Pareto-approximated hyper-heuristics: the Non-dominated Sorting Genetic Algorithm-II, the Strength Pareto Evolutionary Algorithm 2, and the Generalized Differential Evolution Algorithm 3. We conduct an extensive experimental analysis using a large set of 2D bin packing problem instances containing convex and non-convex irregular pieces, under many conditions and settings, and using several performance metrics. The analysis assesses the robustness and flexibility of the proposed approach, providing encouraging results when compared against a set of well-known baseline single heuristics.

Keywords Bin packing problem · evolutionary computation · hyper-heuristics · heuristics · multi-objective optimization · genetic algorithms

Juan Carlos Gomez
Department of Electronics, University of Guanajuato Campus Irapuato-Salamanca, Salamanca, Mexico
Tel.: +52-442-380-10-71
E-mail: [email protected]

Hugo Terashima-Marín
School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, Nuevo Leon, Mexico
E-mail: [email protected]
1 Introduction

The bin packing problem (BPP) is a well-known combinatorial problem in operational research. The goal consists of fitting a finite number of pieces (having several shapes or attributes) into a minimum number of bins or objects (having a specific size, shape or attributes), where the task is subject to a set of practical and physical restrictions. There are many variants of this problem: one-, two- or three-dimensional packing, regular or irregular packing, on-line packing, packing by cost, by weight or by size. Wäscher et al. [45] presented a complete typology of the problem in 2007 as an extension of the one introduced by Dyckhoff [14] in 1990. This problem is of great industrial interest [25] given its many applications, for example, truck/ship/aircraft loading [49], computer memory assignment [19], the garment and shoe industry [13, 11], and ship building [33], where improvements in the arrangement can result in large savings [20]. In this work, we deal with instances of the 2D bin packing problem with convex and non-convex irregular pieces and single-size bins. The general solution of such a problem is known to be NP-hard [16], and because of that, several heuristics and approximation approaches have been proposed to solve it [26, 27]. Nevertheless, no reliable general heuristic that can optimally solve a large number of instances of this problem has been found. Most single heuristics work well only on a limited number of instances with particular structures and properties. Recently, hyper-heuristics (HHs) have emerged as a paradigm to select, combine, generate or adapt single heuristics to solve hard computational problems [7, 38]. In particular, some have been used to solve single-objective bin packing problems [40, 28, 30].
The idea behind a HH is to combine the strengths and countervail the weaknesses of the individual heuristics by exploiting knowledge of the current configuration of a problem in order to decide which heuristic to apply. HHs remove the limitation of always applying one single heuristic to all types of problem instances, and instead generalize the knowledge of the solution process in order to better solve a wide range of problem instances, independently of their specific properties. In the present paper, we extend the use of HHs to tackle a bi-objective version of the defined 2D bin packing problem. For solving this, we propose the eHH-MO (evolutionary Hyper-Heuristics Multi-Objective) framework, which is composed of two levels. The low level is a bi-objective optimization process where a given problem instance is solved considering two objectives: minimizing the number of bins used to allocate the pieces, and minimizing the time required to do that task. In our framework, we solve a low-level problem instance using a given HH. The high level is a bi-objective evolutionary optimization process, which produces a set of Pareto-approximated eHHs (evolutionary hyper-heuristics), which are in charge of solving a wide range of problem instances at the low level. The two objectives at the high level are generalizations of the objectives at the low level, and thus the evolutionary process generates a diversity of eHHs that solve problem instances in manners ranging from fast but wasteful to expensive but space-efficient. We solve the high-level optimization problem using a multi-objective evolutionary algorithm (MOEA).
We have chosen three standard, well-studied MOEAs [10] to be integrated here, namely the Non-dominated Sorting Genetic Algorithm-II (NSGA-II [12]), the Strength Pareto Evolutionary Algorithm 2 (SPEA2 [48]), and the Generalized Differential Evolution algorithm 3 (GDE3 [22]); this yields three eHH-MO models, called eHH-NSGAII, eHH-SPEA2 and eHH-GDE3. In order to test the validity and robustness of the eHH-MO models, we conduct several experiments to compare and analyze the performance at the high-level problem of the three eHH-MO models against the use of single heuristics. To do so, we use a large set of benchmark problem instances (containing convex and non-convex pieces), and different performance metrics, settings and conditions. In addition, our analysis assesses the robustness and flexibility of the framework through the use of the three MOEAs, which employ different strategies to evolve populations of eHHs; despite this fact, the framework produces better results than the use of single heuristics. The contributions of our work are as follows: the extension of the hyper-heuristic paradigm to solve a diversity of instances (convex and non-convex) of the 2D bin packing problem under a bi-objective approach; the inclusion of several MOEAs within the eHH-MO framework to test its flexibility, regardless of the strategies used to create the populations of eHHs; and the tailored adaptation of the mutation, crossover and differential evolution genetic operators, in order to especially suit the generation of populations of eHHs inside the framework. The rest of this paper is organized as follows. Section 2 presents some related work. Section 3 defines the low- and high-level optimization problems tackled in this work and how the concept of HHs is used in that context. Section 4 describes the complete eHH-MO framework. Section 5 presents the setup used for evaluating the models. Section 6 describes the results and their discussion.
Finally, Section 7 is devoted to conclusions and some ideas for future research.

2 Related Work

Hyper-heuristics (HHs) differ from the widely used term metaheuristics in that, instead of searching within the space of solutions, they explore a space of heuristics [42, 34, 36]. The idea is to use a variety of methods to discover algorithms, based on single heuristics, that have good worst-case performance across a range of problems and are fast in execution [37]. It is possible to distinguish two types of HHs: selection HHs, which are methods for choosing or selecting existing heuristics; and generation HHs, which focus on generating new heuristics from components of existing heuristics [7, 4]. The approach presented in this article is of the first type. Few works in the literature have addressed the issue of handling multi-objective problems within a hyper-heuristic paradigm. Burke et al. [9] applied the concept of multi-objective HHs to space allocation and timetabling problems by using a tabu-search-based approach. Veerapen et al. [44] applied a two-phase HH method to the multi-objective travelling salesman problem. Rafique [35] described a multi-objective HH for tackling engineering design problems. De Armas et al. [1] proposed and used a representation methodology to solve multi-objective guillotine cutting problems. Kumari et al. [23] presented a multi-objective HH combined with a genetic algorithm to solve software module clustering problems. More recently, Vazquez-Rodriguez and Petrovic [43] used multiple rank metrics to solve multi-objective optimization problems by combining different MOEAs (such as NSGA-II, SPEA2 and IBEA [47]). Bai et al. [2] presented a HH model for solving 2D shelf space allocation including multiple neighborhoods. Maashi et al.
[31] introduced a learning selection choice function within a multi-objective HH that combines the strengths of three MOEAs (NSGA-II, SPEA2 and MOGA [15]), using them as low-level heuristics. They tested their approach on the Walking Fish Group test suite, consisting of nine multi-objective functions, and on vehicle crashworthiness design, a real-world problem. In [18, 17], Gomez and Terashima-Marín presented the initial attempt to test the use of HHs with the NSGA-II algorithm to solve a set of convex bin packing problems. The bin packing problem has been extensively studied, with particularly intensive research carried out in the past twenty-five years. It is known that, for all non-trivial BPP problems, enumerative search is impractical and the design of heuristic methods is necessary. We mention here recent works closely related to the investigation presented in this article. In the work by Burke et al. [6], a Genetic Programming (GP) system is presented for solving 2D strip packing, evolving constructive heuristics for deciding which piece to pick next and where to place it. Results show that a GP hyper-heuristic can be used to generate human-competitive heuristics for solving the problem. An approach using Genetic Programming for solving the bin packing and knapsack problems is presented by Burke et al. [8], with the main objective of exploring the possibilities of automatically generating heuristics. The same methodology is maintained regardless of the type of instance (one-, two- or three-dimensional problem), providing excellent results when compared against the best human-designed heuristics in the literature using a wide variety of test suites.
Sim et al. [39] describe a lifelong learning approach based on artificial immune systems, in which there is a continuous learning process for evolving both heuristics and problems, in such a way that the existing knowledge is exploited to produce fast solutions; the system adapts to new problems with varying features, and it is capable of generalizing over the problem space. 1D bin packing is one of the problems used for testing, on a large set of instances from the literature, and results show not only excellent performance in terms of quality of solutions, but also with respect to self-adaptation, computational time and generalization. The article by Perumal et al. [3] mainly focuses on surveying the effectiveness and applicability of three categories of heuristic approaches, namely heuristic and approximation algorithms, metaheuristics and hyper-heuristics, on 2D and 3D packing with both single- and multi-objective optimization. A recent work by Martinez-Sykora et al. [32] presents a constructive algorithm for solving, in particular, the 2D bin packing problem. The contribution of that paper is the use of integer programming models to determine the association between pieces and bins, and then the application of a mixed integer programming method for placing the pieces into the bins. It also includes a mechanism for rotating the pieces. Results show that the proposed approach obtains good quality performance when tested on sets of instances with different properties and types.

3 Definition of Problems

In our framework, we have optimization problems at two levels. The low level intends to optimally solve instances of the 2D bin packing problem with convex and non-convex irregular pieces. This bin packing problem BP is defined as: given a set of n 2D, convex and non-convex, irregular pieces E = (e_1, e_2, . . . , e_n), each one with an area a(e_i) ∈ (0, A_0], allocate them into a minimum set B = {b_1, b_2, . . .
, b_b} of b rectangular bins of the same capacity (area) A_0, using a sequence (or combination) of heuristics, for instance H = {h_5 → h_2 → . . . → h_1}. Here H = {h_1, h_2, . . . , h_f} is the set of all possible single heuristics; H* is the set of all possible sequences of heuristics from H, with H ∈ H* and h_i ∈ H. In the sequence H, each heuristic is applied one at a time, and the sequence H suffices to allocate all the pieces into the bins. The objective function for this problem is:

  minimize  z_1(BP) = \sum_{j=1}^{b} w_j    (1)

  s.t.  \sum_{i=1}^{q} a(e_{ji}) \le A_0

where w_j indicates the free space in bin j after allocating all pieces, and a(e_{ji}) is the area of piece i inside bin j, with the restriction that the sum of the areas of the q pieces assigned to bin j does not exceed the area of the bin. We go further and extend the framework to find solutions in terms of a second low-level objective: minimization of the time required to conduct the allocation task. For this objective we have the following function:

  minimize  z_2(BP) = \sum_{i=1}^{|H|} t(h_i)    (2)

where t(h_i) is the application time of heuristic h_i, and |H| is the number of heuristics needed to solve a problem instance. In this research we explore the use of the constructive and selective HH approach for solving instances of the described low-level problem, aiming to provide a solution process that is more general [5, 4, 37]. The central idea is to find combinations of single heuristics such that the combination performs conveniently well across a large set of problem instances, intending to compensate one heuristic's drawbacks with the strengths of another [37]. One HH can be seen as a set of rules of the form state→heuristic, providing the following recipe to solve problem instances: estimate the configuration state of the instance, select the best heuristic given the state and apply it, then update the instance state.
This estimation-application process is repeated until the instance is completely solved. The process outputs the set of heuristics used to solve the instance (for example H = {h_5 → h_2 → h_1}). On the other hand, when a single heuristic is used directly, the output sequence to solve any instance will always be the same, independently of any configuration state. For example, if using heuristic 3 to solve instances, the output will always be H = {h_3 → h_3 → h_3}. The high-level optimization problem in the framework consists of building general HHs able to find low-level optimal solutions for a large number of instances of the bin packing problem. To do this, we define two objective functions that evaluate each HH by averaging over a set of instances, as follows:

  F_1(HH) = \frac{1}{l} \sum_{i=1}^{l} z_1(BP_i)|_{HH}    (3)

  F_2(HH) = \frac{1}{l} \sum_{i=1}^{l} z_2(BP_i)|_{HH}    (4)

where l is the number of instances solved so far by the HH, and BP_i is an instance i. The objectives are in conflict with each other: some HHs will solve problem instances using sequences of single heuristics that take cheap decisions to allocate a piece, having F_2 → 0 but also a large waste of space; whereas other HHs will use sequences of single heuristics that take more expensive decisions, but reduce the waste of space, F_1 → 0. We thus have a bi-objective optimization problem [10], in which a given solution HH, from the set of all possible solutions HH, is evaluated with respect to a vector of objective functions F(HH) = [F_1(HH), F_2(HH)]. The general solution of this minimization problem consists of identifying the set of Pareto-optimal solutions T, defined as follows:

Definition 1 (Pareto Dominance). F(HH) is said to dominate F(HH′) iff F_i(HH) ≤ F_i(HH′) ∀i = 1, . . . , d and ∃i | F_i(HH) < F_i(HH′). The Pareto dominance of F(HH) over F(HH′) is denoted by F(HH) ≼ F(HH′).

Definition 2 (Pareto Optimality). HH ∈ HH is said to be Pareto-optimal iff there is no HH′ ∈ HH such that F(HH′) ≼ F(HH).
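Definitions 1 and 2 translate directly into code. The following is a minimal sketch (not part of the eHH-MO implementation) for objective vectors under minimization:

```python
def dominates(f, g):
    """Pareto dominance (Definition 1): f dominates g iff f is no worse in
    every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

def pareto_set(points):
    """Definition 2: keep only the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]
```

For example, among the bi-objective vectors (1, 3), (2, 2) and (3, 3), the first two are mutually non-dominated and form the Pareto set, while (3, 3) is dominated by (2, 2).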
The union of all these solutions forms the Pareto-optimal set T.

4 Tackling the Bi-Objective Bin-Packing Problem with Multi-Objective Evolutionary Hyper-heuristics

In the eHH-MO (evolutionary Hyper-Heuristics Multi-Objective) framework, the low-level optimization problem is solved by a given eHH (evolutionary hyper-heuristic), and in order to find the Pareto-approximated set (PAS) T of eHHs, which is the solution to the high-level optimization problem, we use two phases: a training one and a test one. In the training phase, a MOEA (a class of methods well suited to solving multi-objective problems [10, 46]) evolves populations of eHHs to find a first set T_train. In the test phase, the T_train set is validated and filtered in order to find a final set T. We describe both phases of the framework graphically in Fig. 1.

Fig. 1 General model showing the training and the test phases.

The whole process in the framework starts by splitting the set of problem instances into a training set and a test set (Fig. 1.a). The training phase starts with the random creation of an initial population of eHHs (Fig. 1.b) for the MOEA. The MOEA evolves this population (Fig. 1.c) using its particular methodology, applying genetic operators (Fig. 1.d) to generate further populations, and using the training set of instances to evaluate the eHHs. The output of the MOEA's evolutionary process (and of the training phase) is a PAS T_train (Fig. 1.e). An eHH is defined here as a set of a variable number of blocks, where each block represents a state→heuristic rule. Each block consists of a vector of ten features expressing an instance state and one label that identifies a heuristic.
The ten features lie in a normalized range between 0 and 1, and their description is given in Table 1. The features used are based on several geometrical properties (area, height, width, size and rectangularity) of the pieces to be allocated, and were adopted from the work by López-Camacho et al. [29]. In that research, a data mining process was used to select the most representative features, out of 23, to characterize an instance. The set has also been used in previous work to determine how these features relate to the performance of heuristic methods for solving the problem [28]. The fraction of the total items remaining is a feature that indicates how far the solution to a given instance has progressed. Fig. 2 shows a graphical depiction of an eHH using the proposed representation.

Fig. 2 Evolutionary hyper-heuristic block representation.

Table 1 Description of each feature in the eHH representation.

1. Number of remaining pieces.
2. Mean area of the remaining pieces.
3. Variance of the area of the remaining pieces.
4. Mean of the rectangularity of the remaining pieces.
5. Variance of the rectangularity of the remaining pieces.
6. Mean of the height of the remaining pieces.
7. Variance of the width of the remaining pieces.
8. Fraction of remaining pieces whose area is above 1/2 of the bin area.
9. Mean of the degree of concavity of the remaining pieces.
10. Fraction of the total items remaining.
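The block representation just described can be sketched as a small data structure. The names `Block` and `random_ehh` are illustrative, not taken from the authors' implementation; the ranges (ten features in [0, 1], labels 1 to 40, and 5 to 20 blocks per eHH) come from the paper:

```python
import random
from dataclasses import dataclass

@dataclass
class Block:
    features: list  # ten normalized state features in [0, 1] (Table 1)
    label: int      # index of one of the 40 single heuristics

def random_ehh(n_heuristics=40, min_blocks=5, max_blocks=20):
    """Create a random eHH: a variable-length list of state->heuristic blocks."""
    return [Block([random.random() for _ in range(10)],
                  random.randint(1, n_heuristics))
            for _ in range(random.randint(min_blocks, max_blocks))]
```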
Second, it finds the closest vector inside the eHH, and the corresponding attached heuristic (Fig. 3.a.2 and 3.a.3), using Euclidean distance. Finally, it applies the found heuristic (Fig. 3.b) and records the heuristic as part of the solution (Fig. 3.c). The process is repeated until all the pieces have been allocated (after applying one heuristic the problem state changes, as shown in Fig. 3.d). The output is the sequence of heuristics used to solve the problem instance.

Fig. 3 Solution process for a bin packing problem instance using an eHH.

The framework creates each eHH in the initial population of the MOEA by generating a random number of blocks (between 5 and 20, selected empirically in previous studies). The features in each block are in turn generated as random numbers between 0 and 1, and the label as a random integer between 1 and 40 (the number of single heuristics used, which are further explained below). The minimum number of blocks ensures the inclusion of a diversity of single heuristics, whereas the maximum restricts the construction of very large eHHs that could contain non-useful rules. Inside the eHH-MO framework we implemented three well-studied MOEAs: the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), the Strength Pareto Evolutionary Algorithm 2 (SPEA2), and the third version of the Generalized Differential Evolution algorithm (GDE3). We thus have three models as instantiations of our general framework: eHH-NSGAII, eHH-SPEA2 and eHH-GDE3.
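The constructive loop just described (sense the state, pick the nearest rule by Euclidean distance, apply its heuristic, repeat) can be sketched as follows. Here each rule is a `(features, label)` pair, and `apply_heuristic` and `is_solved` are hypothetical stand-ins for the real packing routines:

```python
import math

def nearest_rule(state, rules):
    """Select the rule whose feature vector is closest to the current
    instance state under Euclidean distance."""
    return min(rules, key=lambda r: math.dist(state, r[0]))

def solve_instance(state, rules, apply_heuristic, is_solved):
    """Constructive loop: repeatedly pick the nearest rule and apply its
    heuristic until the instance is solved. Returns the labels used."""
    sequence = []
    while not is_solved(state):
        _, label = nearest_rule(state, rules)
        state = apply_heuristic(label, state)
        sequence.append(label)
    return sequence
```

With toy stand-ins (a state that shrinks with each application), the loop returns a heuristic sequence such as H = {h2 → h2 → h5}, mirroring the output described above.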
The mentioned MOEAs share some strategies and differ in others, but all of them produce as output a PAS T_train. One major difference between the MOEAs used is that NSGA-II and SPEA2 rely on mutation and crossover, with certain probabilities m and c respectively, to create new populations of individuals, whereas GDE3 uses differential evolution (DE) [24] for that purpose. DE consists essentially of mixing, with a probability c, scaled proportions of three individuals. A second difference is that NSGA-II and GDE3 use fronts to rank the individuals in the population, based on non-dominance and crowding distance, whereas SPEA2 sorts the individuals using strength raw fitness plus a density estimation. The MOEAs use this sorting to select the best individuals as parents for the next generation. Crowding distance measures how close an individual is to the rest of the population. The strength raw fitness of an individual estimates how many individuals it dominates and by how many it is dominated. Density estimation indicates how close the individual is to its neighbors. The last major difference is that SPEA2 uses an archive to keep track of non-dominated individuals and to create new populations, whereas NSGA-II and GDE3 do not use an archive, and produce new individuals from the previous parent population. For details of how each MOEA works, we refer to the corresponding original publications. The rationale for choosing the mentioned MOEAs is that they are well studied and present a diversity of strategies to evolve populations of Pareto-approximated solutions. We stress that the goal of the current paper is not to test the performance of individual MOEAs, but rather the robustness and flexibility of our framework, in such a way that we could integrate into it any type of MOEA and produce good results, independently of the strategies each MOEA follows to evolve its populations. The condition is that the MOEAs should be able to work with the block representation the eHHs use.
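As an illustration of one of the ranking ingredients mentioned above, the NSGA-II crowding distance can be computed as follows. This is the standard textbook formulation, not code from the eHH-MO implementation:

```python
def crowding_distance(front):
    """NSGA-II crowding distance for one front of objective vectors: for each
    point, sum over objectives the normalized gap between its two neighbors
    when the front is sorted by that objective; boundary points get infinity."""
    n, m = len(front), len(front[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float('inf')
        if hi > lo:
            for j in range(1, n - 1):
                dist[order[j]] += (front[order[j + 1]][k]
                                   - front[order[j - 1]][k]) / (hi - lo)
    return dist
```

Individuals with larger crowding distance lie in less crowded regions of the front and are preferred as parents, which preserves diversity along the Pareto approximation.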
Regarding the genetic operators, we tailored two versions (block-level and internal-level) of crossover, mutation and the DE operator to work with the block representation of the eHHs. The block-level crossover operator exchanges a random number of complete blocks between two randomly selected parent eHHs, in order to create two children eHHs. This random number must be less than the number of blocks in the shorter eHH. The block-level mutation operator deletes a random block from a randomly selected eHH, or creates and inserts a new random block into the eHH. The operator does not delete or add blocks if doing so would take the number of blocks in the selected eHH outside the defined range of 5 to 20. The block-level DE operator copies each parent eHH and then replaces its complete blocks, with a certain probability, with mixtures of complete random blocks of three randomly selected eHHs multiplied by a scaling factor. The internal-level crossover operator interchanges a random number of feature values between a random number of blocks from two random eHHs. The internal-level mutation operator mutates a random number of feature values of a random number of blocks from a randomly selected eHH. The internal-level DE operator copies each parent eHH and replaces the internal feature values of each block, with a given probability, with the mixture of feature values of random blocks from three randomly selected eHHs multiplied by a scaling factor. We show examples of the crossover, mutation and DE operators in Fig. 4, Fig. 5 and Fig. 6. Every time the DE operator is applied, the values of the features (between 0 and 1) and labels (between 1 and 40) are checked to satisfy their corresponding restrictions.
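The two block-level operators can be sketched as follows. Treating an eHH simply as a list of blocks keeps the sketch generic; swapping the first k positions in the crossover is an illustrative choice (the positions could equally be drawn at random), and the details do not claim to match the authors' exact implementation:

```python
import random

def block_crossover(p1, p2, rng=random):
    """Block-level crossover: exchange k complete blocks between two parent
    eHHs, with k less than the length of the shorter parent."""
    k = rng.randint(1, min(len(p1), len(p2)) - 1)
    c1, c2 = list(p1), list(p2)
    for i in range(k):
        c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

def block_mutation(ehh, new_block, rng=random, lo=5, hi=20):
    """Block-level mutation: delete a random block or insert a new random
    block, keeping the number of blocks within [lo, hi]."""
    m = list(ehh)
    if rng.random() < 0.5 and len(m) > lo:
        m.pop(rng.randrange(len(m)))
    elif len(m) < hi:
        m.insert(rng.randrange(len(m) + 1), new_block)
    return m
```

Note that the crossover preserves each child's length and merely redistributes blocks between the parents, while the mutation changes the length by exactly one block when the bounds allow it.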
Fig. 4 Crossover operator at block level (a) and at internal level (b).

During training, when a new eHH is created (in the initial population or after using the genetic operators), it is evaluated by presenting to it l = 5 problems selected at random from the training set. For every problem solved, the low-level objective equations (1) and (2) are estimated, and then combined in the two high-level objective equations (3) and (4) to obtain the initial performance of the new eHH.
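This initial evaluation (the per-instance objectives of equations (1) and (2), averaged as in equations (3) and (4)) can be sketched as follows, with `solve` a hypothetical function returning the pair (z1, z2) for one instance:

```python
import random

def evaluate_ehh(ehh, training_pool, solve, rng=random, l=5):
    """Initial fitness of a new eHH: average the two low-level objectives
    (space-related z1 and time-related z2) over l training instances drawn
    at random, as in the high-level objective functions (3) and (4)."""
    sample = rng.sample(training_pool, l)
    results = [solve(ehh, inst) for inst in sample]  # each solve -> (z1, z2)
    f1 = sum(z1 for z1, _ in results) / l
    f2 = sum(z2 for _, z2 in results) / l
    return f1, f2
```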
The number l = 5 has been empirically found to provide a good estimation of the performance, while keeping the computation time still feasible.

Fig. 5 Mutation operator at block level (c) and at internal level (d): deletion of a randomly selected block (c.1) and addition of a randomly created block (c.2).

Fig. 6 Differential evolution operator at block level (a) and at internal level (b).
Additionally, if during the MOEA's evolutionary process an eHH from the parent population survives to the next generation, it has to solve a new problem selected at random from the training set, and its values for the high-level objective functions are recalculated as follows:

  F_i^g(eHH) = \frac{F_i^{g-1}(eHH) \, n_p + z_i(eHH, tr_{new})}{n_p + 1}, \quad i = 1, 2    (5)

where F_i^g(eHH) is the value of the i-th high-level objective function at generation g, n_p is the number of problems the eHH has solved so far, and z_i(eHH, tr_{new}) is the value of the i-th low-level objective function for the new instance. By solving new instances, we estimate how well the eHHs are generalizing the knowledge of the solution process. The output of the training phase is a PAS T_train = {eHH_1, eHH_2, . . . , eHH_k}. In the test phase of the eHH-MO framework, each eHH of the training PAS is evaluated for generality by solving with it all the problem instances in the previously unseen test set. The framework computes the two low-level objective functions for each test instance solved, adds all the values, and averages them by the size of the test set to obtain the values of the two high-level objective functions. After evaluating all the eHHs, it performs a non-dominance filtering. The output of the test phase is a final PAS T, containing the eHHs from the training phase that were able to generalize the knowledge of the solution process in the test phase.

4.1 Single Heuristics

For the defined bin packing problem, single heuristics are criteria to select the next piece(s) to be placed and the corresponding bin to place it (them) in, and to determine the position of the piece(s) inside the bin. One single heuristic is a combination of two sub-heuristics: one for selection and one for placement. The selection sub-heuristics used in this work are presented below, and are described in more detail in [40] and [26].
– First Fit (FF): Considers the opened bins in a fixed order and places the next piece in the first one where it fits.
– First Fit Decreasing (FFD): Sorts pieces by area and places the largest one according to FF.
– First Fit Increasing (FFI): Sorts pieces by area and places the smallest one according to FF.
– Filler + FFD: Places as many pieces as possible within the open bins. If it has placed at least one piece, the algorithm stops; otherwise it applies the FFD algorithm.
– Next Fit (NF): Puts the next piece in the current bin if it fits; if not, it opens a new one and places the piece there.
– Next Fit Decreasing (NFD): Sorts the pieces by area, and places the largest one according to NF.
– Best Fit (BF): Places the piece in the opened bin where it best fits (i.e., with the minimum waste).
– Best Fit Decreasing (BFD): Same as the previous one, but it sorts the pieces in decreasing order.
– Worst Fit (WF): Places the piece in the opened bin where it worst fits (i.e., with the largest waste).
– Djang and Fitch (DJD): Places pieces in a bin, taking pieces by decreasing size until the bin is at least one third full. It then initializes wt (the allowed waste), and looks for combinations of 1 to 5 pieces producing a waste no greater than wt. If no combination succeeds, it increases wt by a small percentage of the bin size. It opens a new bin if wt reaches the limit of the allowed amount of waste space.

The placement sub-heuristics used in this work, described in detail in [40], are the following:

– Bottom Left (BLI): It puts the piece at the top right of the bin and slides it down and left sequentially until the piece touches a boundary (the bin or another piece).
– Constructive Approach (CA): It puts a first piece at the bottom left of the bin. Then, it places the next piece in one of five positions: (x_max, 0), (0, y_max), (x_max, y_min), (x_min, y_max) and (x_max, y_max), where x_max, y_max, x_min and y_min are the maximum and minimum coordinates in relation to the first piece.
From each position it slides the next piece down and left, and it chooses the position that places the piece deepest.
– CA with Minimum Area (CAA): It selects the best position from the previous heuristic based on which one yields the deepest bounding rectangle, containing all the pieces, with minimum area.
– CA with Maximum Adjacency (CAD): It considers as positions for the first piece only the four corners of the bin. For subsequent pieces, positions are the same as in CA. For each position, it computes the piece's adjacency (the common boundary between its perimeter and the placed pieces and the bin edges). Then, it slides the piece down and left, and computes the adjacency again. It selects the position with the largest adjacency.

Combining the previous selection and placement sub-heuristics yields 40 single heuristics (see Table 2).

5 Setup for Experimental Evaluation

We use for experimentation a set of 26 different types of artificially generated bin packing problem instances with convex (18 types) and non-convex (8 types) irregular pieces. We generated the instances using the methodologies presented in [40, 26]. There are 30 instances per type of problem, totaling 780 instances: 540 containing convex pieces and 240 containing non-convex pieces. The pieces in all of the instances are polygons with three to eight sides. The characteristics of the problem instances are shown in Table 3. Instances of types A to R are convex; instances of types S to Z are non-convex. There is a wide variety of features in the set: from instances containing small pieces (average size of 1/30 of the bin size) to instances containing huge pieces (average size of 1/3 of the bin size). The average piece rectangularity goes from 0.45 to 1. The lower the rectangularity of a piece, the more irregular the piece is; and the more irregular pieces an instance has, the more difficult it is to solve.
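Rectangularity, as used above, can be computed as the ratio of a piece's area to the area of its bounding rectangle; a sketch under the assumption of an axis-aligned bounding box (the authors may instead use a minimum-area rectangle):

```python
def polygon_area(vertices):
    """Polygon area via the shoelace formula; vertices given in order."""
    n = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % n][1]
            - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(s) / 2.0

def rectangularity(vertices):
    """Ratio of piece area to the area of its axis-aligned bounding box:
    1.0 for an axis-aligned rectangle, lower for more irregular pieces."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    box = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return polygon_area(vertices) / box
```

For example, a right triangle covering half of a unit square has rectangularity 0.5.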
The degree of concavity is the concaveness of the largest internal angle of a non-convex piece. The convex hull of a piece is the area of the smallest convex polygon that contains the piece. The optimum number of bins to fit the pieces ranges from 2 to 15. Instances of type G have an unknown optimal number of bins, since they were produced using random alterations to problems with known optimum¹.

Single heuristics   Selection    Placement (in order)
1-4                 FF           BLI, CA, CAA, CAD
5-8                 FFD          BLI, CA, CAA, CAD
9-12                FFI          BLI, CA, CAA, CAD
13-16               Filler+FFD   BLI, CA, CAA, CAD
17-20               NF           BLI, CA, CAA, CAD
21-24               NFD          BLI, CA, CAA, CAD
25-28               BF           BLI, CA, CAA, CAD
29-32               BFD          BLI, CA, CAA, CAD
33-36               WF           BLI, CA, CAA, CAD
37-40               DJD          BLI, CA, CAA, CAD

Table 2 List of possible single heuristics as a combination of one selection and one placement sub-heuristic.

We designed eight experiments by arranging the problem instances into different training and test sets:

1. We mixed, shuffled, and split instances of non-convex type into two sets of 120 instances each. We used the first set for training and the second for testing.
2. We exchanged the training and test sets from the first experiment.
3. We mixed, shuffled, and split instances of convex type into two sets of 270 instances each. We used the first set for training and the second for testing.
4. We exchanged the training and test sets from the third experiment.
5. We mixed, shuffled, and split all instances into two balanced sets of 390 instances each. We used the first set for training and the second for testing.
6. We exchanged the training and test sets from the fifth experiment.
7. We used all non-convex instances as the training set, and all convex instances as the test set.
8. We exchanged the training and test sets from the seventh experiment.
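The mixed experiments above are plain shuffle-and-split operations; a sketch of how one such split and its mirrored counterpart (e.g. experiments 5 and 6) can be generated (function name and seeding are illustrative):

```python
import random

def split_experiment(instances, seed=0):
    """Shuffle a pool of instances and split it into two balanced halves.
    Returns the (train, test) pair and its mirrored (test, train) pair,
    corresponding to an experiment and its exchanged version."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    pool = list(instances)
    rng.shuffle(pool)
    half = len(pool) // 2
    train, test = pool[:half], pool[half:]
    return (train, test), (test, train)
```

With the 780 instances of this work, each half contains 390 instances, as in experiments 5 and 6.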
We designed the experiments to test the models under different circumstances, trying to avoid that the properties of a certain type of instance dominate the solution process.

¹ The set of problem instances used in this work can be found at http://paginas.fe.up.pt/~esicup/datasets (Terashima1-Terashima2)

Type         Instances  Pieces  Avg. rect.  Avg. concavity degree  Avg. area/convex hull  Optimum
Convex
A            30         30      0.70        1                      1                      3
B            30         30      0.87        1                      1                      10
C            30         36      0.68        1                      1                      6
D            30         60      0.57        1                      1                      3
E            30         60      0.41        1                      1                      3
F            30         30      0.59        1                      1                      2
G            30         36      0.87        1                      1                      unknown
H            30         36      0.86        1                      1                      12
I            30         60      1.00        1                      1                      3
J            30         60      0.83        1                      1                      4
K            30         54      0.63        1                      1                      6
L            30         30      0.51        1                      1                      3
M            30         40      0.55        1                      1                      5
N            30         60      0.62        1                      1                      2
O            30         28      0.57        1                      1                      7
P            30         56      0.49        1                      1                      8
Q            30         60      0.89        1                      1                      15
R            30         54      0.63        1                      1                      9
Non-convex
S            30         17-20   0.45        1.16                   0.918                  2
T            30         30-40   0.60        1.24                   0.916                  10
U            30         20-33   0.55        1.19                   0.888                  5
V            30         15-18   0.62        1.09                   0.936                  5
W            30         24-28   0.78        1.12                   0.931                  4
X            30         25-39   0.66        1.17                   0.895                  3
Y            30         40-50   0.61        1.09                   0.943                  6
Z            30         60      0.54        1.09                   0.940                  12

Table 3 Features of the set of problem instances used for experimentation.

Heur.  Score    Heur.  Score    Heur.  Score    Heur.  Score
1      1.0E-3   11     4.8E-2   21     3.4E-4   31     7.1E-2
2      4.9E-2   12     5.2E-2   22     2.1E-2   32     7.7E-2
3      4.7E-2   13     3.5E-3   23     2.1E-2   33     4.9E-4
4      5.9E-2   14     3.2E-1   24     2.4E-2   34     2.5E-2
5      8.9E-4   15     3.3E-1   25     8.3E-4   35     2.5E-2
6      6.6E-2   16     3.1E-1   26     4.9E-2   36     3.0E-2
7      6.9E-2   17     3.4E-4   27     4.4E-2   37     3.6E-2
8      7.3E-2   18     1.5E-2   28     5.4E-2   38     1.0E+0
9      7.9E-4   19     1.5E-2   29     8.9E-4   39     9.2E-1
10     4.6E-2   20     1.8E-2   30     7.1E-2   40     9.0E-1

Table 4 Normalized time scores for the single heuristics.

The time used in the low-level objective equation 2 is a time score that indicates how expensive a single heuristic is when performing its selection and placement tasks.
The scores correspond to the real times of the heuristics after solving all of the 780 problem instances, normalized by the maximum real time among all single heuristics. The scores for the single heuristics are shown in Table 4, with a score of 1 indicating the most expensive heuristic. Every time an eHH uses a given heuristic, the framework adds the time score of this heuristic. When computing the second high-level objective, it averages the total score by the number of instances the eHH has solved. When the training and test phases are completed, it normalizes the second high-level objective by the maximum value for this objective in a given experiment.

We performed 10 independent runs of the complete process with each one of the three models eHH-NSGAII, eHH-SPEA2 and eHH-GDE3, for each one of the eight experiments. The settings used in all the runs were 60 individuals, 100 generations, crossover (and DE in eHH-GDE3) probability of 0.9, mutation probability of 0.1, eHH-SPEA2's archive size of 60 and eHH-GDE3's scaling factor of 0.5. We defined these parameters empirically as averages of the parameters of all the methods over a validation set, using grid search and optimizing the hypervolume metric (see next section). The complete eHH-MO framework with all the individual methods was implemented in Java, and we conducted all the experiments on a desktop PC with a 3.4 GHz Core i7 processor and 16 GB of RAM.

5.1 Baseline Model

We assess the performance of the eHH-MO models against a baseline model consisting of directly using the single heuristics (SH model). The SH model does not have a training phase; it works directly on the set of test problem instances.
The SH model solves all the test instances with every single heuristic, adding the values for the two low-level objective functions for every instance the heuristic solves, and dividing the totals by the size of the test set to obtain the values for the high-level objective functions. Once the SH model evaluates all the single heuristics, it runs a non-dominance filtering to form a PAS T. Contrary to the eHH-MO models, the PAS produced by the SH model for a given test set does not change between runs, since this model always solves the problem instances in a deterministic way.

5.2 Metrics for Evaluation

Each PAS T produced by a model (eHH-MO models or the SH model) can also be expressed as a set of projected points in the space of the high-level objective functions as F = {F_1, F_2, ..., F_k}, where F_i = [F_1(eHH_i), F_2(eHH_i)]. The values of the objective functions are always normalized to be between 0 and 1. A set F of points represents a Pareto-approximated Front (PAF).

We evaluate each PAF produced by a model using three performance metrics [21]: generational distance (GD) [12, 41] (convergence metric), hypervolume (HV) [50] (convergence-diversity metric), and ∆ [12] (distribution and spread metric). The GD metric measures the degree of proximity based on the distance between the points in F and those in the reference set RF. It is defined as:

GD(F, RF) = ( Σ_{i=1}^{|F|} d_i^q )^{1/q} / |F|,  where d_i = min_{r ∈ RF} ||F_i − r||, F_i ∈ F and q = 2.

Thus d_i is the smallest Euclidean distance from F_i to the closest point in RF.

The HV is defined as: HV(F, RP) = L( ∪_{i=1}^{|F|} HV(F_i) : F_i ∈ F ), and it measures the volume of the objective function space covered by a PAF F, bounded by a reference point RP. L refers to the Lebesgue measure. We define in this work the reference point as RP = [1, 1], since the values of the objective functions are normalized, and the objectives are to be minimized. Bigger values of HV are better.
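The non-dominance filtering used to form a PAS, the GD metric, and the two-dimensional HV just defined can be sketched as follows (a simplified illustration for minimized, normalized bi-objective points; the metrics in this work were computed with the MOEA framework code):

```python
import math

def pareto_filter(points):
    """Keep the points not dominated by any other point
    (both objectives are minimized)."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

def gd(front, reference, q=2):
    """Generational distance: (sum of d_i^q)^(1/q) / |front|, where d_i is
    the smallest Euclidean distance from a front point to the reference set."""
    dists = [min(math.dist(p, r) for r in reference) for p in front]
    return sum(d ** q for d in dists) ** (1.0 / q) / len(front)

def hypervolume_2d(front, ref=(1.0, 1.0)):
    """Area dominated by a 2-D non-dominated front, bounded by `ref`:
    sweep the points by increasing first objective and add the new
    rectangle each point contributes."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(front):
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv
```

For instance, the front {(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)} has HV = 0.37 with respect to RP = [1, 1].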
The ∆ metric considers the distribution and the spread of the PAF F simultaneously. It is defined as:

∆(F, RF) = ( d_f + d_l + Σ_{i=1}^{|F|−1} |d_i − d̄| ) / ( d_f + d_l + (|F|−1) d̄ ),

where d_i is the Euclidean distance between consecutive points and d̄ is the average of the d_i. The terms d_f and d_l are the minimum Euclidean distances from points in F to the extreme points of the reference set RF. Smaller values of ∆ are better.

As the reference set RF for GD and ∆ we use the non-dominated combination of point sets from the PAFs of all the runs of all models of all the experiments.

After computing the metrics, we apply pair-wise Kruskal-Wallis statistical significance tests among all the models over the metric values per experiment. We use a significance level of α = 0.05 and the Bonferroni correction. We implemented the metrics based on the code of the MOEA framework², and we conducted the statistical tests with Matlab.

6 Results

Fig. 7 shows the values of the three metrics GD, HV and ∆ for the eHH-MO models and the SH model for all the experiments. We summarize the values of the metrics along the different runs as box plots. Higher values for HV, and lower values for GD and ∆, indicate better performance.

In this figure we observe similar general patterns of performance across metrics for the models. As a general trend, in all the metrics the eHH-MO models show a better performance than the SH model. There are some deviations where the SH model has a bigger overlap in performance with the eHH-MO models, such as in experiment 3 for GD; experiments 1, 4, 5, 6 and 8 for HV; and experiments 3, 4, 6 and 7 for ∆. But even in such cases, the eHH-MO models perform at least as well as the SH model and are generally better. The good performance of the eHH-MO models is maintained even in experiments 7 and 8, where we completely exclude a type of problem instance (convex or non-convex) from the training phase.
This scenario is the most complicated, since the models do not have complete information about the types of problems to solve when building the rules of the solution process. When comparing the box plots of the eHH-MO models among them, we observe that eHH-SPEA2 tends to obtain better values for GD and HV than the other two models, while eHH-GDE3 tends to have better values for ∆.

Table 5 shows the results of the pair-wise Kruskal-Wallis tests conducted over the estimated values for the metrics of all the models.

² Available at: http://moeaframework.org/index.html

Each row represents one model, and the columns are split into four blocks of three columns. Each block corresponds to a model and each column of the block to a given estimated
metric for that model. The numbers in each cell of the table indicate the experiments where a model from the rows performs significantly better than a model from the columns for the given metric. The '-' symbol indicates a range of values (i.e. 1-7 means from 1 to 7).

Fig. 7 Box plots for generational distance (A), hypervolume (B) and ∆ (C) for all the models for all the experiments.

Better than ⇒  eHH-NSGAII   eHH-SPEA2    eHH-GDE3           SH
⇑              GD  HV  ∆    GD  HV  ∆    GD      HV    ∆    GD         HV     ∆
eHH-NSGAII     –   –   –    ∅   ∅   ∅    ∅       ∅     ∅    1,2,4,6-8  2,3,7  1,5
eHH-SPEA2      ∅   4   ∅    –   –   –    1,3,4,7 1,3-7 ∅    1-7        1-8    1
eHH-GDE3       ∅   ∅   ∅    ∅   ∅   3    –       –     –    5          2,3    1-3,8
SH             ∅   ∅   ∅    ∅   ∅   ∅    ∅       ∅     ∅    –          –      –

Table 5 Results of the pair-wise Kruskal-Wallis tests among all the runs of all the models.

These tests confirm that, across metrics, the eHH-MO models perform significantly better than the SH model in several experiments. Regarding the comparison among eHH-MO models, eHH-SPEA2 performs significantly better than eHH-GDE3 for GD and HV in some experiments, while eHH-GDE3 significantly outperforms eHH-SPEA2 for ∆ in one experiment.

Since all runs of the eHH-MO models are independent, we summarize the PASs of these models by merging all solutions of the different PASs per run, per model, and per experiment, and we apply a non-dominance filtering to the summarized PASs. We show in Fig.
8 the corresponding (summarized) PAFs for all the models. The points in the summarized PAFs (and the plots) represent the projection of the solutions of each model's summarized PAS (eHHs for the eHH-MO models and single heuristics for the SH model) onto the space of high-level objective functions. In this figure, we see similar patterns for the summarized PAFs along the experiments. The eHH-MO models produce PAFs with more solutions, covering more of the objective function space than the SH model. This means that the eHH-MO models offer more alternatives for solving the problem instances. We also see that the solutions found by the eHH-MO models clearly dominate the solutions of the SH model. When comparing the PAFs of the eHH-MO models among them, we observe that eHH-SPEA2 tends to have more points in its PAFs than the other two models.

Table 6 shows the metric values for the summarized PAFs of Fig. 8. In this table we see that the eHH-MO models present better values than SH for GD and HV, but not for ∆. This is contrary to what we observe in the box plots of Fig. 7, but it is due to the fact that, when combining solutions from different runs, part of the good distribution of points is lost. Nevertheless, as pointed out by Jiang et al. [21], the number of solutions should be the first metric used to compare the outputs of models in multi-objective optimization problems. To check that, as a complement to the previous figure and table, Table 7 shows a comparison of the number of solutions that form the summarized PASs of all the models along the experiments, together with the average number of blocks (and its standard deviation) inside the eHHs that are members of the summarized PASs for the eHH-MO models.
Fig. 8 Summarized PAFs of all the models. The x-axis indicates objective 1 (waste of space) and the y-axis objective 2 (allocation time).

      ------------- GD -------------    ------------- HV -------------    ------------- ∆ --------------
Exp.  NSGAII  SPEA2   GDE3    SH        NSGAII  SPEA2   GDE3    SH        NSGAII  SPEA2   GDE3    SH
1     0.001   0.001   0.003   0.008     0.649   0.658   0.641   0.638     0.954   1.216   1.331   1.149
2     0.001   0.000   0.002   0.007     0.677   0.680   0.672   0.658     1.052   1.221   1.256   1.113
3     0.003   0.007   0.014   0.010     0.644   0.664   0.642   0.626     1.363   1.235   1.259   1.074
4     0.004   0.001   0.007   0.027     0.615   0.623   0.609   0.606     1.219   1.355   1.236   0.944
5     0.007   0.008   0.003   0.010     0.652   0.653   0.637   0.630     1.179   1.187   1.357   1.110
6     0.002   0.005   0.004   0.010     0.636   0.647   0.633   0.622     1.372   1.303   1.338   1.087
7     0.004   0.003   0.005   0.026     0.629   0.632   0.622   0.608     0.958   1.267   1.381   0.929
8     0.001   0.007   0.002   0.008     0.661   0.666   0.658   0.648     1.296   1.326   1.360   1.147
Av.   0.003   0.004   0.005   0.013     0.645   0.653   0.639   0.630     1.174   1.264   1.315   1.069

Table 6 Metric values for the summarized PAFs of all the models (NSGAII, SPEA2 and GDE3 denote the corresponding eHH-MO models).

      eHH-NSGAII         eHH-SPEA2          eHH-GDE3           SH
Exp.  Sols.  Blocks      Sols.  Blocks      Sols.  Blocks      Sols.
             (Avg±Std)          (Avg±Std)          (Avg±Std)
1     45     8.8±3.1     58     8.3±2.7     29     9.2±3.5     14
2     38     7.6±2.4     61     9.6±2.8     22     7.5±2.5     13
3     42     9.4±2.9     64     9.6±3.1     23     9.6±4.0     13
4     36     7.1±1.8     54     8.8±3.0     30     9.1±3.2     13
5     40     9.0±3.3     62     9.0±3.0     28     8.4±2.9     15
6     47     9.8±3.0     58     9.1±3.0     28     9.1±4.2     13
7     37     9.4±3.3     43     8.3±2.8     32     9.7±3.7     13
8     31     8.3±2.9     55     7.2±2.5     31     10.4±3.5    14
Avg   40     8.7±2.8     57     8.7±2.9     28     9.1±3.4     14

Table 7 Number of solutions in the summarized PASs for all the models, and average number of blocks (and standard deviation) inside the eHHs of the eHH-MO models.

We observe that the summarized PASs of the eHH-MO models contain more than twice as many solutions as the PASs of SH. When comparing the eHH-MO models among them, we corroborate what we see in the summarized PAFs of Fig. 8: eHH-SPEA2 produces more solutions than the other two models. In this table we also observe that the eHH-MO models produce eHHs with a similar number of blocks (rules).

Fig. 9 shows the frequency of usage of the different single heuristics by the eHHs that are members of the summarized PASs of the eHH-MO models along the experiments. We scaled the frequencies in the plots by the number of solutions produced by each model.
We observe that the most used single heuristics in eHH-NSGAII are 5, 8, 21, 29, and 32, with other heuristics like 13, 17, 33 and 40 being used to a lesser degree. In eHH-SPEA2 the most used single heuristics are 5, 8, 13, 21, 29, 32, 33 and 37, with other heuristics like 17 and 40 being used to a lesser degree. Finally, in eHH-GDE3 the most used heuristics are 8, 21, 29, 32, and 33, with other heuristics like 1, 5, 13, 16, 25 and 40 being used to a lesser degree. There are similar distributions and diversity of usage of single heuristics among the eHH-MO models, with eHH-NSGAII and eHH-GDE3 sharing more similarity. The reason for this could be that NSGA-II and GDE3 employ ranked fronts based on crowding distance as the selection mechanism for creating new populations of eHHs, thus affecting the preference for such heuristics.

According to Table 2, the single heuristics 5, 13, 17, 21, 29, 33 and 37 are based on the BLI placement sub-heuristic, while heuristics 8, 32 and 40 are based on the CAD placement sub-heuristic. This shows that BLI, despite its simplicity, is very effective, whilst CAD introduces a good semi-complex strategy to accommodate the pieces. Additionally, single heuristics 8, 21, 29 and 32 appear with similar usage across models, indicating their relevance when building solutions for the defined bin packing problem.

Fig. 9 Frequency of usage of the single heuristics inside the eHHs that form the PASs of the eHH-MO models.
Heuristics that are not used, or that present limited usage, contribute a performance that is dominated by the performance of combinations of the used heuristics.

Table 8 shows the single heuristics that conform the PASs of the SH model. We observe that almost the same heuristics (5, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36 and 39) appear across all the experiments. In addition, there is very little overlap with the heuristics that are commonly used by the eHH-MO models, with only heuristics 29 and 32 appearing with frequent usage in both the SH and the eHH-MO models. All of this is a clear indication that the eHH-MO models are indeed exploiting the strengths of the different single heuristics when solving problem instances, by being able to identify which single heuristics perform better in combination, instead of always using the same one.

Exp.  Single heuristics                                        Sols.
1     3, 5, 6, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36, 39      14
2     3, 5, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36, 39         13
3     5, 6, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36, 39         13
4     5, 7, 12, 15, 20, 24, 27, 28, 29, 31, 32, 36, 39        13
5     3, 5, 6, 7, 12, 20, 24, 27, 28, 29, 30, 31, 32, 36, 39  15
6     3, 5, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36, 39         13
7     5, 7, 12, 15, 20, 24, 27, 28, 29, 31, 32, 36, 39        13
8     3, 5, 6, 7, 12, 20, 24, 27, 28, 29, 31, 32, 36, 39      14

Table 8 Single heuristics in the PAS of the SH model.

In order to understand the effect of the most used heuristics across eHH-MO models on forming their PASs, we perform extra experiments by running all the eHH-MO models using only heuristics 8, 21, 29 and 32. We run the same experiments with the same setup as before, summarizing the PASs using non-dominance filtering and estimating the defined metrics. The reference set used to calculate the GD and ∆ metrics was obtained by combining all the non-dominated points from the PAFs of the experiments using the whole set of single heuristics and the experiments using only heuristics 8, 21, 29 and 32.

      ---------- GD ----------    ---------- HV ----------    ---------- ∆ -----------    -------- Sols. ---------
Exp.  NSGAII  SPEA2   GDE3        NSGAII  SPEA2   GDE3        NSGAII  SPEA2   GDE3        NSGAII  SPEA2   GDE3
1     0.001   0.000   0.001       0.649   0.651   0.646       1.002   1.012   1.043       50      67      40
2     0.000   0.000   0.001       0.678   0.680   0.676       0.981   1.012   1.014       56      65      37
3     0.002   0.001   0.003       0.647   0.646   0.645       0.951   0.945   0.983       49      66      35
4     0.003   0.002   0.005       0.617   0.619   0.614       0.955   0.937   0.967       50      70      36
5     0.002   0.001   0.002       0.644   0.645   0.641       0.998   0.967   1.068       53      81      39
6     0.002   0.001   0.003       0.637   0.637   0.636       1.017   0.982   1.037       51      60      33
7     0.003   0.002   0.004       0.630   0.630   0.628       1.021   0.949   0.998       59      65      30
8     0.001   0.000   0.000       0.662   0.665   0.656       1.030   0.991   1.101       54      66      38
Av.   0.002   0.001   0.002       0.646   0.647   0.643       0.994   0.974   1.026       53      68      36

Table 9 Metric values for the summarized PAFs and number of solutions in the PASs of all the eHH-MO models when using only the single heuristics 8, 21, 29 and 32 (NSGAII, SPEA2 and GDE3 denote the corresponding eHH-MO models).

Table 9 shows the metric values and the number of solutions in the PAS of each eHH-MO model. If we compare the results of Table 9 with those in Tables 6 and 7, we observe that the performance of the eHH-MO models is almost unaffected by removing the least used single heuristics. Indeed, the average metric values for the generational distance and ∆, and the number of solutions, improve. This makes clear the importance of single heuristics 8, 21, 29 and 32 for the eHH-MO models, and shows that the use of other heuristics is indeed dominated by the combined use of these heuristics.

To illustrate the way an eHH works in comparison with a single heuristic, we present in Table 10 the summarized steps of the solution process, by the two methods, of a problem instance that contains 18 non-convex pieces. We chose as the single heuristic number 12 (one of the most used by the SH model), and a random eHH produced by the eHH-SPEA2 model.
The table shows the step number in the solution process and, for that step: the heuristic applied, the remaining pieces to allocate, the number of bins used up to that moment, the normalized values for the low-level objectives 1 (space) and 2 (allocation time), and the values for the ten features that describe the state of the problem in that step. We observe how the values of the features change with every step and how the eHH applies different heuristics depending on the problem state. The eHH reaches a better value for equation 1, whilst both methods use the same total allocation time. In that case, the chosen eHH would dominate single heuristic 12.

             Rem.    Used                Features
Step  Heur.  pieces  bins  z1    z2     1     2     3     4     5     6     7     8     9     10
Single heuristic (A)
1     12     17      1     1.00  0.00   0.03  0.41  0.38  0.34  0.20  0.13  0.40  0.23  0.56  0.94
5     12     13      1     0.91  0.01   0.03  0.52  0.29  0.40  0.14  0.14  0.10  0.30  0.63  0.72
10    12     8       3     0.79  0.03   0.02  0.66  0.23  0.43  0.14  0.18  0.12  0.48  0.82  0.44
15    12     3       5     0.57  0.04   0.01  0.92  0.17  0.52  0.07  0.22  0.09  1.28  1.85  0.17
18    12     0       8     0.58  0.05   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
eHH (B)
1     32     17      1     0.41  0.00   0.03  0.34  0.27  0.32  0.16  0.12  0.59  0.15  0.27  0.94
5     32     13      4     0.54  0.02   0.03  0.23  0.13  0.27  0.13  0.10  0.68  0.00  0.24  0.72
10    8      8       6     0.47  0.03   0.02  0.11  0.04  0.26  0.18  0.07  0.72  0.00  0.16  0.44
15    8      3       6     0.31  0.05   0.01  0.01  0.00  0.15  0.23  0.08  0.07  0.00  0.01  0.17
18    8      0       6     0.27  0.05   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00

Table 10 Intermediate states for the solution of a problem instance by a single heuristic (A) and an eHH (B).

Finally, regarding the differences in performance between the eHH-MO models, these are due to the different mechanisms the MOEAs use to evolve populations of eHHs.
As seen in Section 4, eHH-SPEA2 differs from the other two in the use of strength raw fitness plus density estimation to sort eHHs, which, judging by the number of solutions produced by eHH-SPEA2, seems to have a more positive impact than using fronts sorted by non-dominance, as in eHH-NSGAII and eHH-GDE3. In addition, it seems that mutation and crossover, used by eHH-SPEA2 and eHH-NSGAII, have a more positive impact than DE when working with the block representation of the eHHs. Nevertheless, the intention of the current paper is not to test individual MOEAs, but rather to test the robustness and flexibility of the framework: as we have seen, despite the differences in strategies, all of the eHH-MO models produce good results that outperform those of the SH model.

7 Conclusions and Future Work

In this paper we have presented a two-level framework, called multi-objective evolutionary hyper-heuristics (eHH-MO), for generalizing the solution process of a bi-objective 2D bin packing problem with convex and non-convex irregular pieces. The two low-level objectives consist of minimizing the waste of space inside the bins used to allocate the pieces, while simultaneously minimizing the time required to allocate all pieces inside the bins. The high level is a bi-objective evolutionary optimization process that is solved using a MOEA. By implementing three different MOEAs we created three instantiations of our general eHH-MO framework: eHH-NSGAII, eHH-SPEA2 and eHH-GDE3. These models produce a set of Pareto-approximated eHHs (as sets of rules) that solve problem instances in different ways.

The results indicate that the eHH-MO models are well suited to build sets of rules that are able to solve the defined bi-objective bin packing problem in a large variety of manners.
We have compared the results of several runs of the eHH-MO models against those obtained by a baseline set of single heuristics (SH model) under eight experiments, using the GD, HV and ∆ metrics to measure convergence, diversity, distribution and spread of solutions in the Pareto-approximated set (PAS) of each model. We have seen that the eHH-MO models significantly outperform the SH model in the different metrics in several experiments. When we summarized the PASs of all the models per experiment, we observed that the eHH-MO models outperform the SH model in the number of solutions and in several metric values. The eHH-MO models produce summarized PASs that contain more than twice the number of solutions in the PASs of SH, with the solutions of the eHH-MO models covering more space of the high-level objective functions and clearly dominating the solutions in the PASs of the SH model. When comparing the eHH-MO models among them, we have observed that eHH-SPEA2 produces more solutions than the other two models. This means that the use of strength raw fitness plus density estimation to sort eHHs, and the use of mutation and crossover for generating new eHHs using the block representation, have a more positive impact than using fronts sorted by non-dominance or using differential evolution. Nevertheless, we have assessed the robustness and flexibility of our eHH-MO framework, since despite the different strategies used by the different MOEAs, it is able to produce better results than the SH model. In addition, it was established that not all the single heuristics contribute to the formation of the PASs in the eHH-MO models; in particular, single heuristics 8, 21, 29 and 32 proved to be sufficient to produce results similar, in terms of metric values and number of solutions, to those obtained when using the whole set of single heuristics. This means that the behavior of the other single heuristics is dominated by the combined use of these particular heuristics.
The present work opens up several interesting research directions. One idea to explore is to solve bin packing problems that contain irregular (and regular) pieces in 3D. This type of problem naturally has a higher degree of complexity than working with two-dimensional pieces, and we think that using eHHs is a good approach to solve it. A second topic to consider is to include more objectives in the problem. Examples of other objectives could be optimizing the total weight inside a bin (adding a weight feature for each piece), the balance based on the weight inside the bin, the number of pieces per bin, or the heterogeneity inside the bin. Adding more objectives also adds complexity to the search space, and requires more time to find a good Pareto-approximated set of solutions. Additionally, experiments including other MOEAs and other ways to create new populations could help to further understand the generality of the framework and the behavior of specific strategies to evolve populations of eHHs using the defined block representation. Regarding the eHHs, an interesting topic is to explore the use of different features to represent the state of a given problem instance; an example of other features could be the geometrical properties of the free space inside the bins. Another issue to explore is to find a different mechanism to determine which action to take given the instance state, instead of the Euclidean distance. Research on similarity or distance metrics would be interesting, as well as the possibility of learning such metrics automatically. It would also be interesting to compare the approach against other ways of building multi-objective hyper-heuristics, such as the choice function, genetic programming or even other metaheuristic methods.
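The Euclidean-distance mechanism mentioned above, by which a hyper-heuristic picks the single heuristic to apply at each decision point, can be sketched as follows. This is a simplified illustration under our own naming assumptions (the rule format and feature vectors are schematic, not the exact ones used in the experiments):

```python
import math

def select_heuristic(rules, state):
    """rules: list of (state_vector, heuristic_id) pairs forming a
    hyper-heuristic. state: feature vector describing the current problem
    state. Returns the heuristic of the rule closest to the state in
    Euclidean distance."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    return min(rules, key=lambda rule: dist(rule[0], state))[1]

# Schematic usage: two rules over a 2-feature state; the state [0.2, 0.8]
# is nearest to the first rule, so its heuristic is applied.
rules = [([0.1, 0.9], 8), ([0.7, 0.3], 21)]
assert select_heuristic(rules, [0.2, 0.8]) == 8
```

Replacing `dist` with a learned similarity function is precisely the kind of extension suggested in the paragraph above.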
Acknowledgements This research was supported in part by CONACyT Basic Science Projects under grants 99695 and 241461, the ITESM Research Group with Strategic Focus in Intelligent Systems, and the University of Guanajuato Campus Irapuato-Salamanca.

References

1. de Armas, J., Miranda, G., León, C.: Hyperheuristic encoding scheme for multi-objective guillotine cutting problems. In: GECCO, pp. 1683–1690 (2011). DOI 10.1145/2001576.2001803
2. Bai, R., Woensel, T.V., Kendall, G., Burke, E.K.: A new model and a hyper-heuristic approach for two-dimensional shelf space allocation. 4OR 11(1), 31–55 (2013)
3. Boominathanperumal, Rishinhaldar, Rajkumar, S.: Bin packing problems: Comparative analysis of heuristic techniques for different dimensions. International Journal of Pharmacy and Technology 8(2), 13,305–13,319 (2016)
4. Burke, E.K., Gendreau, M., Hyde, M., Kendall, G., Ochoa, G., Özcan, E., Qu, R.: Hyper-heuristics: a survey of the state of the art. Journal of the Operational Research Society 64(12), 1695–1724 (2013). DOI 10.1057/jors.2013.71
5. Burke, E.K., Hart, E., Kendall, G., Newall, J., Ross, P., Schulenburg, S.: Hyper-heuristics: An emerging direction in modern search technology. In: Handbook of Metaheuristics, pp. 457–474. Kluwer Academic Publishers (2003). DOI 10.1007/0-306-48056-5_16
6. Burke, E.K., Hyde, M., Kendall, G., Woodward, J.: A genetic programming hyper-heuristic approach for evolving 2-d strip packing heuristics. IEEE Transactions on Evolutionary Computation 14(6), 942–958 (2010)
7. Burke, E.K., Hyde, M.R., Kendall, G., Ochoa, G., Özcan, E., Woodward, J.: A Classification of Hyper-heuristic Approaches, International Series in Operations Research & Management Science, vol. 146, pp. 449–468. Springer US (2010). DOI 10.1007/978-1-4419-1665-5_15
8. Burke, E.K., Hyde, M.R., Kendall, G., Woodward, J.: Automating the packing heuristic design process with genetic programming. Evol. Comput. 20(1), 63–89 (2012). DOI 10.1162/EVCO_a_00044.
URL http://dx.doi.org/10.1162/EVCO_a_00044
9. Burke, E.K., Silva, J.D.L., Soubeiga, E.: Multi-Objective Hyper-Heuristic Approaches for Space Allocation and Timetabling, Operations Research/Computer Science Interfaces Series, vol. 32, chap. 6, pp. 129–158. Springer-Verlag (2005). DOI 10.1007/0-387-25383-1_6
10. Coello, C.A., Van Veldhuizen, D.A., Lamont, G.B. (eds.): Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd edn. Springer Verlag New York, Syracuse, USA (2007)
11. Crispin, A., Clay, P., Taylor, G., Bayes, T., Reedman, D.: Genetic algorithms applied to leather lay plan material utilization. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture 217(12), 1753–1756 (2003). DOI 10.1243/095440503772680677
12. Deb, K., Pratap, A., Agrawal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
13. Dowsland, K.A., Dowsland, W.B.: Solution approaches to irregular nesting problems. European Journal of Operational Research 84(3), 506–521 (1995). DOI 10.1016/0377-2217(95)00019-M
14. Dyckhoff, H.: A typology of cutting and packing problems. European Journal of Operational Research 44(2), 145–159 (1990). DOI 10.1016/0377-2217(90)90350-K
15. Fonseca, C., Fleming, P.: Multiobjective optimization and multiple constraint handling with evolutionary algorithms. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 28(1), 26–37 (1998)
16. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)
17. Gomez, J.C., Terashima-Marín, H.: Approximating Multi-Objective Hyper-Heuristics for Solving 2D Irregular Cutting Stock Problems, Lecture Notes in Computer Science, vol. 6438, chap. 30, pp. 349–360. Springer Berlin Heidelberg (2010). DOI 10.1007/978-3-642-16773-7_30
18.
Gomez, J.C., Terashima-Marín, H.: Building general hyper-heuristics for multi-objective cutting stock problems. Computación y Sistemas 16(3), 321–334 (2012)
19. Goodman, E.D., Tetelbaum, A.Y., Kureichik, V.M.: A genetic algorithm approach to compaction, bin packing, and nesting problems. Tech. Rep. 940702, Case Center for Computer-Aided Engineering and Manufacturing, Michigan State University (1994)
20. Hu-yao, L., Yuan-jun, H.: NFP-based nesting algorithm for irregular shapes. In: Symposium on Applied Computing, pp. 963–967. ACM Press, New York, NY, USA (2006). DOI 10.1145/1141277.1141507
21. Jiang, S., Ong, Y.S., Zhang, J., Feng, L.: Consistencies and contradictions of performance metrics in multiobjective optimization. IEEE Trans. Cybern. 44(12), 2391–2404 (2014). DOI 10.1109/TCYB.2014.2307319
22. Kukkonen, S., Lampinen, J.: GDE3: the third evolution step of generalized differential evolution. In: IEEE Congress on Evolutionary Computation, pp. 443–450. IEEE (2005). DOI 10.1109/CEC.2005.1554717
23. Kumari, A.C., Srinivas, K., Gupta, M.: Software module clustering using a hyper-heuristic based multi-objective genetic algorithm. In: IEEE 3rd International Advance Computing Conference (IACC), pp. 813–818 (2013)
24. Li, Y.L., Zhan, Z.H., Gong, Y.J., Chen, W.N., Zhang, J., Li, Y.: Differential evolution with an evolution path: a deep evolutionary algorithm. IEEE Trans. Cybern. 45(9), 1798–1810 (2015). DOI 10.1109/TCYB.2014.2360752
25. Lodi, A., Martello, S., Monaci, M.: Two-dimensional packing problems: A survey. European Journal of Operational Research 141(2), 241–252 (2002)
26. López-Camacho, E.: An evolutionary framework for producing hyper-heuristics for solving the 2D irregular bin packing problem. Ph.D. thesis, Tecnológico de Monterrey (2012)
27. López-Camacho, E., Ochoa, G., Terashima-Marín, H., Burke, E.K.: An effective heuristic for the two-dimensional irregular bin packing problem. Annals of Operations Research 206(1), 241–264 (2013).
DOI 10.1007/s10479-013-1341-4
28. López-Camacho, E., Terashima-Marín, H., Ochoa, G., Conant-Pablos, S.E.: Understanding the structure of bin packing problems through principal component analysis. International Journal of Production Economics. Special Issue on Cutting and Packing, pp. 488–499 (2013). DOI 10.1016/j.ijpe.2013.04.041
29. López-Camacho, E., Terashima-Marín, H., Ross, P.: Defining a problem-state representation with data mining within a hyper-heuristic model which solves 2D irregular bin packing problems. In: Advances in Artificial Intelligence IBERAMIA, Lecture Notes in Computer Science, vol. 6433, pp. 204–213 (2010). DOI 10.1007/978-3-642-16952-6_21
30. López-Camacho, E., Terashima-Marín, H., Ross, P., Ochoa, G.: A unified hyper-heuristic framework for solving bin packing problems. Expert Systems with Applications 41(15), 6876–6889 (2014). DOI 10.1016/j.eswa.2014.04.043
31. Maashi, M., Özcan, E., Kendall, G.: A multi-objective hyper-heuristic based on choice function. Expert Systems with Applications 41(9), 4475–4493 (2014). DOI 10.1016/j.eswa.2013.12.050
32. Martinez-Sykora, A., Alvarez-Valdes, R., Bennell, J.A., Ruiz, R., Tamarit, J.M.: Matheuristics for the irregular bin packing problem with free rotations. European Journal of Operational Research 258(2), 440–455 (2017)
33. Okano, H.: A scanline-based algorithm for the 2D free-form bin packing problem. Journal of the Operations Research Society of Japan 45(2), 145–161 (2002)
34. Pappa, G.L., Ochoa, G., Hyde, M., Freitas, A.A., Woodward, J., Swan, J.: Contrasting meta-learning and hyper-heuristic research: the role of evolutionary algorithms. Genetic Programming and Evolvable Machines 15(1), 3–35 (2014). DOI 10.1007/s10710-013-9186-9
35. Rafique, A.F.: Multiobjective hyper heuristic scheme for system design and optimization. In: 9th International Conference on Mathematical Problems in Engineering, Aerospace and Science, ICNPAA 2012, pp.
764–769 (2012). DOI 10.1063/1.4765574
36. Ren, Z., Jiang, H., Xuan, J., Hu, Y., Luo, Z.: New insights into diversification of hyper-heuristics. IEEE Trans. Cybern. 44(10), 1747–1761 (2014). DOI 10.1109/TCYB.2013.2294185
37. Ross, P.: Hyper-heuristics. In: E.K. Burke, G. Kendall (eds.) Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques: Second Edition, pp. 611–638. Springer, New York (2014). DOI 10.1007/978-1-4614-6940-7_20
38. Sabar, N.R., Ayob, M., Kendall, G., Qu, R.: A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems. IEEE Trans. Cybern. 45(2), 217–228 (2015). DOI 10.1109/TCYB.2014.2323936
39. Sim, K., Hart, E., Paechter, B.: A lifelong learning hyper-heuristic method for bin packing. Evolutionary Computation 23(1), 37–67 (2015)
40. Terashima-Marín, H., Ross, P., Farías-Zárate, C.J., López-Camacho, E., Valenzuela-Rendón, M.: Generalized hyper-heuristics for solving 2D regular and irregular packing problems. Annals of Operations Research 179(1), 369–392 (2010). DOI 10.1007/s10479-008-0475-2
41. Van Veldhuizen, D.A., Lamont, G.B.: Multiobjective evolutionary algorithm test suites. In: Proceedings of the 1999 ACM Symposium on Applied Computing, pp. 351–357. ACM (1999). DOI 10.1145/298151.298382
42. Vázquez Rodríguez, J., Petrovic, S., Salhi, A.: An investigation of hyper-heuristic search spaces. In: IEEE Congress on Evolutionary Computation, pp. 3776–3783 (2007). DOI 10.1109/CEC.2007.4424962
43. Vázquez-Rodríguez, J.A., Petrovic, S.: A mixture experiments multi-objective hyper-heuristic. Journal of the Operational Research Society 64(11), 1664–1675 (2013). DOI 10.1057/jors.2012.125
44. Veerapen, N., Landa-Silva, D., Gandibleux, X.: Hyper-heuristic as component of a multi-objective metaheuristic. In: Proceedings of the Doctoral Symposium Engineering Stochastic Local Search Algorithms, no. TR/IRIDIA/2009-024 in IRIDIA, pp. 51–55 (2009)
45.
Wäscher, G., Haußner, H., Schumann, H.: An improved typology of cutting and packing problems. European Journal of Operational Research. Special Issue on Cutting, Packing and Related Problems 183(3), 1109–1130 (2007). DOI 10.1016/j.ejor.2005.12.047
46. Xia, H., Zhuang, J., Yu, D.: Combining crowding estimation in objective and decision space with multiple selection and search strategies for multi-objective evolutionary optimization. IEEE Trans. Cybern. 44(3), 378–393 (2014). DOI 10.1109/TCYB.2013.2256418
47. Zitzler, E., Künzli, S.: Indicator-based selection in multiobjective search. PPSN, Lecture Notes in Computer Science 3242(1), 832–842 (2004)
48. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto evolutionary algorithm for multiobjective optimization. In: Evolutionary Methods for Design, Optimisation and Control with Application to Industrial Problems. Proceedings of the EUROGEN2001 Conference, Athens, Greece, September 19-21, 2001, pp. 95–100 (2002)
49. Zitzler, E., Thiele, L.: Multiobjective optimization using evolutionary algorithms - A comparative case study, Lecture Notes in Computer Science, vol. 1498, chap. 29, pp. 292–301. Springer Berlin Heidelberg (1998). DOI 10.1007/BFb0056872
50. Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 3(4), 257–271 (1999). DOI 10.1109/4235.797969