1. INTRODUCTION
The significant technological growth of today's world, rapid changes in products, and extreme competition in production have made highly efficient and flexible production systems more necessary than ever before. In addition, organizations seek to increase their efficiency and optimize production operations. Under these circumstances, determining the sequence and schedule of work is a necessity for survival in the market; moreover, it can optimize the use of available resources. In general, scheduling is the assignment of a set of tasks to a set of machines, and an operation sequence is the order in which tasks must be processed on a given machine (Pinedo, 2008). Different objective functions have been used in the literature to model such problems. While most studies have considered only one criterion as the main objective, several factors simultaneously affect the performance of a real production system. Multiobjective modeling therefore allows managers to optimize their production process in terms of both time and cost and to increase production efficiency. Machine scheduling is a rich and active area of research with several applications in production. In particular, the parallel machine environment is one of the most important scheduling environments, in which each work station encompasses a minimum of two identical machines. Parallel machines are able to perform the same operations with different capabilities and capacities, which increases the system's flexibility and capacity. Scheduling in this environment offers many advantages, including a lower risk of production stoppage: with only one machine in a work station, machine failure halts the production line. Furthermore, adding machines to the production line can decrease costs if scheduling on a single machine is too costly.
Increasing the number of machines can also reduce the level of earliness or tardiness. In these situations, the cost of purchasing or maintaining machines is added to the problem. However, by solving the optimization problem, we can determine what type of machines to use and which tasks will be done on each machine in what sequence.
Over the past decade, the performance of batch operations has attracted the attention of many industries toward batch processors, with the goal of increasing the efficiency of production systems thanks to their speed and cost-effectiveness. Therefore, production systems have moved toward batch-order production. Since batch processors process a number of tasks in batches simultaneously, tasks must be assigned to batches before being processed on the machines, such that the total size of the tasks in a batch is less than or equal to the machine's capacity and at least one batch is assigned to each machine. After this stage, batches are assigned to machines as they become idle. In addition, a numerical capacity is defined for each batch, and since each task has a specific size determined by the number of units of a product ordered by a customer, the total size of all tasks assigned to a batch should not exceed the capacity of that batch. After processing, each batch leaves the station and is replaced by the next batch. Each batch has a start time at which all of its tasks start, and after the batch is processed, all completed tasks leave the batch at the same time. Therefore, the batch completion and the exit of its tasks occur at a single moment; batch interruption is not possible during processing, and no task can be added to or removed from the batch in this period. As such, the present study focuses on scheduling and sequencing tasks on machines. Our findings could play a significant role in improving the performance of production systems.
2. LITERATURE REVIEW
Most studies conducted in the field of batch-processing parallel machines have considered identical parallel machines. In addition, maximum makespan and maximum delayed work have been the most studied objective functions. In this respect, Damodaran and Vélez-Gallego (2010) applied heuristic approaches for makespan minimization on parallel batch processing machines. In another research (Damodaran et al., 2011), the same scholars evaluated the minimization of production makespan by assuming nonzero task ready times, proposing the greedy randomized adaptive search procedure (GRASP) for solving the problem. Moreover, Cheng et al. (2013) presented a novel ant colony optimization approach, in which the main criterion for selecting the paths of ants is designed to overcome premature convergence. Kashan et al. (2008) offered a hybrid genetic heuristic (HGH) to minimize the production makespan; in addition, they compared the efficiency of their algorithm with simulated annealing. Sergeevna et al. (2021) presented a genetic algorithm (GA) with a new genome encoding procedure for their problem. Damodaran and Vélez-Gallego (2012) proposed a simulated annealing (SA) algorithm to minimize the production makespan. They applied random samples to compare the results of SA with a lower bound, a modified delay (MD) heuristic, and GRASP.
Reichelt and Mönch (2006) evaluated the objectives of minimizing the total weighted tardiness and production makespan. In this regard, they proposed a multiobjective GA to find Pareto solutions. In another study, Li et al. (2009) considered the simultaneous minimization of total weighted tardiness and production makespan, for which they suggested an ant colony optimization and used simulation tests to choose the algorithm's parameters and demonstrate the efficiency of the proposed method. Pahlevan et al. (2021) proposed a number of heuristics to minimize the production makespan and compared their efficiency with an SA and CPLEX. Chang et al. (2004) considered production makespan minimization, for which they proposed an SA. Mönch et al. (2006) evaluated the minimization of total weighted tardiness and proposed a heuristic based on apparent tardiness cost rules. In another research, Xu et al. (2013) assessed a biobjective problem of minimizing the production makespan and the maximum tardiness, for which they proposed a metaheuristic based on a multiobjective ant colony optimization framework to find Pareto solutions. Majumder and Laha (2018) evaluated the problem of makespan minimization by using a cuckoo optimization algorithm. In another study, Zhou et al. (2018) considered the minimization of makespan and of total electricity consumption cost, as a factor affecting the environment, in the presence of dynamic job arrivals. They proposed a multiobjective differential evolution algorithm to solve the model and compared the computational results with two other metaheuristics. Few studies have been conducted on unrelated parallel batch processors, most of which have used a heuristic to minimize a single objective. Li et al. (2013) evaluated the minimization of makespan for unrelated machines and proposed a heuristic based on first-fit and best-fit earliest job-ready time rules, where a lower bound was used to assess the quality of solutions.
Damodaran et al. (2012) presented a particle swarm optimization (PSO) for non-identical machines to minimize the production makespan; the algorithm's results were compared to a GA and CPLEX. Kim et al. (2003) proposed four heuristics, namely the earliest weighted due date, the shortest weighted processing time, a two-level batch scheduling heuristic, and the SA method, for unrelated machines with the goal of minimizing the total weighted tardiness. Hulett and Damodaran (2015) offered a PSO for the problem of minimizing the total weighted tardiness. In another research, Suhaimi et al. (2016) considered makespan minimization using the Lagrangian approach and compared the computational results with those obtained from metaheuristic approaches applied in previous studies. Jiang et al. (2017) evaluated the production makespan minimization problem for identical machines by considering batch transportation, for which they proposed a hybrid DPSO-GA algorithm. Hulett and Damodaran (2017) solved the problem of minimizing the total weighted tardiness using the artificial bee colony (ABC) algorithm and compared the results with PSO results. In another study, Hulett et al. (2017) solved the problem of minimizing the total weighted tardiness of tasks by using PSO and proposed a heuristic for allocating tasks to batches and assigning batches to machines. Moreover, Zhou et al. (2018) evaluated the minimization of production makespan for unrelated machines by considering arbitrary job sizes and clearance times. They proposed a GA-based heuristic along with a lower bound for the problem.
According to the literature review presented above, no research has evaluated the problem of unrelated parallel batch machines with the simultaneous minimization of several objectives solved by a multiobjective harmony search metaheuristic. Therefore, the present study designs and develops a model for the simultaneous minimization of production makespan, machine purchasing costs, tardiness penalties, and earliness costs of tasks on unrelated parallel batch machines. Ultimately, a metaheuristic based on the harmony search (HS) algorithm is proposed.
3. MATERIALS AND METHODS
As mentioned, batch processors process tasks in batches. The processing time of batches can be either fixed or variable; in problems with variable batch processing time, the processing time depends on the tasks that form the batches, while in problems with fixed batch processing time, it is independent of the tasks in the batch and has a fixed value (Damodaran et al., 2009). In the present research, we consider the batch processing time variable. Since each task has its own processing time, the longest processing time among the tasks in the batch determines the batch processing time when the batch processing time is variable. Each task has an accessibility time at which it can be placed in a batch and its processing can begin, which plays a determining role in the start time of processing each batch. In other words, the task that reaches the batch last, i.e., the one with the longest accessibility time, determines the batch processing start time. Each batch needs to be prepared before being processed on a machine, which takes what is known as the batch preparation time. The longest task preparation time in each batch determines the preparation time of that batch.
Assume that at the beginning of the scheduling period, there are n tasks characterized by processing time Pj, accessibility time rj, and task size Sj, which is equal to the amount of the customer order for the jth task. The tasks enter the station based on their accessibility times and are classified into b batches of equal capacity S task units before being processed on the machines. Notably, the tasks are assigned to machines such that the total size of the tasks in a batch is less than or equal to the machine capacity. In this station, there are m parallel machines, whose capacities differ and are equal to S_{i}. Because the machines are unrelated, the processors have different capacities and speeds; therefore, the preparation and processing times of batches on different machines will differ. For example, the processing and preparation times of the kth batch on the ith machine are equal to P_{ik} and RT_{ik}, respectively. Each performance measure of the scheduling problem, such as makespan, task earliness, and task tardiness, is then calculated using the formed batches and their processing times. For instance, the maximum makespan in these problems is the makespan of the last batch on the last machine. After the tasks in a batch are completed, the makespan of a task equals the makespan of the batch that includes it. In other words, the due date of a task is compared to the makespan of its batch in order to estimate the earliness of the task. Accordingly, the aim is to determine the optimal batch formation, assign batches to machines, and determine the optimal batch sequence on machines in order to obtain the Pareto solutions of the two objectives of minimizing the makespan of the last batch on the last machine and minimizing the task tardiness and earliness costs and machine purchasing costs.
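For illustration, the batch-attribute rules described above (longest processing time, latest accessibility time, longest preparation time, total occupied size) can be sketched as follows; this is an illustrative sketch, not the implementation used in this study, and the field names and values are assumptions:

```python
# Sketch: deriving batch attributes from its member tasks under the
# variable-batch-processing-time rules described above (names illustrative).

def batch_attributes(tasks):
    """tasks: list of dicts with processing time p, accessibility time r,
    preparation time rt, and size s."""
    return {
        # batch processing time = longest task processing time in the batch
        "processing_time": max(t["p"] for t in tasks),
        # batch ready time = latest task accessibility time
        "ready_time": max(t["r"] for t in tasks),
        # batch preparation time = longest task preparation time
        "prep_time": max(t["rt"] for t in tasks),
        # occupied capacity = total size of the tasks in the batch
        "load": sum(t["s"] for t in tasks),
    }

batch = [
    {"p": 4, "r": 0, "rt": 1, "s": 3},
    {"p": 7, "r": 2, "rt": 2, "s": 5},
]
attrs = batch_attributes(batch)
# processing_time 7, ready_time 2, prep_time 2, load 8
```

The batch can then start no earlier than its ready time, and its completion time is its start time plus its processing time, as formalized in the model below.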
3.1 Premises

The time of processing, preparation and accessibility of tasks are definite and independent of scheduling and sequencing of the tasks.

Batch discontinuation during processing is not possible (discontinuation is not allowed).

All machines are batch processors, and each machine can process only one batch at every moment.

Each batch is processed on only one of the parallel machines and then exits the station.

All machines are accessible at the start of the scheduling time.

There is a minimum of two and a maximum of a certain number of machines.

Tasks cannot be added to or removed from a batch during processing.

The total number of tasks assigned to a processor is not greater than the machine’s capacity.

Simultaneous failure of machines is not possible.

There is no priority regarding batch processing.

Each machine is continuously accessible throughout the scheduling period and will not be idle as long as there is a task.

It is assumed that no task's size is larger than the capacity of the largest batch.

Tasks have different earliness and tardiness penalties. Overall, the tardiness penalty is larger than the earliness penalty.

The initiation time of each batch is not considered separately and is part of the batch processing time.
3.2 Mathematical Model
In this article, we consider the problem of machine scheduling with the two objectives of minimizing the production makespan and simultaneously minimizing machine purchasing costs and earliness and tardiness penalties. Given the significant role of the reduction of these costs in the improvement of corporate performance, a multiobjective nonlinear mixed-integer programming model is proposed that considers preparation times, based on the models proposed by Xu et al. (2013) and Li et al. (2013). We also evaluate the optimal allocation of batches to parallel machines in an unrelated parallel-machine station with m batch processors. Considering the mentioned issues, this is the first article in the field of batch-processor machine scheduling that proposes a model to simultaneously minimize four performance measures in a parallel-machine station with m machines (m≥2). In this section, we present the model, its parameters, and the problem variables:
Indices:

i : Machine index (i = 1, …, m)

j : Task index (j = 1, …, n)

k : Batch index (k = 1, …, b)

Parameters:

n : Number of tasks

m : Number of machines

b : Number of batches

S : Maximum capacity of batches

Pj : Processing time of the jth task

Sj : Size of the jth task

dj : Due date of the jth task

θj : Cost coefficient per unit of earliness of the jth task

βj : Penalty coefficient per unit of tardiness of the jth task

Rj : The time when the jth task becomes accessible; its processing can be initiated from this moment

RTj : Time required to prepare the jth task

γi : The cost of purchasing the ith machine

Si : Maximum capacity of the ith machine

B : Maximum budget for purchasing all machines

M : A big number

Variables:

P_{ik} : The processing time of the kth batch on the ith machine

Rt_{ik} : The preparation time of the kth batch on the ith machine

St_{ik} : The processing start time of the kth batch on the ith machine

C_{ik} : The makespan (completion time) of the kth batch on the ith machine

T_{j} : The amount of tardiness of the jth task

E_{j} : The amount of earliness of the jth task

C_{max} : Maximum makespan of tasks

x_{ijk} : Binary decision variable; 1 if the jth task is allocated to the kth batch and processed on the ith machine; otherwise, 0

y_{ik} : Binary decision variable; 1 if the kth batch is processed on the ith machine; otherwise, 0

L_{i} : Binary decision variable; 1 if the ith machine is purchased; otherwise, 0
The first objective function is the minimization of the maximum makespan of all tasks, which is expressed in Equation 1. Equation 2 shows the second objective function of the problem, which is minimizing the total tardiness penalty, earliness costs, and machine purchasing costs. Equation 3 guarantees that each task is assigned to only one batch and the mentioned batch is processed on only one machine. Equation 4 ensures that each batch is allocated to only one machine and batch interruption is not possible. Equation 5 expresses that if a machine is selected, at least one batch must be assigned to it. Equation 6 shows that at least one task should be allocated to a batch that is allocated to a machine; the lack of allocation of a batch to a machine means that no task is assigned to the batch. Equation 7 expresses the machine capacity constraint, guaranteeing that the total size of the tasks allocated to a machine should not exceed the machine capacity. Equation 8 shows the processing time of the kth batch on the ith machine, which is equal to the longest processing time among all processing times of tasks in the batch. Equation 9 expresses the preparation time of the kth batch on the ith machine, which is equal to the longest preparation time among all preparation times of tasks in the batch. Equation 10 defines the start time of the kth batch on the ith machine, which is equal to the longest accessibility time among all accessibility times of tasks in the batch or to the sum of the makespan of the previous batch and the preparation time of the same batch on the same machine. Equation 11 shows the makespan of the kth batch on the ith machine, which is equal to the sum of the start time of the kth batch and the processing time of the kth batch on the ith machine. Equation 12 guarantees that the total cost of all purchased machines is lower than the maximum budget. Equation 13 expresses the amount of tardiness of the jth task.
On the other hand, the amount of earliness of the jth task is shown in Equation 14. Furthermore, Equation 15 guarantees that the total makespan of the tasks is equal to the longest makespan of various batches on all machines. Equation 16 expresses the maximum capacity constraint for all batches, according to which the total size of tasks in a batch should not exceed the capacity of the batch. Finally, Equation 17 determines whether the decision variables are zero or one.
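Since the equations themselves are not reproduced in this excerpt, the following is a plausible reconstruction of the model from the prose descriptions above, using the notation defined in the parameter and variable lists; the exact form and linearization in the original paper may differ:

```latex
\min \; C_{\max} \tag{1}
\min \; \sum_{i} \gamma_i L_i + \sum_{j}\bigl(\theta_j E_j + \beta_j T_j\bigr) \tag{2}
\sum_{i}\sum_{k} x_{ijk} = 1 \quad \forall j \tag{3}
\sum_{i} y_{ik} \le 1 \quad \forall k \tag{4}
\sum_{k} y_{ik} \ge L_i \quad \forall i \tag{5}
x_{ijk} \le y_{ik} \quad \forall i,j,k \tag{6}
\sum_{j} S_j\, x_{ijk} \le S_i \quad \forall i,k \tag{7}
P_{ik} \ge P_j\, x_{ijk} \quad \forall i,j,k \tag{8}
Rt_{ik} \ge RT_j\, x_{ijk} \quad \forall i,j,k \tag{9}
St_{ik} \ge \max\bigl\{R_j\, x_{ijk},\; C_{i,k-1} + Rt_{ik}\bigr\} \quad \forall i,j,k \tag{10}
C_{ik} = St_{ik} + P_{ik} \quad \forall i,k \tag{11}
\sum_{i} \gamma_i L_i \le B \tag{12}
T_j \ge \sum_{i}\sum_{k} C_{ik}\, x_{ijk} - d_j,\;\; T_j \ge 0 \quad \forall j \tag{13}
E_j \ge d_j - \sum_{i}\sum_{k} C_{ik}\, x_{ijk},\;\; E_j \ge 0 \quad \forall j \tag{14}
C_{\max} \ge C_{ik} \quad \forall i,k \tag{15}
\sum_{j} S_j\, x_{ijk} \le S \quad \forall i,k \tag{16}
x_{ijk},\, y_{ik},\, L_i \in \{0,1\} \tag{17}
```

Note that the products of the continuous variable C_{ik} with the binary variable x_{ijk} in Equations 13–15 make the model nonlinear, consistent with its description as a nonlinear mixed-integer program.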
4. PROBLEMSOLVING APPROACHES
The problem $P_m \| C_{\max}$ is a classic type of parallel machine scheduling problem and is NP-hard (Qazani et al., 2019). On the other hand, the $1|\text{batch}|\sum WT_j + \sum WE_j$ and $1|\text{batch}|C_{\max}$ problems, which are the classic forms of batch-production problems in a single-machine environment, are also NP-hard (Pinedo, 2008). Since they are special cases of $P_m|r_j, s_j, S_i, p_{ik}, B, \text{p-batch}|C_{\max}, \text{cost}$, which is the problem considered in the present research, our problem is NP-hard as well. Different approaches can be used to find an exact solution for small-scale problems. Large-scale problems, however, cannot be solved by methods such as branch and bound and dynamic programming in polynomial time. A multiobjective HS algorithm is used in the present study due to the biobjective nature of the problem. Since the concept of dominance provides the basis for comparing solutions with multiple objectives, most multiobjective optimization methods use this concept to search for nondominated solutions. In multiobjective optimization, a set of high-quality solutions, called the Pareto optimal set, is often obtained instead of a unique optimal solution. In a multiobjective problem, Pareto solutions are a set of nondominated points, i.e., points dominated by no other solution (Deb et al., 2002). Pareto solutions do not take precedence over one another, and each of them can be considered an optimal solution depending on the situation. The epsilon-constraint method is one of the approaches used for the exact solving of multiobjective problems. In this approach, only one objective is optimized at each stage, and the rest of the objectives are treated as upper-bound constraints.
This approach converts multiobjective optimization problems into singleobjective ones.
The epsilon-constraint method takes the following steps:

Each time, one of the objective functions is selected as the main objective, based on which the problem is solved and the optimal value of the objective function is obtained.

The interval between two optimal values of secondary objective functions is divided into a predetermined amount, and a table of values is obtained for ${\epsilon}_{2},\dots {\epsilon}_{n}$.

Each time, the problem is solved with the main objective function with each of the ${\epsilon}_{2},\dots {\epsilon}_{n}$ values. By doing so, Pareto solutions are obtained (Tavakkoli Moghaddam et al., 2011).
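The loop above can be sketched on a toy discrete biobjective problem; this is an illustrative sketch only (the enumeration stands in for a real solver, and the objectives are not the paper's model):

```python
# Sketch of the epsilon-constraint loop on a toy discrete bi-objective
# problem: minimize f1 subject to f2 <= eps over a grid of eps values.

def epsilon_constraint(solutions, f1, f2, steps=4):
    """Collect the distinct constrained optima as an approximate Pareto set."""
    lo = min(f2(s) for s in solutions)      # best value of the secondary objective
    hi = f2(min(solutions, key=f1))         # f2 at the f1-optimum
    pareto = []
    for i in range(steps + 1):
        eps = lo + (hi - lo) * i / steps    # grid point for the epsilon bound
        feasible = [s for s in solutions if f2(s) <= eps]
        best = min(feasible, key=f1)        # optimize the main objective only
        if best not in pareto:
            pareto.append(best)
    return pareto

# Toy instance: points (a, b); both coordinates are to be minimized.
pts = [(1, 9), (2, 6), (4, 4), (7, 2), (9, 1), (8, 8)]
front = epsilon_constraint(pts, f1=lambda s: s[0], f2=lambda s: s[1])
# front -> [(9, 1), (7, 2), (4, 4), (2, 6), (1, 9)]
```

Each pass converts the biobjective problem into a single-objective one, exactly as described in the steps above; varying the bound traces out the Pareto front.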
4.1 HS Algorithm
Geem et al. (2001) developed the HS algorithm, which mimics the behavior of music players in an improvisation process. As a robust algorithm among the newer metaheuristics, it draws an analogy between finding an optimal solution to a problem and the process of musical improvisation: the qualitative improvisation process is converted into a quantitative optimization process.
4.2 Multiobjective HS Algorithm
Assume that k harmonies are created by n music players. The harmony memory size (HMS) is equal to k harmonies. Therefore, we consider a matrix with k rows (the number of harmonies memorized by the music players) and n+1 columns, comprising n columns for the music players (the number of variables affecting the problem) and one column for the harmony value (f(x)). The algorithm comprises three main stages: 1) initializing the harmony memory and algorithm parameters, 2) creating a new harmony vector, and 3) updating the harmony memory. In the first stage, a primary generation of harmony vectors is created randomly and stored in the harmony memory. The components of each solution vector must be in their respective ranges. The HS algorithm parameters are the HMS (the number of solution vectors in memory), the harmony memory consideration rate (HMCR), the pitch adjusting rate (PAR), the bandwidth, and the maximum number of improvisations. In the second stage, a new harmony vector is created using three operators. Each new harmony can be formed by a change in the notes of previous harmonies, i.e., the music player slightly changes the notes. A music player can play his instrument in three ways: 1) playing some tune from memory, 2) playing a similar tune from memory by slightly adjusting the pitches, and 3) playing a random note. These three modes are the three qualitative operators of this algorithm, and the improvisation process is a combination of them. Therefore, a new harmony vector can be created in one of three ways:

Considering the notes existing in the memory: if rand, a random value between zero and one drawn from the uniform probability distribution, is smaller than the HMCR, the value of the ith component of the new solution is taken from the ith component of a solution randomly chosen from the memory; otherwise, the value of the component is randomly assigned from the specified range of that component according to the third rule, the rule of random selection. This operator guarantees that the best harmonies remain in the memory during the optimization process, and it is controlled by the HMCR rate. Smaller rates lead to slow convergence of the algorithm due to the low number of selected harmonies. On the other hand, larger rates mean that mostly the harmonies existing in the memory are selected, and the algorithm falls into the trap of local optima. The application of this operator is similar to that of the elitism operator in the genetic algorithm.

Pitch adjusting rate (PAR): pitch adjustment changes the structure of the frequencies, and the PAR is used to control this operator. The second rule of the algorithm is applied only after the first rule has been applied; according to it, a change is made in the value of the new solution component if the value of rand is lower than the PAR. The higher the value of this rate, the higher the diversity in the algorithm, to the point where the algorithm acts as a random search method. Using this operator is equivalent to generating neighboring solutions in optimization algorithms and is similar to the mutation operator in the GA.

Random selection: this operator generates solutions randomly and enters them into the population.
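The three operators above can be sketched as a single improvisation step; this is an illustrative sketch under assumed parameter values, not the tuned implementation used in this study:

```python
import random

# Sketch of the three HS improvisation operators: memory consideration
# (HMCR), pitch adjustment (PAR, bandwidth bw), and random selection.
# Parameter names follow the text; the numeric values are illustrative.

def improvise(memory, hmcr=0.9, par=0.3, bw=0.05, lo=0.0, hi=1.0):
    """memory: list of harmony vectors (lists of floats in [lo, hi])."""
    n = len(memory[0])
    new = []
    for i in range(n):
        if random.random() < hmcr:
            # 1) take the i-th component from a harmony stored in memory
            x = random.choice(memory)[i]
            if random.random() < par:
                # 2) pitch adjustment: small random shift within bandwidth bw
                x += random.uniform(-bw, bw)
        else:
            # 3) random selection from the component's full range
            x = random.uniform(lo, hi)
        new.append(min(hi, max(lo, x)))  # keep the component in range
    return new

random.seed(0)
hm = [[random.random() for _ in range(5)] for _ in range(3)]
harmony = improvise(hm)  # one new candidate harmony vector
```

The HMCR governs how often stored harmonies are reused, while the PAR and bandwidth control local perturbation, matching the elitism and mutation analogies drawn above.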
The harmony memory is updated in the third stage. The new harmony vectors, created as explained above, together with the existing harmony memory form 2×HMS solution vectors (the composite harmony memory). Afterwards, nondominated sorting and ranking of the vectors are carried out.
In the present study, we use the nondominated sorting and ranking proposed by Deb et al. (2002) to find optimal Pareto solutions. When all solution vectors in the composite harmony memory have been ranked, a diversity measure, the crowding distance, is assigned to the solution vectors of each nondominated front. The crowding distance expresses how densely a particular solution vector is surrounded by other solution vectors: it is the average distance between the two solution vectors on either side of a solution vector along each objective function. The best harmony memory (of size HMS) is then selected from the composite harmony memory; the last nondominated front needed to fill the HMS solution vectors is truncated by choosing the best solutions with the crowding comparison operator (Sivasubramani and Swarup, 2011).
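The distance assignment described above can be sketched as follows, following the scheme of Deb et al. (2002); the objective values in the example are illustrative:

```python
# Sketch of the crowding-distance assignment for one non-dominated front.
# front: list of objective-value tuples; boundary points get infinite distance.

def crowding_distance(front):
    n = len(front)
    dist = [0.0] * n
    for m in range(len(front[0])):  # iterate over the objectives
        order = sorted(range(n), key=lambda i: front[i][m])
        dist[order[0]] = dist[order[-1]] = float("inf")  # boundary points
        span = front[order[-1]][m] - front[order[0]][m]
        if span == 0:
            continue
        for j in range(1, n - 1):
            # normalized distance between the two neighbours on this objective
            dist[order[j]] += (front[order[j + 1]][m] - front[order[j - 1]][m]) / span
    return dist

d = crowding_distance([(1, 9), (4, 4), (9, 1)])
# boundary points are infinite; the middle point gets a finite distance
```

Solutions with larger crowding distance lie in sparser regions of the front and are therefore preferred by the crowding comparison operator when the memory is truncated.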
4.3 Proposed Algorithm Solution Presentation
Solutions are represented in the algorithm as a set of two vectors. The first vector encompasses n components, where n is the number of tasks, and the number inside each cell indicates the batch to which that task is assigned. The second vector encompasses b components, where b is the number of batches, and the number inside each cell indicates the machine to which that batch is assigned. Each component is obtained from random values in the range [0,1], which are converted to a permutation of tasks with the largest position value (LPV) rule.
4.4 Experiment Design and Taguchi Parameter Tuning
Experiment design is a powerful statistical technique to improve quality and reduce cost in process or product design. The technique comprises a set of experiments that purposefully change the input variables of the process so that probable changes in the output can be observed and identified (Montgomery, 2001). The Taguchi method is applied in optimization problems more than other experiment design techniques. This method was developed by Dr. Genichi Taguchi, a Japanese quality engineer working in the field of experimental design, in the 1940s. The method uses orthogonal arrays and fractional factorial designs to generate experiments. Although its results are not better than those obtained from the complete factorial method, its advantage is that the best combination of parameter values for the problem is obtained by performing a smaller number of experiments. Taguchi's philosophy is based on reducing the variability of the value of the objective function (which characterizes the performance of each product or process). Taguchi modeled possible deviations from the target value with a loss function, in which the loss is the cost of the deviation of the product's quality characteristics from the nominal characteristics, a cost borne by the consumer. Taguchi considers the quadratic loss function of Equation (21).
Changes are analyzed using signal-to-noise (S/N) ratios, which serve as the response variable in this method. In each test run, the value of the objective function is converted to an S/N ratio according to the Taguchi method, which is derived from the quadratic loss function. Three standard cases of the S/N ratio are widely used, and the appropriate one is applied depending on the requirements of each study. Since the purpose of each run in this paper is to minimize the objective functions, the smaller-the-better S/N ratio is selected according to Equation (22).
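Since Equations (21) and (22) are not reproduced in this excerpt, the standard Taguchi forms they refer to are, plausibly, the quadratic loss and the smaller-the-better S/N ratio; the exact notation in the original paper may differ:

```latex
L(y) = k\,(y - m)^2 \tag{21}
\frac{S}{N} = -10 \log_{10}\!\left(\frac{1}{n}\sum_{i=1}^{n} y_i^{2}\right) \tag{22}
```

Here $y$ is the observed response, $m$ the target value, $k$ the loss coefficient, and $y_1,\dots,y_n$ the responses of the $n$ replications of a run; minimizing the response maximizes the S/N ratio.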
The Taguchi experiment design steps include:

1) The evaluated qualitative factors are determined.

2) The number of levels in each factor is determined.

3) The proper array is selected.

4) Candidate factor levels are assigned to the columns of the array.

5) Experiments are performed based on the array.

6) The experiment results are turned into S/N and the optimal level of each factor is found.

7) Experiments are carried out based on the optimal factors.
There are four factors for the MOHS algorithm and three levels for each. With a complete factorial scheme, 3^4×10×2 (equivalent to 1620) experiments are required to solve the model. Therefore, the implementation of this experiment design approach is not cost-effective. Meanwhile, Taguchi's standardized orthogonal array designs effectively save time and money with a much smaller number of experiments. In the first step, the number of degrees of freedom required is calculated in order to fit the appropriate orthogonal array. In this case, there is one degree of freedom for the total mean and two degrees of freedom for each three-level factor. Therefore, the sum of the required degrees of freedom, which equals the minimum number of rows of the selected array, is (2 × 4) + 1 = 9. The L9 array is selected according to the Taguchi orthogonal arrays. A disadvantage of such an array is that the number of its columns may not match the number of parameters of the desired algorithm; when there are fewer three-level factors than columns, a column of the array must be eliminated, and the array can be matched to the existing problem by using matching techniques (Park, 1995). An important point in correcting an array is maintaining its orthogonality. With this method, the total number of algorithm executions is 9×20=180, meaning that 1,440 runs (1620−180) are saved in time and money. Appropriate parameters are chosen for the algorithm, and the best of them are selected based on this technique (Table 1).
4.5 Sample Problem Generation
In this article, 15 sample problems are generated with different numbers of tasks and various batch processors based on Table 2, following the sample problems presented by Damodaran et al. (2009). Since two types of processing times, task sizes, and accessibility (clearance) times are considered, a total of 30 problems are created. In these samples, it is assumed that the batch capacity (S) is equal to a fixed value of 10 in the first-type samples and 20 in the second-type samples. Due dates are created based on the equation proposed by Tavakkoli-Moghaddam et al. (2007). In this equation, the mean processing time (P̄) must first be estimated.
Since machines do not directly deal with the number of tasks, and the number of batches must replace the number of tasks in batch processing, the mean number of formable batches (B) replaces the number of tasks in calculating the mean processing time. Given that batch capacity is filled according to the task sizes, the number of formable batches is determined by considering the task sizes and calculating the total task size. According to the experiments performed in this field, an average of 0.8 of the capacity of each batch is occupied among the formed batches. Therefore, the total task size is divided by 0.8 of the batch capacity to calculate the expected number of batches. By doing so, the expected number of batches replaces the number of tasks in Equation 24. The desired equation is shown in the second column of Table 3. Since the production environment used in the problem of Tavakkoli-Moghaddam et al. (2007) is a flow shop, the mentioned equation is changed and corrected to adapt it to the parallel-machine environment; the adapted equation is described in the third column of Table 3. After the calculations, the equation in the fourth column of Table 3 is used for creating due dates. Moreover, the earliness costs, task tardiness penalties, and the purchasing cost of each machine type i are generated from uniform distributions for all sample problems based on Table 4.
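As a concrete illustration of the batch-count estimate described above (the task sizes and capacity are made-up values, not those of the sample problems):

```python
# Sketch: estimating the expected number of formable batches, assuming
# an average batch fill of 0.8 of the capacity, as described above.
sizes = [3, 5, 2, 6, 4]                            # task sizes S_j (illustrative)
capacity = 10                                      # batch capacity S (illustrative)
expected_batches = sum(sizes) / (0.8 * capacity)   # total size / (0.8 * S) = 2.5
```

This expected batch count then stands in for the number of tasks when estimating the mean processing time used in the due-date equation.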
4.6 Choosing the Best Factor in the Proposed Algorithm
In this algorithm, the relative distance of consecutive non-dominated solutions is used as the criterion for comparing answers; the lower the value of this index, the more appropriate the corresponding factor level. The results of tuning the algorithm's parameters with the Taguchi design-of-experiments method, separately for small and large problems, are presented in Table 5, and the mean S/N ratio diagram for solving the model in small dimensions is shown in Figure 1.
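Since the tuning criterion is to be minimized, the Taguchi S/N ratio in Figure 1 is presumably of the smaller-the-better type, which can be computed as below. This is a generic sketch of that standard formula, not the paper's own code; the function name is hypothetical.

```python
import math

def sn_ratio_smaller_better(responses):
    """Taguchi signal-to-noise ratio for a smaller-the-better response:
    S/N = -10 * log10(mean(y_i^2)). A higher S/N indicates a better
    factor level, so the level maximizing S/N is selected."""
    mean_sq = sum(y * y for y in responses) / len(responses)
    return -10.0 * math.log10(mean_sq)
```

Each factor level's S/N ratio would be averaged over the replicated runs, and the level with the largest mean S/N chosen, as in a standard Taguchi analysis.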
5. RESULTS AND DISCUSSION
In this section, five small-size sample problems are created by selecting small numbers of tasks, machines, batches, and other parameters; the problem is solved with the epsilon-constraint approach in GAMS 23.5, and the results are compared with those of the proposed metaheuristic algorithm. The results are shown in Table 6, where NOS denotes the number of non-dominated solutions obtained and Time denotes the time elapsed in finding Pareto solutions. Obviously, more Pareto solutions and less solution time indicate the advantage of a method; Figures 2 and 3 compare these two indicators. According to the results, the solution time of the exact method increases significantly with the number of tasks and machines, such that no solution is obtained within 10000 seconds for problems of size 10×3, 15×4, 25×5, or larger. Therefore, given the NP-hard nature of the problem, the proposed algorithm is recommended for solving the model at larger scales in much shorter times, saving both time and money. The results in Table 6 show the high quality of the multi-objective HS algorithm compared with the exact solution: owing to its ability to find more solutions in much less time, the algorithm is remarkably efficient, especially on large-scale problems. To compare the indices of this algorithm more thoroughly, two types of sample problems are considered and the results are presented in Table 7. The NOS and Time indices alone are not sufficient to compare the two types of sample problems; therefore, for a more complete study of the obtained solutions, the dispersion index of non-dominated solutions and the relative distance of consecutive solutions are also used.
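The epsilon-constraint approach used for the exact solution above can be sketched on a toy bi-objective instance: minimize the first objective subject to a bound on the second, sweep the bound, and filter dominated points. This is a generic illustration on an enumerated solution set, not the GAMS formulation; the function name and data are hypothetical.

```python
def epsilon_constraint(solutions, eps_values):
    """Bi-objective epsilon-constraint sketch: for each bound eps,
    minimize f1 subject to f2 <= eps, then discard dominated points.
    Each solution is a tuple (f1, f2), both to be minimized."""
    front = set()
    for eps in eps_values:
        feasible = [s for s in solutions if s[1] <= eps]
        if feasible:
            front.add(min(feasible, key=lambda s: s[0]))
    # keep only non-dominated points of the collected set
    pareto = [p for p in front
              if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in front)]
    return sorted(pareto)

# Hypothetical candidate solutions (f1, f2)
sols = [(1, 9), (2, 5), (4, 4), (3, 7), (6, 1)]
pareto = epsilon_constraint(sols, eps_values=[1, 4, 5, 9])
```

In the exact GAMS model the inner minimization is a MIP solve rather than an enumeration, but the sweep over epsilon values that traces the Pareto front is the same idea.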
The higher the dispersion index and the smaller the relative distance of consecutive solutions, the better the performance of that batch of sample problems. Figure 4 compares these four indices for the first-type and second-type problems. According to Table 7, the multi-objective HS algorithm performs better on the number of Pareto solutions, the relative distance of consecutive solutions, and the execution time in most first-type problems, whereas better results are obtained on the dispersion index in most second-type problems. Therefore, the first-type sample problems are preferable to the second-type ones for obtaining better solutions.
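The two additional indices can be sketched as commonly defined in the multi-objective literature: a spacing-type metric for the relative distance of consecutive solutions (lower is better) and a diversification-type metric for dispersion (higher is better). These are standard formulations assumed here for illustration, since the paper's exact definitions are not reproduced; the function names are hypothetical.

```python
import math

def spacing(front):
    """Relative distance of consecutive solutions: standard deviation of
    Manhattan distances between neighbors on the front sorted by the first
    objective. Lower values indicate a more uniformly spread front."""
    pts = sorted(front)
    d = [abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(pts, pts[1:])]
    dbar = sum(d) / len(d)
    return math.sqrt(sum((di - dbar) ** 2 for di in d) / len(d))

def diversification(front):
    """Dispersion index: Euclidean length of the diagonal spanned by the
    extreme objective values. Higher values mean wider front coverage."""
    spans = [max(p[i] for p in front) - min(p[i] for p in front) for i in (0, 1)]
    return math.hypot(*spans)
```

Applied to each algorithm's Pareto front, these two metrics complement NOS and Time in the comparisons reported in Table 7 and Figure 4.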
According to Figures 3 and 4, the first-type and second-type problems are correlated. Regarding the dispersion of solutions, there is a growth in dispersion for problems 5 and 13, and both types capture this growth. Regarding the relative distance index, however, first-type values are lower than second-type values in most problems: except for problems 4, 5, and 12, the second type shows a larger relative distance.
6. CONCLUSION AND RECOMMENDATIONS
The present study proposed a comprehensive model for the simultaneous minimization of two objective functions: a time-related function, the production makespan, and a cost function comprising machine purchasing costs, earliness costs, and task tardiness penalties. Based on previous studies, no research has simultaneously assessed these issues. In addition, we evaluated production scheduling problems on unrelated parallel batch machines, which are applicable to many industries. By considering additional constraints, such as preparation times and unequal coefficients for earliness costs and tardiness penalties, attempts were made to bring the problem closer to the real world. According to the results, the problem is NP-hard, so solving it with optimization software and exact methods becomes practically impossible as the numbers of variables and machines increase; therefore, a metaheuristic could be beneficial. The presented model was solved both by the exact method in GAMS and by the HS algorithm. Although less used for solving optimization problems, the HS algorithm is highly efficient in generating non-dominated solutions and has a reasonable run time for large-scale problems; accordingly, it can serve as an efficient algorithm for such problems. Given its comprehensiveness, the model can cover many similar problems, since it can be reduced to single-machine, identical parallel (equal-speed), and non-identical parallel machine settings. We applied the Taguchi design-of-experiments method to tune the parameters. Finally, because of the sensitivity of heuristics to parameter values, two types of sample problems were generated based on the research by Damodaran et al. (2009), and the results were compared on four indices. It is suggested that the cost of human resources allocated to machines and the cost of parts required for each machine be included in the cost function.
Moreover, it is recommended that the tardiness and earliness penalties be treated as probabilistic or as a function of the level of tardiness and earliness, and that exponential and normal distribution functions be considered instead of a uniform distribution function. Furthermore, processing times or task preparation times can be considered in stochastic or fuzzy form. Future studies can also permit batch interruption and take the machine eligibility assumption, where each machine is able to process only certain batches, into consideration. It is also suggested that the number of machine types (m) be regarded as a random variable in future studies.