• Editorial Board +
• For Contributors +
• Journal Search +
Journal Search Engine
ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.18 No.3 pp.349-359
DOI : https://doi.org/10.7232/iems.2019.18.3.349

# Automatic Clustering and Multi-Directional Scheduling Scheme for Multi-Mode Scheduling Projects

Department of Mathematics, Payame Noor University (PNU), Tehran, Iran
December 18, 2018 May 25, 2019 July 15, 2019

## ABSTRACT

In this study, a cluster-based heuristic algorithm to solve multi-mode resource constrained project scheduling problem (MMRCPSP) is presented. The MMRCPSPs are usually challenging to schedule because each activity has multiple execution modes with renewable and non-renewable resources that are categorized as an NP-hard problem. In this paper, we propose and apply three ideas to solve an MMRCPSP. Firstly, an MMRCPSP with two objectives namely, makespan and executive cost is converted into a single mode resource-constrained project scheduling problem (RCPSP) with three objectives by relaxing the non-renewable resource constraints and then defining a corresponding penalty value. In the second phase, in the light of precedence and renewable resource constraints, the new RCPSP are scheduled by calling the multi-directional scheduling schemes (multi-dss). In the end, on the basis of three objectives’ value (namely makespan, cost, and penalty value), we defined a density-based fitness function for an evolutionary approach regarding the clustering methods on the new solution space. Our numerical results showed that the multi-dss with higher dimension regarding the automatic cluster-based fitness functions affected the performance of the evolutionary approach. Furthermore, the evolutionary approach based on multi-directional scheduling schemes prevents to trap into the local optimums by increasing the solutions’ diversity.

## 1. INTRODUCTION

In literature, there are different versions of the multi-mode resource constrained project scheduling problem (MMRCPSP) with their heuristic and meta-heuristic solving methods (e.g., see Kolisch and Drexl, 1997;Alcaraz et al., 2003;Damak et al., 2009;Yosefzadeh et al., 2010). From the real world-life point-of-view, we can mention many applications such as R&D projects, manufacturing, medical issues and also construction and etc. For example in 2012, a medical project that was carried out at the medical faculty of the University of Kiel (Germany) is considered by Hartmann by using the concept of Project scheduling (See for details Hartmann, 2013). He tried to find a relationship between cancer and polyamine synthesis. In Mejía et al. (2013) presents a real life Petri Net-based prototype for project management in the Animation and Videogame (A&V) industry in Colombia Mejía (2013). Such a Petri Net prototype is a module embedded in a larger computer application aimed for integrated and collaborative management of A&V projects.

In a MMRCPSP, each activity can be executed in one of several modes, and each mode having duration and also requiring some renewable and non-renewable re-sources. The precedence relations exist between activities. The traditional objective function is defined as minimizing the project duration (makespan). In Kolisch and Drexl (1997) showed that obtaining a feasible solution for a MMRCPSP with at least two non-renewable resources is NP-complete (Kolisch and Drexl, 1997). Furthermore, since each activity in the MMRCPSPs has multiple modes with different resource requirements then we can categorize these problems as the NP-hard problems due to the renewable and non-renewable resources.

In the literature review, one can find many studies regarding the MMRCPSPs with preemption or non-preemption case, single or multiple objectives such as makespan, cost, quality or tardiness criteria (See e.g., Tirkolaee et al., 2019;Muritiba et al., 2018;Jarboui et al., 2008;Xu and Zeng, 2015;Yoosefzadeh et al., 2014). In multi objectives case, considering all objectives simultaneously often conflict with each other. Hence, many researchers tried to find a way for creating trade-offs among conflicting objectives to produce a feasible schedules (e.g. see for more details Cheng et al., 2015;Leu and Yang (1999). In practice, obtaining an optimal solution of problems with multiple objective functions is complex. Furthermore, the relation between allocated amount of resources resource and activities’ duration can be affected the overall performance of scheduling schemes to make good solutions (Mejía et al., 2013). Hence, in this paper we try to identify a feasible trade-off among multiple objectives that it has a paramount importance in the light of Pareto environment to complete the projects.

Overviewing the literature confirmed that these methods can be modified in such a way that their performance significantly improve and hence their effectiveness increase. Specifically, among methods to solve multi objective optimization problems, especially project scheduling problems with multiple objectives, cluster-based methods can be useful to cope with the conflicting objectives. In this paper, we propose an automatic cluster-based evolutionary approach to solve the MMRCPSPs regarding the multi directional scheduling schemes. To do that, we suggested an evolutionary approach (with the same framework like as Elloumi and Fortemps (2010) but with different major parameters such as scheduling schemes, clustering methods and dimension of solution space) on the basis of three ideas to solve the MMRCPSPs. Firstly, a non-preemptive MMRCPSP with two objectives namely, total makespan and executive cost is converted into a single-mode resource constrained project scheduling problem (RCPSP) with three objectives. It is done by relaxing the non-renewable resource constraints and hence defining a corresponding penalty value. In the second phase, in the light of precedence and renewable resource constraints, the new RCPSP are scheduled by calling the multi-directional scheduling schemes (multi-dss). At the end, on the basis of three objectives’ value (i.e., total makespan, cost and penalty value), we defined a density-based fitness function for an evolutionary approach regarding the clustering methods on the new solution space. One of the advantages of using cluster-based methods in the proposed evolutionary approach is to create varieties in solutions generated by the genetic algorithm (GA). Furthermore, we can find some circumstances under which the scheduling schemes with higher dimensions produce the solutions with higher quality. The structure of the paper is organized as follows: in Section 2, we present a brief overview of the widely used clustering methods, such as the K-means (that is used in Elloumi, and Fortemps, 2010) and the complete linkage clustering methods, and then we apply two automatic clustering methods to reduce some drawbacks of mentioned clustering methods. We also briefly describe the resource allocation scheduling schemes with single and multi-dimensions in Section 3. The structure of evolutional algorithms is descripted in details in Section 4. In section 5, we present the related numerical examination and describe them in details. Finally, the summary of our conclusions are provided in Section 6.

## 2. CLUSTERING METHODS

Clustering is one of the unsupervised learning strategies and is an automatic process, in which the samples are divided into groups with similar members called clusters. Therefore, the main purpose of clustering is to organize patterns into meaningful and logical groups that allow discovering the similarity and differ-ences between patterns which leads to useful results. The cluster is a collection of data in which there are sim-ilar to each other in the cluster, and they are dissimilar to the member of other clusters. The clusters can be classi-fied as follow (for more details See Khoshnazar and Rezaei, 2013;Kriegel et al., 2012): (i) well-separated clustering (data in the cluster are more similar to those outside, i.e., inter-similarities); (ii) centroid-based clus-tering (the data points in a cluster due to its center are more similar to each other than to other cluster centers, i.e., outer-similarities); (iii) distribution-based clustering (similarity between some data points in clusters is great-er than to the other data points); (iv) density-based clus-tering where the areas with higher density is defined as clusters. There are many variations of clustering methods and then a proper classification is necessary for choosing the appropriate method to achieve the results Khoshnazar and Rezaei (2013). Unfortunately, there is no best way find so far to discover the best clustering technique, and in most cases, it only relates to the knowledge of the researcher and the characteristic structure of the problems that is being used. In this study, attempt was made to introduce automatic clustering techniques such as DBSCAN and mean shift clustering to eliminate some of the drawbacks of traditional and widely used clustering methods, such as K-means clustering and complete linkage, which is used in Afshar (2013;Elloumi and Fortemps, 2010), had better performance than other clustering methods for solving the MMRCPCP and to improve the obtained results.

### 2.1 K-means Clustering

The K-means clustering method is a simple but is practical and is the basis of several other algorithms Alpaydin (2009). This method presents a simple way to classify a given data set to K separated groups with a certain pre-determined number of centers (K). Determining the initial location of these centers is important because their different positions can produce different clusters and thus affect the final output. Some studies show that it is best to keep the cluster centers as far from each other as possible (see e.g., Khoshnazar and Rezaei, 2013), but in practical, the selection of such centers is impossible due to the lack of access to the nature of data.

The general description of the K-means clustering method is as follow: after choosing cluster’s centers (K centers), each point of data set is allocated to a cluster in which it has the most similarity to the cluster’s center. By clustering all of the points and by computing new centers (the mean of each cluster's elements), all data points are reassigned to the nearest cluster’s center until there are no changes in the data centers is obtained. The following measure is used to calculate the amount of similarity (which is usually considered as distance measure) between N data points $( x i ( j ) ; i = 1 , ... , N )$ of K centers $( c j ; j = 1 , ... , K )$, in which the notation $‖ . ‖$ is the Euclidean norm and calculate the distance between the data points and the jth center:

$J = ∑ j = 1 K ∑ i = 1 N ‖ x i ( j ) − c j ‖ 2$
(1)

By calling the complete linkage, any point from a given data set is considered as a cluster. The nearest cluster is defined as finding the minimum distance between the data. In this clustering, the distance between two different clusters ci and cj, i.e., $D i s t C l u s ( c i , c j )$ is defined as follow (See e.g., King, 1967):

$D i s t C l u s ( c i , c j ) = max x ∈ c i , y ∈ c j d ( x , y )$

### 2.3 Mean Shift Technique

The algorithm of mean shift is a non-parametric feature-space clustering method presented by Fukunaga and Hostetler (1975). In contrast with some clustering algorithm such as the K-means algorithm, the mean shift algorithm does not require prior knowledge of the number of clusters and does not have any constraints regarding the shape of the clusters. The mean shift algorithm is a technique for seeking the mode (or maxima) points of a probability density function and so-called mode-seeking algorithm (Cheng, 1995). This technique is usually applied where discrete samples of the function are available. This technique has a repetitive structure, starts from an initial guess point, and converges to the maxima. Although this algorithm has been widely applied, in general, the convergence of this method is not proved Li et al. (2007). The mean shift algorithm is stated as follows:

• Step 1. Calculate the vector of mean shift m(x):

$m ( x ) = ∑ i = 1 n x i e x p ( − ‖ x − x i ‖ 2 h ) ∑ i = 1 n e x p ( − ‖ x − x i ‖ 2 h )$

• Step 2. Shifting the density estimation window using the following relationship:

$x i t + 1 = x i t + m ( x i t )$

Return to step 1 and then iterate above steps till the convergence criterion is met.

Note that the mean shift algorithm cannot cluster the data sets and is only used to identify the centers. Therefore, after determining the centers by the mean shift algorithm, we call the K-means algorithm for data clustering to accurately evaluate the results in (Afshar, 2013;Elloumi and Fortemps, 2010)

### 2.4 DBSCAN Clustering

In 1996, Martin Ester et al. proposed a density-based clustering that is named by the DBSCAN. ) This method is carried out for all points distributed in any various shape in the data set (including noise points and also outlier points) based on the density of neighbors. The superior of the DBSCAN concerning other clustering methods such as K-means is that it is not sensitive to the data distribution and can detect irregularities in the data as well (see Xu et al., 1998 for more details). Like other clustering methods, this algorithm requires a measure to find the adjacency of data points. For doing so, the Euclidean distance is calling as a measure similarity. To describe the algorithm more, one needs to know about the parameters ε and minimum number of neighboring points MinPoints (μ). Before we describe the algorithm of DBSCAN, it is necessary to state some related definitions:

1. If the distance between any point (such as x) and a given point (such as p) is less than ε then x is a neighbor of p.

2. A centroid point is a given point with μ neigh-bors.

Now, in Algorithms 1 and 2, two pseudo-codes associated with the DBSCAN algorithm are presented. At first, an unvisited point is selected randomly. Then, the neighborhood of this point is evaluated with the radius ε and is considered as a cluster if it has the minimum number of neighboring points (MinPoints). Otherwise, it is labeled as a noise point. Note that this point may be in the adjacency of others and so become part of the other cluster.

Algorithm 1: The pseudo-code related to the DBSCAN DBSCAN_Method (ε, Dset, MinPoints)

Cls=0

for each x in dataset Dset

ifx is met select new point

Lable the visited point x

if Cardinality of NieghborhoodPoints is less than MinPoints

Lable point x as NOISE point

else

Define the new cluster Cls

Cluster(x, MinPoints, NieghborhoodPoints, Cls, ε)

Algorithm 2: The pseudo-code related to the subroutine Cluster of the Algorithm 1

Cluster(x, MinPoints, NieghborhoodPoints, Cls,ε)

Add the point x to the Cls

for each y in the NieghborhoodPoints set

if point y is not met

Lable point y is as a visited point

is not less than MinPoints

NieghborhoodPoints = Join the

NieghborhoodPoints set

ify is not yet added to any cluster

Add the point y to the Cls

return all members in ε-neighborhood of x

## 3. RESOURCE ALLOCATION SCHEDULING SCHEME

A standard project scheduling problem can be de-fined as follows: “The assignment some jobs or activities to some machines (such that each of them cannot process more than one job at a time) aiming at optimizing a specified objective function. In general, serial and parallel scheduling scheme can be considered as two well-known heuristic scheduling generation scheme for solving the resource allocation problem (Klein, 2000;Kolisch and Drexl, 1997;Kolisch, 1995). Numerical results in many studies show that, on average, the parallel scheduling scheme outperforms the serial scheduling scheme for solving the resource allocation problem (King, 1967;Kochetov and Stolyar, 2003); Xu et al., 1998]. Hence, in the present paper, the parallel scheduling scheme is used as base for other scheduling scheme.

### 3.1 Parallel Scheduling Scheme

The parallel scheduling scheme (PSS) is made up of at most |J| steps (J is the set of activities), in each of which, some of eligible activities are selected for scheduling. The specific feature of this method is that each step n ($n ≤ | J |$) corresponds to the decision time tn.

The set of scheduled activities is partitioned as fol-lows (Kolisch, 1996):

• Put activities that are scheduled and completed at decision time tn into the set Cn.

• Scheduled activities but are active at decision time tn are added to the active set An.

• The decision set Dn includes eligible activities that are not yet scheduled at tn. In other words, the set of Dn includes all unscheduled activities at the current decision time tn which satisfy both the precedence and resources constraints. At each step, the partial schedules includes two complete and active sets. The decision time at each step equals the earliest finishing time of at least one of the activities of the active set in the previous step.

### 3.2 PSS in a Backward Manner

In this approach, we schedule the activities using the parallel scheduling scheme but in a reverse order. In other words, firstly, reverse the direction of edges in the project’s network and then call the PSS or any other scheduling scheme.

### 3.3 Bi-directional Scheduling Scheme

Bi-directional scheduling scheme (bidss) consists of two following steps Klein (2000):

Step 1: Generate partial schedules (initialization phase)

In the bidss, the activities are scheduled in a back-ward and forward direction. For doing that, the eligible activities are divided into three parts:

• Eligible forward activities: the set ESf(Tf) is a set of eligible activities like j that satisfy both the resource and the precedence constraints at the decision time Tf.

• Eligible backward activities: the set ESb(Tb) is a set of eligible activities like j that satisfy both the successor and the resource constraints at the decision time Tb.

• A set of activities that are eligible concerning both the precedence and the successor constraints, simultaneously and also the resources constraints. Hence, these types of activities can be scheduled in both the forward and the backward directions about their resource constraints. According to a pre-defined tie-breaking rule, the activities that are closer to the beginning of the project are scheduled in the forward direction, and those are closer to the end are scheduled in backward.

Step 2. Generate complete schedule (Interlinking phase)

Initialization phase partitioned all activities into two partial schedules namely; forward (PSf) and backward (PSb) schedules. At the interlinking phase, the activities of the PSb are left shifted to the activities of the PSf in a follow manner. First, we arrange the activities of the PSb in a list in the ascending order according to their starting times, and then shifted each activity according to the list toward the activities of the PSf at the feasible earliest time by considering both the precedence and resources constraints.

### 3.4 Tri-Directional Scheduling Scheme

Tri-directional scheduling scheme (trdss) is the same as the bidss, but the trdss the activities are first scheduled in three directions, i.e., the forward schedule (FS), the backward schedule (BS) and the midway schedule (MS). Then these three partial schedules are combined at the interlinking phase and generate the final feasible schedule. The main idea of the trdss is to provide for more activities to be shifted to the left (i.e., to the FS), so that it is allows filling the gaps which maybe occur in the final feasible schedule as much as possible. The main difference between the trdss and the bidss is that in the trdss if the activities are eligible for scheduling in both dictions, they will be scheduled in the midway manner Yosefzadeh et al. (2010).

## 4. PROBLEM STATEMENT

In 1999, project scheduling problems were studied based on the classification of a variety of priority rules, in which the execution modes of activities, and the number and type of required resources were considered (see e.g., Klein, 2000;Yosefzadeh et al., 2010). The MMRCPSPs consider some relationships between activities such as logical precedence constraint, execution modes, and types of non-renewable and renewable resources. This problem includes |J| activities in which logical control relationships, finish to start exist between activities. In this paper, the AON2) network is used.

This study tries to find an execution mode for each activity and corresponding starting and finishing times so that the final schedule is feasible due to both the precedence constraints and the resources constraints and optimizes the objective functions. In this paper, the makespan, the penalty value regarding the non-renewable resources and the project's cost are defined as three objective functions in which should be minimized.

### 4.1 Proposed Evolutionary Approach

In this section, firstly, we will outline the proposed cluster-based evolutionary approach (CEA) to solve the MMRCPSPs (See Elloumi and Fortemps, 2010 for more details), and then, under the CEA, and to improve the quality of the resulting solutions, the multi-directional scheduling scheme (multi-dss) and automatic clustering methods will be applied. The general structure of the CEA is expressed as follows Elloumi and Fortemps (2010):

Structure of the CEA

1. Create the population pop which contains the pop chromosomes.

2. Apply a local search method to each individual within the population.

3. Number the current generation set to zero.

4. Perform the following steps as long as the number of generations is less than gen and runtime is less than the CPU time constraint.

• 4.1 Increase the number of generations

• 4.2 Create offspring

• 4.2.1 For combining, select individuals with pop size from the population set.

• 4.2.2 Using the crossover operator, generate a CHI set with size equal to the pop size for each individual pair of the population.

• 4.2.3 Apply the mutation operator on CHI with Pmut probability.

• 4.3 Calculate the fitness values

• 4.3.1 Calculate the project’s makespan (Time), the penalty values (Pl) and the cost (Cst) regarding the mode lists.

• 4.3.2 Apply the rank-based fitness assignment method on the basis of three criteria: i.e. Time and corresponding Pl and Cst.

• 4.3.3 Apply the clustering method to calculate the density

• 4.3.4 Select pop individuals from the popCHI set.

5. Return the best solution

### 4.2 Algorithm Description

The main contribution of the CEA contains the GA, scheduling scheme (such as the PSS, the bidss, and the trdss) and clustering methods. Each chromosome in the GA is presented by a row vector (α, β), where α and β are the activity list and the execution mode list ($| α | = | β | = N$) respectively. For the sake of brevity, we only explain the step 4.3 of the CEA in details.

Note that each solution in the search space is an or-dered triple (Time, Pl, Cst). The first two components of the solution are the project’s makespan and the penalties’ value respectively. The summation of costs due to mode list of activities is considered as the third one. In a population of the GA, each chromosome can be decodes to a solution (Time, Pl, Cst). Whenever a chromosome (α, β) is constructed, three components of solution are calculated immediately. Based on the activity list (α), the value of the first component (Time) is obtained by calling one of the heuristic scheduling schemes that are described in section 3. The second component Pl is obtained easily regarding the mode list as follows (see Kolisch and Drexl, 1997):

$P l = ∑ k ∈ N R L k ( β ) > 0 L k ( β ) R k N R$

where

$L k ( β ) = ∑ i = 1 N r j i m i k N R − R k N R$

and $R k N R$ is the availability of nonrenewable resource of type k, $r j i m i k N R$ is the non-renewable resource requirement of type k for activity ji in mode mi.

It is clear that if there is no violation regarding any kind of non-renewable resources types then Pl = 0.

And finally, the summation of costs regarding the mode list is considered as the third component (Cst). Hence, based on step 4.3.1, in each generation, we can decode any chromosome into a solution (Time, Pl, Cst).

In step 4.3.3, we use the clustering method (includ-ing automatic or traditional clustering methods) to calculate the density value for each solution (density is defined as total number of solutions in each cluster). The fitness value for each solution is calculated by:

$f ( s o l , t ) = 1 D e n s ( s o l , t ) × R a n k ( s o l , t )$

where Dens(sol, t) is the density of solution sol in the generation t and Rank(sol, t) is denoted as the rank of sol among all Pareto solutions of generation t regarding the non-dominated solution sense (step 4.3.2).

### 4.3 Solution Space

Since the existence of non-renewable resources im-press the complexity and NP-Completeness of the prob-lem Kochetov and Stolyar (2003), in this study, similar to the Fonseca and Fleming (1993) and the Elloumi and Fortemps (2010), a set of solutions to the MMRCPSP in the search space will be obtained by selecting an execu-tive mode randomly for each activity and then assigning corresponding non-renewable resources to each activity.

## 5. NUMERICAL RESULTS

In this section, the effect of amending the two main components of the CEA is studied (i.e., applying the multi-dss and calling the automatic clustering method) and hence the quality of Pareto solutions which are obtained by the CEA is evaluated. For doing that, we tested and investigated the performance of CEA on the standard benchmark’s problems of Kolisch and Sprecher (1996).

### 5.1 Clustering Methods for the CEA

In this section, in the main body of CEA and based on the experimental results of Afshar (2013) and Elloumi and Fortemps (2010), we study two types of clustering methods (such as complete linkage and K-means) which outperform the other clustering methods. Moreover, by considering the nature of solution space, we apply two new automatic clustering methods called the DBSCAN and the mean shift method.

In order to proper comparison to the results of Af-shar (2013) and also Elloumi and Fortemps (2010), we set the required parameters of the CEA like as in Afshar (2013) and Elloumi and Fortemps (2010). In other words, the experimental results are reported when the average number of CEA runs on 50 projects which each project is randomly selected from the hardest standard set for the MMRCPSP (i.e., J20 of Kolisch benchmark’s problems), in which the optimal solution for each sample of the J20 is known and available in Kolisch and Sprecher (1996). It is worth noting that according to the reported results in Afshar (2013), the GA will be called with the following parameters: population size pop=60, pop=120, the crossover probability pcrossover=1 and mutation probability pmut=0.9. Furthermore, in this paper, to more accurate comparison with the results presented in Afshar (2013), the number of generated schedules is defined as the stopping criteria for the GA and like in Afshar (2013) it equals to 1920 schedules with the step of 384. To better analyze the results of the CEA concerning the mentioned clustering methods and the multi-dss, we will use two criteria of average relative deviation from the optimal solution (relative deviation time) and the average relative deviation from the minimum project’s cost (relative deviation cost). These two criteria are defined for project p in relations (2) and (3) respectively:

$∑ p = 1 50 T p − o p t p o p t p 50 × 100$
(2)

where Tp is the completion time (or makespan) of p obtained from the CEA and optp is the optimal makespan of project p.

$∑ p = 1 50 C p − c o s t p c o s t p 50 × 100$
(3)

where Cp is the cost of project p obtained by the CEA, and costp is the optimal cost of project p (here the optimal cost is equal to the summation of the minimum costs of execution modes of each activity).

### 5.2 The CEA and Scheduling Schemes

Now, we investigate the effect of applying single and multi-dss (i.e., the PSS, the trdss and the bidss) in the CEA.

#### 5.2.1 Complete Linkage Clustering and Multi-Dss

As follows we report the examination results corre-sponding to apply the PSS, the bidss, and the trdss by regarding the complete linkage clustering method.

In Figures 1 and 2, the population size pop=60 and pop=120 are considered respectively. It is observed that according to the average relative deviation from the optimal makespan, as the number of iterations increases (i.e., the number of schedules which are generated) then the generated solutions converge to the optimal solutions (relative deviation decrease). Comparing three scheduling schemes, the trdss outperforms both of two others, i.e., the bidss and the PSS. Although the performance of the bidss is better than the PSS, this superiority decreases as the population size in the GA increases from pop=60 to pop=120. Such a conclusion can be derived from how to compare the dominant solutions in the Pareto solution space. By regarding the scheduling schemes, as the population size (pop) varies then the variety in produced costs are increased, and so we can produce different solutions. Hence, it can be affected by the dominance of the obtained solutions (Figure 3). In Figure 3, to investigate the dependency of the execution cost of each project on the scheduling scheme, we consider the average relative deviation from the lowest cost when the complete linkage clustering method is calling with the pop=60 and pop=120 separately. Numerical results show that by increasing the number of schedules (increasing the number of iterations), the project’s cost is reduced. Moreover, the comparison of the scheduling schemes shows that with small population size, performance of the bidss and the PSS are the same and even are better than the trdss. But, by increasing the population size, the quality of obtained solutions by the trdss is increased, and we can say that the performances of all the scheduling schemes are relatively the same.

According to Figures 1-3, we can say that if we apply the trdss with the pop=120, the quality of solutions is better than the other two scheduling schemes. Here, the quality means less deviation from the optimal project’s makespan and less deviation from the minimum project’s cost.

#### 5.2.2 K-means Clustering Method and Multi-Dss

In this section, similar to the complete linkage clus-tering, the performance of calling the PSS, the bidss, and the trdss in the CEA is evaluated when the K-means clustering method with pop=60 and also pop=120 is used. Figures 4 and 5 show the corresponding results due to three scheduling schemes in which the K-means clustering method and the GA with two populations size pop=60 and pop=120 are used.

Initially, these figures show that with less population size, the performance of the trdss is worse than other two scheduling schemes (Figure 4) but its solutions’ quality greatly will be improved if the population size increases to pop=120 (Figure 5).

Figure 6 shows the results of the project’s cost. By considering the measure of average relative deviation from the minimum cost, we conclude that as the population size increases to pop=120, there will be no significant difference in the three scheduling schemes and their performances are the same. Similarly, with an increase in the number of schedules (iterations), the cost of the projects is decreased.

By considering Figures 4-6, it can be observed that calling the trdss scheduling scheme, the K-means clustering method, and the population pop=120, the solutions obtained from the CEA are better than the other two scheduling schemes, i.e., the bidss and the PSS. In general, it is worth noting that we can say that the quality of the solutions obtained from the CEA is independent of the type of clustering method which is used.

#### 5.2.3 Mean Shift Method and Multi-Dss

As noted before, due to the nature of the automatic mean shift method, this method cannot alone cluster the data set and is just used as a complementary method for using other clustering methods. In other words, since the initial detection of clustering centers in the K-means method is crucial, so firstly, the mean shift method is utilized to determine the initial centers and then, to cluster the data set, the K-means clustering method will be called. By doing that, the effect of determining the position of the initial centers in the quality of the solutions can be evaluated. The results of the K-means based on the mean shift method as a complementary method are shown in Figures 7-9.

In Figures 7 and 9, measure of mean relative deviation from the optimal makespan, and in Figures 9, the measure of mean relative deviation from the minimum cost is used. According to Figures 7 and 8, it can be observed that disregarding the population size parameter, the trdss outperforms the PSS and the bidss. For the higher population size, the improvement is made more quickly than the lower one.

It is important to note that in the mean shift method, irrespective of the population size is used, the performance of three scheduling schemes is approximately the same, concerning the measure of average relative deviation from the optimal cost (Figure 9). Look at closer to Figure 9, with pop=120; average deviation from the minimum cost is less than pop=60, i.e., the solutions with pop=120 have higher quality than pop=60.

#### 5.2.4 DBSCAN Clustering Method and Multi-Dss

Like as pervious scenarios, in this section, the results of the CEA when the automatic DBSCAN clustering method is calling are investigated. Irrespective of population size, this clustering method, the superior of using the trdss concerning the other two scheduling schemes are evident (See Figures 10 and 11).

In Figures 10 and 11, like the previous clustering methods, with an increase in the population size from pop=60 to pop=120, the relative deviation from the optimal makespan decreases. Also, the average relative deviation decreases when the number of itraitions (schedules) increases. This indicates that the rate of improvement of the solutions’ quality which is obtained from the CEA. Figure. 12

It is worth noting that, there is important point in the application of the automatic clustering method. By considering the measure of average relative deviation from the minimum cost, regardless of using the type of scheduling schemes, the quality of generated solutions are the same. In other words, there is no significant difference between the performances of scheduling schemes for generating solutions with higher quality. It provides a situation which makes it easy for the project’s manager to decide on choosing the type of scheduling schemes. Of course, it should be noted that the use of larger population size is more effective in producing higher quality in generated solutions.

At the end, the experimental results are interpreted by using statistic methods in SPSS. One-way analysis of variance (ANOVA) has been used to compare the average relative deviation time and the average relative deviation cost in the light of the three scheduling schemes, i.e., the PSS, the bidss, and the trdss. According to the tabling report of ANOVA, regarding the “relative deviation time” variable, all of p-values are smaller than the significance level of α=0.05, that is, the three scheduling schemes are different in terms of this variable.

We also call the Duncan Test to detect significant differences. According to the results of this test, the average relative deviation times of the PSS more than the other ones and this difference is significant at the level of α=0.05, but between the average relative deviation time of bidss and trdss are not statistically significant.

In the case of relative deviation cost, the p-value in ANOVA test is not significant (the significance level is greater than α=0.05), hence there is no significant differ-ence between relative deviation cost. Duncan Test con-firms the results, too.

## 6. CONCLUSIONS

In this study, a heuristic algorithm, namely, cluster-based evolutionary approach (CEA) to solve multi-mode resource constrained project scheduling problem (MMRCPSP) is proposed. By so doing, we try to present a feasible trade-off among multiple objectives that it has a paramount im-portance in the light of Pareto environment to complete the projects. We also analyze the behavior of different clustering methods and the performance of multi-directional scheduling schemes (multi-dss). Different types of objective functions are used to evaluate different solution schemes. The first objective function we use is to shorten the project’s makespan, while the second one is minimizing the penalty function that is obtained by relaxing the non-renewable resource constraints. To analyze the results more accurately to the previous works, a cost function, as the value of the project implementation, is also evaluated as a third factor objective to reduce the cost of each type of project execution mode. Also, the effect of scheduling schemes on the performance of the CEA is studied by calling three different single- and multi-scheduling schemes, i.e., the PSS, the bidss and the trdss. We use the GA to solve these problems and assess the impact of the three scheduling schemes, the complete linkage, the K-means, the automatic clustering, i.e., the mean shift method and the DBSCAN clustering in two different cases with pop=60 and pop=120. Our numerical results from these evaluations indicate that regardless of population size (pop), the DBSCAN clustering method has less average relative deviation time than the other three clustering methods, which it improves with an increase from pop=60 to pop=120. Also, it is shown that by increasing the number of schedules (itraitions) in the GA, the solutions obtained by the CEA are convergent to the optimal solution irrespective of the clustering method which is used. Moreover, by increasing the population size, the quality of the solutions obtained from the automatic DBSCAN significantly is higher than the other three clustering methods, i.e., the complete linkage, the K-means and the mean shift method.

## Figure

Complete linkage and relative deviation time (pop=60).

Complete linkage and relative deviation time (pop=120).

Complete linkage and relative deviation cost (pop=60 to pop=120).

K-means and relative deviation time (pop=60).

K-means and relative deviation time (pop=120).

. K-means and relative deviation cost (pop=60 and pop=120).

Mean shift and relative deviation time (pop=60).

Mean shift and relative deviation time (pop=120).

Mean shift and relative deviation cost (pop=60 and pop=120).

DBSCAN and relative deviation time (pop=60).

DBSCAN and relative deviation time (pop=60).

DBSCAN and relative deviation cost (pop=60 and pop=120).

## REFERENCES

1. Afshar, F. (2013), Clustering-based hybrid evolutionary approachfor multi-mode resource constraints project scheduling problem, Thesis for master degree, Payame Noor University (in Persian).
2. Alcaraz, J. , Maroto, C. , and Ruiz, R. (2003), Solving the multi-mode resource-constrained project scheduling problem with genetic algorithms, Journal of Operation Research Society, 54(6), 614-626.
3. Alpaydin, E. (2009), Introduction to Machine Learning, The MIT Press, Cambridge, Massachusetts, MA.
4. Cheng, M. Y. , Tran, D. H. , and Cao, M. T. (2015), Chaotic initialized multiple objective differential evolution with adaptive mutation strategy (CA-MODE) for construction project time-cost-quality trade-off, Journal of Civil Engineering and Management, 22(2), 210-223.
5. Cheng, Y. (1995), Mean shift, mode-seeking, and clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8), 790-799.
6. Damak, N. , Jarboui, B. , Siarry, P. , and Loukil, T. (2009), Diffierential evolution for solving multi-mode resourceconstrained project scheduling problems, Computers and Operations Research, 36(9), 2653-2659.
7. Elloumi, S. and Fortemps, P. (2010), A hybrid rank-based evolutionary algorithm applied to multi-mode resource-constrained project scheduling problem, European Journal of Operational Research, 205(1), 31-41.
8. Fonseca, C. M. and Fleming, P. J. (1993), Genetic algorithms for multi-objective optimization: formulation, discussion, and generalization, Genetic Algorithms: Proceedings of the Fifth International Conference, Morgan Kaufmann, San Mateo, CA, 416-423.
9. Fukunaga, K. and Hostetler, L. D. (1975), The estimation of the gradient of a density function, with application in pattern recognition, IEEE Transaction on Information Theory, 21(1), 32-40.
10. Hartmann, S. (2013), Project scheduling with resource capacities and requests varying with time: A case study, Flexible Services and Manufacturing Journal, 25(1-2), 74-93.
11. Jarboui, B. , Damak, N. , Siarry, P. , and Rebai, A. (2008), A combinatorial particle swarm optimization for solving multi-mode resource-constrained project scheduling problems, Applied Mathematics and Computation, 195(1), 299-308.
12. Khoshnazar, S. H. and Rezaei, H. (2013), A review of the methods for selecting the initial points in the Kmeans algorithm, The First National Conference on the Application of Intelligent Systems (Soft Computing) in Science and Technology, Islamic Azad University, Quchan Branch, Available from https://www.civilica.com/Paper-AISST01-AISST01_091.html.
13. King, B. (1967), Step-wise clustering procedures, Journal of the American Statistical Association, 62(317), 86-101.
14. Klein, R. (2000), Bidirectional planning: Improving priority rule-based heuristics for scheduling resource constrained projects, European Journal of Operational Research, 127(3), 619-638.
15. Kochetov, Y. and Stolyar, A. (2003), Evolutionary local search with the variable neighborhood for the resource constrained project scheduling problem, Proceedings of the Third International Workshop of Computer Science and Information Technologies, Russia.
16. Kolisch, R. (1995), Project Scheduling Under Resource-Constraints: Efflcient Heuristics for Several Problem Classes, Physica -Verlag, Heidelberg.
17. Kolisch, R. (1996), Serial and parallel resource-constrained project scheduling methods revisited: theory and computation, European Journal of Operational Research, 90(2), 320-333.
18. Kolisch, R. and Sprecher, A. (1996), PSPLIB - A project scheduling library, European Journal of Operational Research, 96, 205-216.
19. Kolisch, R. and Drexl, A. (1997), Local search for nonpreemptive multi-mode resource-constrained project scheduling, IIE Transactions, 29(11), 987-999.
20. Kriegel, H. P. , Kröger, P. , Sander, J. , and Zimek, A. (2011), Density-based clustering, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3), 231-240.
21. Leu, S. S. and Yang, C. H. (1999), GA-based multicriteria optimal model for construction scheduling, Journal of Construction Engineering and Management, 125(6), 420-427.
22. Li, X. , Hu, Z. , and Wu, F. (2007), A note on the convergence of the mean shift, Pattern Recognition, 40(6), 1756-1762.
23. Mejía, G. , Sánchez, M. , Niño, K. , and Figueroa, P. (2013), A petri net based algorithm for the resource constrained project scheduling problem (RCPSP): A real life application in the animation and videogame industry, Proceedings of the 22nd International Conference on Production Research, ICPR 2013, Brazil.
24. Muritiba, A. E. F. , Rodrigues, C. D. , and da Costa, F. A. (2018), A Path-Relinking algorithm for the multimode resource-constrained project scheduling problem, Computers & Operations Research, 92, 145-154.
25. Tirkolaee, B. E. , Goli, A. R. , Hematian, M. , Sangaiah, A. K. , and Han, T. (2019), Multi-objective multi-mode resource constrained project scheduling problem using Pareto-based algorithms, Computer Science, 101(6), 547-570.
26. Xu, J. and Zeng, Z. (2015), Multi-criteria multi-modal fuzzy project scheduling in construction industry, Handbook on Project Management and Scheduling, 2, 1307-1335.
27. Xu, X. , Ester, M. , Kriegel, H. P. , and Sander, J. (1998), A distribution-based clustering algorithm in large spatial data-based, Pocessing of the 14th Internal Conference on Data Engineering (ICDE98), Orlando, Florida, USA, 324-331.
28. Yoosefzadeh, H. R. , Tareghian, H. R. , and Farahi, M. H. (2014), Multi-directional scheduling scheme in resource constrained project scheduling problem, Naval Research Logistics, 61(1), 44-55.
29. Yosefzadeh, H. R. , Tareghian, H. R. , Farahi, M. H. (2010), Tri-directional scheduling scheme: Theory and computation, Journal of Mathematical Modelling and Algorithms, 9(4), 357-373.