1. INTRODUCTION
Substantial growth in wireless sensor networks (WSNs) has generated considerable interest among researchers aiming to provide good quality of service at minimal cost. WSN design—a key area addressed in the literature—can be classified into three routing model designs: (a) direct transmission, (b) multihop, and (c) cluster-based hierarchical models, as shown in Figure 1. Direct transmission (one-hop) solutions are possible but impractical, whereas multihop models suffer from high latency. Thus, cluster-based hierarchical models are favored for providing high-quality WSN service at minimal cost (Ibriq and Mahgoub, 2004).
The design and operation of WSNs require scalable architectural and management strategies. In addition, sensors in such large-scale deployments are generally energy constrained, as their batteries cannot be recharged. Because each wireless sensor node has limited energy storage, efficient use of this energy is vital for increasing network lifetimes and the range of suitable applications for these networks (Abbasi and Younis, 2007). Appropriate node clustering can reduce the overall energy usage in a network; thus, minimizing energy consumption is an important consideration when developing clustering schemes (Dechene et al., 2006). Clustering in WSNs has proven to be an effective approach for organizing a network into a connected hierarchy, balancing the load, and increasing network lifetime. This method involves grouping nodes into clusters and electing a cluster head, with the collection of cluster heads in the network forming a connected dominating set. Because obtaining an optimal dominating set is an NP-complete problem, the proposed algorithms are heuristic in nature (Younis et al., 2006). Further, swarm intelligence has been found to be a highly appropriate approach for the design and deployment of WSNs.
Establishing an optimal collection of cluster heads is a difficult combinatorial optimization problem in cluster-based hierarchical WSN models. Discrete particle swarm optimization (PSO) (Eberhart and Shi, 2001) has been used for clustering in WSNs with a predetermined number of clusters (Kim et al., 2011; Latiff et al., 2007a). Artificial bee colony (ABC) algorithms have also been proposed for cluster-based WSN routing (Karaboga and Ozturk, 2011), and ant colony optimization (ACO) has been used for sensor network optimization (Kim and Choi, 2009). Ferentinos and Tsiligiridis (2007) considered an integrated genetic algorithm (GA) approach for precision agriculture applications of sensor networks. Kannan et al. (2006) proposed a two-phase localization method based on simulated annealing (SA) to accurately estimate node locations. Hussain et al. (2007) used GAs to create energy-efficient clusters for data dissemination in WSNs. By clustering a sensor network into numerous independent clusters using a GA, Jin et al. (2003) minimized the total communication distance, thereby extending the network lifetime. Latiff et al. (2007b) used PSO for energy-aware clustering in WSNs. Parwekar and Nagireddy (2018) implemented a fitness equation using PSO and social group optimization (SGO). Parwekar et al. (2018) compared the application of PSO and GAs to WSNs, demonstrating that PSO performs better.
The motivation of this study is to compare swarm intelligence with earlier methods for clustering-based WSN problems. A clustering-based WSN problem and the PSO and ABC optimization methodologies are presented herein, and the use of swarm intelligence to maximize fitness and minimize communication distance with an optimal number and location of cluster heads is examined. Furthermore, the results of binary artificial bee colony (BABC) and binary particle swarm optimization (BPSO) are compared with those of other optimization methods, specifically ACO, SA, and GA.
The remainder of this paper is organized as follows. The WSN clustering design is described in Section 2. Binary swarm intelligence is introduced in Section 3. The experiments and analysis performed to validate the proposed methods are described in Section 4. Finally, conclusions are presented in Section 5.
2. WSN CLUSTERING DESIGN
Heinzelman et al. (2000, 2002) proposed formulas for the energy dissipation in cluster head (E_{CH}) and non-cluster-head (E_{nonCH}) sensors, which are estimated using the radio electronics dissipation of the receiving and transmitting units E_{elec}, the amplifier energy parameters ∈_{mp} and ∈_{fs}, and the energy for data aggregation E_{DA} when an l-bit message is transmitted across distance d_{toSink} (from the cluster head node to the Sink) or d_{toCH} (from the node to the cluster head) in a WSN. For N nodes and k clusters, the energy dissipated in a cluster head node is found using Equation (1), and the energy used in each non-cluster-head sensor node is found using Equation (2). The longer the communication distance, the more energy is consumed during transmission.
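As a concrete illustration, this energy model can be sketched in code. The constants and the exact forms below are a hypothetical reconstruction of Heinzelman et al.'s first-order radio model using typical values from the literature, not parameters reported in this paper:

```python
import math

# Illustrative first-order radio model constants (assumptions, not from this paper)
E_ELEC = 50e-9       # J/bit, radio electronics (TX and RX)
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier parameter
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier parameter
E_DA = 5e-9          # J/bit/signal, data aggregation energy
D0 = math.sqrt(EPS_FS / EPS_MP)  # crossover distance between amplifier regimes

def tx_energy(l, d):
    """Energy to transmit an l-bit message over distance d."""
    if d < D0:
        return l * E_ELEC + l * EPS_FS * d ** 2
    return l * E_ELEC + l * EPS_MP * d ** 4

def e_ch(l, n, k, d_to_sink):
    """Energy per round for a cluster head serving N/k - 1 members (Eq. (1) form)."""
    members = n / k - 1
    return (l * E_ELEC * members        # receive from cluster members
            + l * E_DA * (n / k)        # aggregate the cluster's signals
            + tx_energy(l, d_to_sink))  # forward the aggregate to the Sink

def e_non_ch(l, d_to_ch):
    """Energy per round for a non-cluster-head node (Eq. (2) form)."""
    return tx_energy(l, d_to_ch)
```

A cluster head dissipates far more energy than a member node at the same distance, which is why fewer cluster heads (for the same total distance) is more energy efficient.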
The total transmission distance of the WSN is the main factor that must be minimized. In addition, the number of cluster heads can factor into the objective. Given the same distance, fewer cluster heads result in greater energy efficiency, as cluster head nodes drain more power than non-cluster-head nodes. The objective function published by Jin et al. (2003) is modified, as shown in Equation (3), with binary integer decision variable Z_{i}, which is 1 if node i is a cluster head and 0 if node i is a sensor node, where N is the total number of nodes and TD is the total distance of all nodes to the Sink in the one-hop transmission model; any sensor node with Z_{i} = 0 is assigned to the closest cluster head with Z_{i} = 1, as shown in Equation (5). Because the value of d depends on the decision variable Z_{i}, d is also a decision variable: it is the sum of the distances from the sensor nodes to their closest cluster heads, plus the sum of the distances from all cluster heads to the Sink, as shown in Equation (4). In Equation (4), dist_{ij} is the distance from node i to node j, dist_{0i} is the distance from Sink 0 to node i, and the binary variable X_{ij} is 1 if node i (a cluster head) and node j are connected and 0 otherwise, as shown in Equation (6). $\sum_{i=1}^{N} Z_i$ is the number of cluster heads in a solution, w is the predefined weight for the distance term (TD − d), and (1 − w) is the weight for the number-of-sensors term (N − $\sum_{i=1}^{N} Z_i$). This is a binary integer programming (BIP) problem for sensor network optimization. Fitness is calculated with Equation (3) using binary decision variable Z_{i} from Equation (5). The value of w satisfies 0 ≤ w ≤ 1 and is application-dependent, based on whether distance or the number of cluster heads is the more important factor.
The predetermined constant weight w = 0.7 is used for the distance factor, so that (1 − w) = 0.3 weights the number-of-cluster-heads factor:
For example, as shown in Table 1, the input data include N = 15 nodes and a Sink node position of (0, 0). The Sink position is (x_{s}, y_{s}) and the node position is (x_{i}, y_{i}). Figure 2(a) is the direct transmission one-hop model with TD = 175.657, d = 175.657, and $\sum_{i=1}^{N} Z_i$ = 0 (there are no cluster heads). As shown in Figure 2(b), sensor nodes 9, 13, and 15 are connected to cluster head 3, which is their closest cluster head; sensor nodes 2, 4, and 8 are likewise connected to cluster head 11; sensor nodes 6, 7, 10, and 12 to cluster head 5; and sensor node 1 to cluster head 14. Figure 2(b) shows the clustering design model with TD = 175.657; d = 79.6004, i.e., the distances from sensor nodes 9, 13, and 15 to cluster head node 3, from sensor nodes 6, 7, 10, and 12 to cluster head node 5, from sensor nodes 2, 4, and 8 to cluster head node 11, and from sensor node 1 to cluster head node 14, plus the distances from cluster head nodes 3, 5, 11, and 14 to the Sink; and $\sum_{i=1}^{N} Z_i$ = 4, with the cluster heads (●) having binary decision variables Z_{3} = 1, Z_{5} = 1, Z_{11} = 1, and Z_{14} = 1. The total fitness value of Figure 2(a) is 4.8 for the direct transmission one-hop model, and that of Figure 2(b) is 70.8396 for the optimal clustering design with predetermined weight w = 0.7, using Equation (3). Figures 3(a) and 3(b) are the solution representations of the direct transmission one-hop and cluster design models in Figures 2(a) and 2(b), respectively. A randomly generated solution using binary variable Z is always feasible because every node can be a cluster head (Z_{i} = 1) or a sensor node (Z_{i} = 0), as shown in Figure 3(b).
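The fitness computation of Equations (3)–(5) can be sketched as follows; the function name and the small three-node usage example are illustrative assumptions, not data from Table 1:

```python
import math

def fitness(nodes, sink, Z, w=0.7):
    """Fitness per Eq. (3): w*(TD - d) + (1 - w)*(N - sum(Z)).

    nodes: list of (x, y); sink: (x, y); Z: binary list with Z[i] = 1 for cluster heads.
    Each sensor (Z[i] = 0) attaches to its closest cluster head (Eq. (5));
    d sums those member distances plus all head-to-Sink distances (Eq. (4)).
    """
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    N = len(nodes)
    TD = sum(dist(p, sink) for p in nodes)        # one-hop baseline distance
    heads = [p for p, z in zip(nodes, Z) if z == 1]
    if not heads:                                 # no cluster heads: one-hop model
        d = TD
    else:
        d = sum(dist(h, sink) for h in heads)     # heads transmit to the Sink
        d += sum(min(dist(p, h) for h in heads)   # sensors to their nearest head
                 for p, z in zip(nodes, Z) if z == 0)
    return w * (TD - d) + (1 - w) * (N - sum(Z))
```

For instance, with three collinear nodes at (1, 0), (2, 0), (3, 0) and the Sink at the origin, Z = (0, 0, 0) gives d = TD so the fitness is only the (1 − w)·N term, while promoting the middle node to a cluster head shortens d and raises the fitness.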
3. BINARY SWARM INTELLIGENCE
Sections 3.1 and 3.2 respectively describe BPSO and BABC using the binary integer solution from Figure 3 for the BIP problem in WSNs.
3.1 Binary Particle Swarm Optimization
PSO can efficiently handle clustering or cluster-head selection problems, which are not one-time activities (Kulkarni and Venayagamoorthy, 2011). Therefore, our simple proposed BPSO is a good choice for enabling sensor nodes and cluster heads to adapt to dynamic topological changes in WSN clustering. A BPSO flowchart for WSN clustering design is shown in Figure 4. BPSO improves the initial particles by selecting new particles that increase the objective function fitness value. The clustering design parameters are selected in Step 1. BPSO starts with non-optimal (but valid) initial particles (solutions), which are generated randomly in Step 2. These randomly generated solutions are feasible because every node can be a cluster head or a sensor node, and they serve as the initial BPSO solutions. All particles are evaluated using the objective functions in Equations (3)–(4) in Step 3. In Step 4, pbest is the best solution achieved so far by a given particle, whereas gbest in Step 5 is the best value obtained so far by any particle in the neighborhood. The basic concept of BPSO lies in accelerating each particle toward its pbest and the gbest with a weighted acceleration at each time step using Equations (7) and (8) (Kennedy and Eberhart, 1997; Latiff et al., 2007a); the weights of ${X}_{1}^{1}$, $pbest_{1}^{1}$, and $gbest^{1}$ are $a$, ${r}_{1}\times {c}_{1}$, and ${r}_{2}\times {c}_{2}$, respectively, where ${c}_{1}$ and ${c}_{2}$ are two positive constants referred to as the cognitive and social acceleration factors, and ${r}_{1}$ and ${r}_{2}$ are random numbers in the range [0, 1]. The acceleration parameters were chosen based on early experience (largely trial and error) with PSO (Eberhart and Shi, 2001).

gbest^{i} is assigned the value of the particle with the best performance so far (by generation i) among all particles.
In the BPSO methodology, the generation of new particles (solutions) can be shown using the following example. As shown in Figure 5, BPSO begins with 10 binary integer particles, ${X}_{1}^{1},\ {X}_{2}^{1},\ {X}_{3}^{1},\ {X}_{4}^{1},\dots,{X}_{10}^{1}$, in generation 1. A new particle ${X}_{1}^{2}$ is generated from ${X}_{1}^{1}$, $pbest_{1}^{1}$, and $gbest^{1}$ using Equations (7) and (8) in Steps 6–8, as shown in Figures 4 and 5.
The fitness value of the objective function should be maximized. The evaluation values of ${X}_{1}^{1}$, $pbest_{1}^{1}$, and $gbest^{1}$ are $f({X}_{1}^{1})$ = 47.2821, $f(pbest_{1}^{1})$ = 60.5224, and $f(gbest^{1})$ = 63.8903, respectively, using fitness Equations (3)–(4). The weights of ${X}_{1}^{1}$, $pbest_{1}^{1}$, and $gbest^{1}$ are assumed to be a = 2.5, ${r}_{1}\times {c}_{1}=3.3$, and ${r}_{2}\times {c}_{2}=0.2$, respectively, as shown in Equation (7).
The probabilities that ${X}_{1}^{2}$ is 1 at (a) (1, 1, 1), (b) (0, 0, 0), and (c) (1, 0, 1), based on ${X}_{1}^{1}$, $pbest_{1}^{1}$, and $gbest^{1}$, are 1 (0.3574 + 0.6039 + 0.0387), 0, and 0.3961 (0.3574 + 0.0387), respectively, as shown in Figure 5. The probabilities 1, 0, and 0.3961 map to the velocity range ${v}_{\min} = -4$ and ${v}_{\max} = 4$, i.e., $-4 \le {v}_{i}(t+1) \le 4$. Del Valle et al. (2008) reported that ${v}_{\max}$ is typically set to 4.0 so that there is always a probability of at least 0.018 for any bit to change its state.
We can find the probability of assigning 1 for a cluster head at (a), (b), and (c) of new solution ${X}_{1}^{2}$ by using the sigmoid function shown in Equation (9).
Node 5 of ${X}_{1}^{2}$ can be assigned as a cluster head with probability 98.2% at position (a), as shown in Figure 5. If the random number in [0, 1] is less than 0.982, then node 5 is assigned as a cluster head; otherwise, node 5 is assigned as a sensor. Similarly, node 6 of ${X}_{1}^{2}$ can be assigned as a cluster head with probability 1.8% at position (b), and node 7 of ${X}_{1}^{2}$ with probability 30.34% at position (c).
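The bit-wise update of this worked example can be sketched in code. This is a hypothetical reconstruction: the per-source shares are assumed to be proportional to each source's weight times its fitness (which reproduces the 0.3574/0.6039/0.0387 split above), the summed share of sources carrying a 1 is mapped linearly to [v_min, v_max], and the result is squashed through the sigmoid of Equation (9):

```python
import math
import random

V_MIN, V_MAX = -4.0, 4.0  # velocity clamp (Del Valle et al., 2008)

def sigmoid(v):
    """Eq. (9): probability that a bit is set to 1."""
    return 1.0 / (1.0 + math.exp(-v))

def bpso_new_particle(x, pbest, gbest, fx, fp, fg, a=2.5, r1c1=3.3, r2c2=0.2):
    """Generate a new binary particle as in the Figure 5 example (assumed form).

    Each of x, pbest, gbest contributes a share proportional to its weight
    times its fitness; a bit's probability of being 1 is the summed share of
    sources whose bit is 1, mapped to [V_MIN, V_MAX] and passed through sigmoid.
    """
    weights = [a * fx, r1c1 * fp, r2c2 * fg]
    total = sum(weights)
    shares = [w / total for w in weights]
    new_x = []
    for bits in zip(x, pbest, gbest):
        p = sum(s for s, b in zip(shares, bits) if b == 1)
        v = V_MIN + p * (V_MAX - V_MIN)  # map probability into velocity range
        new_x.append(1 if random.random() < sigmoid(v) else 0)
    return new_x
```

With the example's fitness values, a position where all three sources hold a 1 yields v = 4 and hence sigmoid(4) ≈ 0.982, matching the 98.2% above.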
3.2 Binary Artificial Bee Colony
The ABC algorithm was motivated by the intelligent food-finding behaviors of honeybees (Karaboga and Basturk, 2007). The performance of the ABC algorithm is better than or similar to that of other population-based algorithms, with the advantage of employing fewer control parameters (Karaboga and Basturk, 2008; Karaboga and Akay, 2009). The only important control parameter of ABC is Limit, the number of trials after which a food source is assumed to be abandoned. Zhang et al. (2010) reported that ABC is faster for clustering in data analysis and mining and produces shorter intra-cluster distances than other heuristic algorithms. Karaboga and Ozturk (2011) also reported that ABC is successful in clustering for classification purposes. A flowchart of the proposed ABC with a ranking strategy, used in this study for WSN clustering design, is shown in Figure 6.
The following steps comprise the standard ABC. In Step 1, Limit is set and the number of food sources is set to SN. The population of food sources (solutions) is initialized in Step 2; these randomly generated solutions are feasible because every node can be a cluster head or a sensor node. The initial solutions in the population are evaluated using Equations (3)–(4), and the fitness value of food source i is denoted f_{i}. In Step 3, new food sources (solutions) are produced for the employed bees and evaluated. A greedy selection process is applied for the employed bees, and probability p_{i} is calculated for the solutions using Equation (10).
The new solutions for the onlookers are produced from the solution selected depending on p_{i} in Step 4 and evaluated. A greedy selection process is applied to the onlookers. The abandoned solution for the scout is determined, if it exists, and is replaced with a new random solution in Step 5. The best solution achieved so far is saved. These steps are repeated until a termination criterion is met (Karaboga and Basturk, 2007, 2008; Karaboga and Akay, 2009).
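A minimal sketch of the onlooker selection step, assuming Equation (10) is the standard fitness-proportional form p_i = f_i / Σ_j f_j (the function names are illustrative, not from the paper):

```python
import random

def onlooker_probabilities(fitnesses):
    """Fitness-proportional selection probabilities for onlooker bees
    (assumed standard ABC form of Eq. (10): p_i = f_i / sum_j f_j)."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def select_food_source(fitnesses, rng=random.random):
    """Roulette-wheel selection of a food source index using p_i."""
    r, acc = rng(), 0.0
    for i, p in enumerate(onlooker_probabilities(fitnesses)):
        acc += p
        if r <= acc:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off
```

Onlookers thus concentrate the search around the currently richest food sources, while the greedy selection keeps only improvements.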
Karaboga et al. (2010) used this ABC algorithm to propose a novel hierarchical clustering approach for minimizing energy depletion in WSNs. However, in standard ABC algorithms, new solutions arbitrarily generated by scout bees are of poorer quality than the existing solutions, which are repeatedly improved. Although the generation of arbitrary solutions can search a wide range of the solution space, the overall search performance, in terms of finding an optimal solution, can decrease as the number of such solutions increases. To solve this problem and to generate new solutions whose quality does not differ significantly from that of existing solutions while still adding variety, this paper proposes a BABC method with a ranking strategy. The ranking strategy is added to the standard BABC before Step 5, at which point the worst solution is removed from the population. A new solution (food source) is generated using some of the best existing solutions, and this new good solution is added to the population.
The four best solutions, which are food sources 1–4 in the current population, are selected to generate a new solution (food source), as shown in Figure 7. Probabilities of 1, 0, and 0.5180 are proportionally calculated for the new food source at (a) (1, 1, 1, 1), (b) (0, 0, 0, 0), and (c) (0, 1, 1, 0), respectively, based on food sources 1–4.
The probabilities that the new food source is 1 at (a), (b), and (c), based on the four food sources, are 1, 0, and 0.5180, as shown in Figure 7. These probabilities map to the velocity range ${v}_{\min} = -4$ and ${v}_{\max} = 4$, with $-4 \le {v}_{i}(t+1) \le 4$ (Del Valle et al., 2008).
We can find the probability of assigning a cluster head at (a), (b), and (c) of the new solution by using the sigmoid function in Equation (9):
Node 5 of the new food source can be assigned as a cluster head with probability 98.2% at position (a) in Figure 7. If the random number in [0, 1] is less than 0.982, then node 5 is assigned as a cluster head; otherwise, node 5 is assigned as a sensor. Similarly, nodes 6 and 7 of the new food source can be assigned as cluster heads with probabilities 1.8% at position (b) and 53.59% at position (c), respectively.
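The ranking strategy of Figure 7 can be sketched as follows. As with the BPSO example, this is a hypothetical reconstruction: each bit's probability of being 1 is assumed to be the fitness-proportional share of the best existing solutions carrying a 1, mapped to [v_min, v_max] and passed through the sigmoid of Equation (9):

```python
import math
import random

V_MIN, V_MAX = -4.0, 4.0  # velocity clamp (Del Valle et al., 2008)

def sigmoid(v):
    """Eq. (9): probability that a bit is set to 1."""
    return 1.0 / (1.0 + math.exp(-v))

def ranked_new_food_source(best_sources, best_fitnesses):
    """Ranking strategy: build a new food source from the best existing
    solutions (Figure 7 uses the top four). Each bit's probability of being 1
    is the fitness-proportional share of sources carrying a 1 (assumed form)."""
    total = sum(best_fitnesses)
    shares = [f / total for f in best_fitnesses]
    new_source = []
    for bits in zip(*best_sources):          # iterate bit positions
        p = sum(s for s, b in zip(shares, bits) if b == 1)
        v = V_MIN + p * (V_MAX - V_MIN)      # map probability into velocity range
        new_source.append(1 if random.random() < sigmoid(v) else 0)
    return new_source
```

Because the new food source is built only from top-ranked solutions, it tends to stay close to the current best region of the search space while still introducing variety through the stochastic bit assignment.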
4. EXPERIMENT AND ANALYSIS
Simulations were conducted on an Intel Core 2 Duo CPU (2.66 GHz, 2 GB RAM) personal computer. Different clustering design problems were used: a small network (100 nodes) with the Sink at the origin (0, 0) or center (50, 50); a medium network (200 nodes) with the Sink at the origin (0, 0) or center (100, 100); and a large network (400 nodes) with the Sink at the origin (0, 0) or center (200, 200). The node positions were randomly generated in the range $\left(0\le {x}_{i}\le 100,\ 0\le {y}_{i}\le 100\right)$.
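A minimal sketch of the test-instance generation described above (the function name and the uniform sampling are assumptions; the paper states only that node positions were randomly generated within the given range):

```python
import random

def make_network(n, xmax=100.0, ymax=100.0, sink=(0.0, 0.0), seed=None):
    """Generate a random test instance: n node positions sampled uniformly
    in [0, xmax] x [0, ymax], with the Sink placed at the given position."""
    rng = random.Random(seed)
    nodes = [(rng.uniform(0, xmax), rng.uniform(0, ymax)) for _ in range(n)]
    return nodes, sink

# e.g., the small test problems: 100 nodes with an origin or center Sink
small_origin, sink_o = make_network(100, sink=(0.0, 0.0), seed=1)
small_center, sink_c = make_network(100, sink=(50.0, 50.0), seed=1)
```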
The convergence trends for the different problems using BPSO and BABC are shown in Figures 8(a)–(l). The best solution converges at around generation 3000 with BPSO and 2000 with BABC for the small and medium networks with origin- and center-located Sink test problems in Figures 8(a)–(h). The best solutions converge at around generation 3000 with BPSO and 5000 with BABC for the large networks with origin- and center-located Sink test problems in Figures 8(i)–(l).
The results of BPSO and BABC were compared to solutions obtained using ACO, GA, and SA, with ten runs for each algorithm, as shown in Table 2. We implemented ACO based on Kim and Choi (2009), GA based on Jin et al. (2003), and SA based on Kannan et al. (2006), including their operations and parameter conditions, with a one-dimensional solution representation for the small, medium, and large test problems. The WSN design test problems were also solved using CPLEX to determine the optimal results for comparison; the best method, BABC, had 0–0.05% error, and the worst, ACO, had 2.67–13.77% error. The best BABC results were close to the optimal CPLEX solutions, with very small error percentages.
The execution time of our proposed method increases reasonably with network size, whereas that of CPLEX increases significantly. El Rhazi and Pierre (2009) compared the execution times of their tabu search approach with those of CPLEX, finding them highly similar for networks of fewer than 700 nodes (although they do not report exact computation times, Figure 6 in their paper shows that they are very small); however, the computation times for 800- and 900-node networks are around 15,000 s and 40,000 s, respectively. In our experiments using CPLEX, the optimal numbers of cluster heads were 11 (CPU time 0.34 s) and 19 (0.14 s) in the small (100-node) networks, 17 (2.64 s) and 28 (2.75 s) in the medium (200-node) networks, and 28 (15.06 s) and 42 (13.56 s) in the large (400-node) networks, as shown in Tables 2 and 3. The optimal numbers of cluster heads using CPLEX are not available in Table 3 for the very large (1000-node) networks because CPLEX could not obtain a solution within 50,000 s. In contrast, our method returns feasible solutions for the small networks (CPU times 0.25 s and 0.4 s), medium networks (1 s and 1.5 s), large networks (4 s and 6 s), and very large networks (1,000 s and 1,500 s) within reasonable computation times.
For the small (100-node) network, BPSO, BABC, and CPLEX all achieved the best fitness value (4159.84), total transmission distance (d = 1713.30), and number of cluster heads ($\sum_{j=1}^{N} Z_j$ = 11) with the Sink at the origin (0, 0), and the best fitness value (1774.39), d = 1271.77, and $\sum_{j=1}^{N} Z_j$ = 19 with the Sink at the center (50, 50), where the total distance of all nodes to the Sink in the direct transmission one-hop model is TD = 2210.42 in Equation (3). These total transmission distances can be calculated from the locations of the Sink and sensor nodes, including the 11 cluster heads in Figure 9(a) and the 19 cluster heads in Figure 9(b), for each test problem. The total energy consumption can be calculated using Equations (1) and (2) for the energy dissipated in the cluster heads and sensor nodes, respectively.
The two best methods, BPSO and BABC, were compared to CPLEX in detail, as shown in Table 3. The average and best fitness values of BABC were slightly better than those of BPSO. BABC is also more robust, because its standard deviation is much smaller than that of BPSO. The distributions of the best solutions for the different clustering design problems using BABC are shown in Figures 9(a)–(h), where the large black circle (●) is the Sink, the small black circles (⚫) are cluster heads, and the other markers are sensors connected to the cluster heads. For the very large (1000-node) sensor network, a CPLEX solution was not obtained, although solutions using BPSO and BABC were obtained within the limited computation time. Binary swarm intelligence approaches thus become more valuable for obtaining near-optimal solutions within reasonable computation time as the number of nodes in the network increases.
5. CONCLUSIONS
This study focused on the BIP problem of designing a clustering-based WSN and presented binary swarm intelligence algorithms—BABC and BPSO—to obtain a binary integer solution. WSN performance was compared for BABC and BPSO, along with ACO, SA, and GA. Small (100-node), medium (200-node), large (400-node), and very large (1000-node) test networks were used to verify the proposed approaches. BABC and BPSO obtained better clustering solutions than the earlier methods in terms of the test networks' fitness. The best solution, average, and standard deviation of fitness using the swarm intelligence methods were compared to the optimal solutions obtained using CPLEX. Although the use of swarm intelligence does not guarantee an optimal solution for a given network, the proposed approaches were valuable for obtaining the best solutions: as the number of nodes increases, CPLEX cannot find the optimal solution for large networks within a reasonable computation time (see the 1000-node network test problem in Section 4).