ISSN : 2234-6473 (Online)

**Industrial Engineering & Management Systems Vol.11 No.4 pp.385-389**

DOI : https://doi.org/10.7232/iems.2012.11.4.385

# Multivariate Process Control Chart for Controlling the False Discovery Rate

### Abstract

… T^{2} statistics are used for computing the corresponding p-values, and the procedure for controlling the false discovery rate in multiple hypothesis testing is applied to the proposed control scheme. Some numerical simulations were carried out to compare the performance of the proposed control scheme with the ordinary multivariate Shewhart chart in terms of the average run length. The results show that the proposed control scheme outperforms the existing multivariate Shewhart chart for all mean shifts.

- 1. INTRODUCTION
- 2. MULTIVARIATE CONTROL SCHEME FOR CONTROLLING THE FDR
- 3. NUMERICAL EXPERIMENTS
- 3.1 Two-Dimensional Quality Variables
- 3.2 Three-Dimensional Quality Variables
- 4. CONCLUSION
- ACKNOWLEDGMENTS

### 1. INTRODUCTION

Traditionally, quality has been an essential part of manufacturing in various industries, such as the chemical, semiconductor, automobile, computer, and cell phone industries. These days the importance of quality is also emphasized in service industries such as banking, telecommunications, and health care. Customers’ demand for better quality products is growing stronger, especially since they can now share knowledge about quality through the Internet and social networks. Therefore, quality improvement is one of the most important aspects of business. Montgomery (2007) defines the term “quality improvement” as the reduction of variability in processes and products. There are two causes of variability in processes. The first is “chance cause,” derived from random effects such as weather conditions; it is considered natural. The other is “assignable cause,” such as a failure in machinery or faulty raw materials. Assignable causes can be controlled by a theoretical methodology. Therefore, many companies have tried to reduce them.

Although there are various methods to improve quality, statistical process control (SPC) is regarded as the most scientific and valid approach. In most industries, univariate control charts are used for separately monitoring key measurements on final products which in some way define the quality of that product (MacGregor and Kourti, 1995). However, univariate control charts cannot account for correlation among the variables. To overcome this difficulty, several multivariate control charts have been proposed based on chi-squared statistics and Hotelling’s T^{2} statistics. These charts can be considered extensions of the corresponding univariate control charts. The multivariate Shewhart control chart (Hotelling, 1947) is an extension of the X-bar control chart, which is the simplest and most widely used.

Control charts can be interpreted using a p-value approach, which offers several advantages over traditional control charts. First, a p-value approach offers better graphical displays of the performance of the process and can incorporate more complex control procedures (Benjamini and Kling, 1999). Li et al. (2012) showed, using univariate cumulative sum (CUSUM) charts, that p-values indicate how strong the signal is and how stable the process is at a given time. Second, under a p-value approach a control chart can be regarded as a sequence of single hypothesis tests. Therefore, if the distribution of the plotted statistic can be established, quality can be controlled more easily by setting only the type I error α (Lee and Jun, 2010, 2012). Finally, a multiple comparison procedure can be applied to control charts by testing several single hypotheses simultaneously (Benjamini and Kling, 1999).

In statistics, the “multiple comparisons” or “multiple testing” problem occurs when one considers a set of statistical inferences simultaneously (Miller, 1981). Since testing each hypothesis separately inflates the false positive rate when many hypotheses are tested, the familywise error rate (FWER) is used in multiple comparison procedures. The FWER is the probability that at least one false positive, or type I error, occurs among all the hypotheses tested. Many procedures to control the FWER have been proposed, such as the Bonferroni, Sidak, Tukey’s, Holm’s step-down, and Hochberg’s step-up procedures. However, these procedures are not widely used because they give conservative results as the number of hypotheses increases, which reduces the utility of testing. To overcome the weaknesses of the FWER, Benjamini and Hochberg (1995) proposed using the false discovery rate (FDR). The FDR is the expected proportion of false positives among all the rejected hypotheses. They also proposed a procedure for controlling the FDR.

There has been an effort to apply the FDR to univariate control charts. Lee and Jun (2010, 2012) proposed procedures to control the FDR for univariate X-bar charts and exponentially weighted moving average (EWMA) charts. They showed that by controlling the FDR, X-bar charts and EWMA charts give better performance than traditional control charts.

Motivated by these results, the objective of this paper is to provide a new multivariate process control chart that controls the FDR. The remainder of this paper is organized as follows. Section 2 interprets multivariate process control in terms of p-values and proposes a new multivariate control scheme for controlling the FDR. Section 3 compares the performance of the proposed control scheme with that of the existing chart using numerical experiments. Finally, Section 4 gives the conclusion.

### 2. MULTIVARIATE CONTROL SCHEME FOR CONTROLLING THE FDR

The multivariate Shewhart control chart (Hotelling, 1947) is used when there are two or more quality variables. It is based on Hotelling’s T^{2} statistic. Suppose that there are q quality variables and that the total number of observations is m. Also, suppose that each observation vector follows a multivariate normal distribution with mean vector μ_{0} and covariance matrix Σ. The proposed method monitors subgroups rather than single observations. If the subgroup size is n, the mean vector of the j^{th} subgroup, X̄_{j} (j = 1, 2, 3, …), is

$$\bar{X}_{j} = \frac{1}{n}\sum_{i=1}^{n} X_{ij}.$$

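The vector X_{ij} is the i^{th} observation in the j^{th} subgroup. Therefore, the sample covariance matrix of the j^{th} subgroup about its mean vector X̄_{j} is given by

$$S_{j} = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_{ij}-\bar{X}_{j}\right)\left(X_{ij}-\bar{X}_{j}\right)^{T}.$$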

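If μ_{0} = (μ_{01}, μ_{02}, … , μ_{0q})^{T} is the target mean vector, then the statistic for the j^{th} subgroup of the multivariate SPC chart is defined by Hotelling’s T^{2} statistic as (Anderson, 1958)

$$T_{j}^{2} = n\left(\bar{X}_{j}-\mu_{0}\right)^{T} S_{j}^{-1}\left(\bar{X}_{j}-\mu_{0}\right),$$

where X̄_{j} and S_{j} are the mean vector and the sample covariance matrix of the j^{th} subgroup.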
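The three quantities defined so far (subgroup mean vector, subgroup covariance matrix, and T^{2}) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names `solve` and `hotelling_t2` and the data are made up, and the subgroup size must exceed q so that the sample covariance matrix is invertible.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (m[r][k] - s) / m[r][r]
    return x


def hotelling_t2(subgroup, mu0):
    """Return (mean vector, sample covariance, T^2) for one subgroup of q-dimensional observations."""
    n, q = len(subgroup), len(mu0)
    mean = [sum(obs[k] for obs in subgroup) / n for k in range(q)]
    cov = [[sum((obs[a] - mean[a]) * (obs[b] - mean[b]) for obs in subgroup) / (n - 1)
            for b in range(q)] for a in range(q)]
    d = [mean[k] - mu0[k] for k in range(q)]
    s_inv_d = solve(cov, d)  # S^{-1} (mean - mu0) without forming the inverse
    t2 = n * sum(dk * xk for dk, xk in zip(d, s_inv_d))
    return mean, cov, t2


# Made-up subgroup with n = 4 observations of q = 2 quality variables.
subgroup = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (4.0, 2.0)]
mean, cov, t2 = hotelling_t2(subgroup, mu0=(2.0, 2.0))
print(mean, t2)  # mean is [2.5, 2.0]; T^2 is 2/3 for this data
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and avoids an explicit matrix inverse.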

When the process is in an in-control state, T^{2} follows the Hotelling T^{2} distribution with (q, n-q) degrees of freedom.

The Hotelling T^{2} distribution is related to the more familiar F distribution. The relationship between the Hotelling T^{2} and the F distributions is

$$\frac{n-q}{q(n-1)}\,T^{2} \sim F_{q,\,n-q}.$$

The upper control limit (UCL) of a multivariate SPC chart can be set using the above relationship. The lower control limit (LCL) is 0, and the UCL is obtained using Eq. (6).

$$\mathrm{UCL} = \frac{q(n-1)}{n-q}\,F_{\alpha,\,q,\,n-q} \tag{6}$$

Here F_{α,q,n−q} is the upper 100 × α% critical point of the F distribution with (q, n-q) degrees of freedom. Therefore, the multivariate SPC chart indicates an out-of-control state when T^{2} is greater than or equal to the UCL.

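The multivariate Shewhart control scheme can be regarded as a sequence of single hypothesis tests, described as

$$H_{0}: \mu_{j} = \mu_{0} \quad \text{versus} \quad H_{1}: \mu_{j} \neq \mu_{0},$$

where μ_{j} is the process mean vector when the j^{th} subgroup is sampled.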

If the null hypothesis μ_{j} = μ_{0} is true, then the statistic T_{j}^{2} follows the Hotelling T^{2} distribution with (q, n-q) degrees of freedom. Therefore, the p-value of the j^{th} subgroup is

$$p_{j} = P\left(F_{q,\,n-q} \ge \frac{n-q}{q(n-1)}\,T_{j}^{2}\right),$$

where P(F_{q,n−q} ≥ x) is the tail probability of the F distribution with (q, n-q) degrees of freedom. The control chart indicates an out-of-control state when the j^{th} p-value, p_{j}, is less than or equal to the type I error α.
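For q = 2 quality variables the F tail probability has a closed form, P(F_{2,m} ≥ x) = (1 + 2x/m)^{−m/2}, so the p-value and the UCL can be illustrated without a statistics library. The sketch below is only an illustration under that special case: the function names are hypothetical, and the subgroup size n = 5 is an assumption, since the text does not state n. For general q, the survival function of the F distribution from a statistics package (for example `scipy.stats.f.sf`) would be needed instead.

```python
def p_value_q2(t2, n):
    """p-value of a T^2 statistic for q = 2 quality variables and subgroup size n > 2.

    Uses P(F_{2,m} >= x) = (1 + 2x/m)^(-m/2) with m = n - 2 and
    x = (n - 2) / (2 (n - 1)) * T^2, which simplifies to the line below.
    """
    return (1.0 + t2 / (n - 1)) ** (-(n - 2) / 2.0)


def ucl_q2(alpha, n):
    """Upper control limit for q = 2: the T^2 value whose p-value equals alpha."""
    return (n - 1) * (alpha ** (-2.0 / (n - 2)) - 1.0)


alpha, n = 0.005, 5  # alpha as in Section 3; n = 5 is an assumed subgroup size
ucl = ucl_q2(alpha, n)
print(f"UCL = {ucl:.2f}")

for t2 in (1.0, 50.0, 150.0, 400.0):
    p = p_value_q2(t2, n)
    # Away from the boundary the two rules agree: p <= alpha when T^2 >= UCL.
    print(f"T2 = {t2:7.2f}  p = {p:.5f}  signal = {p <= alpha}")
```

The loop illustrates that comparing T^{2} with the UCL and comparing the p-value with α are the same decision rule.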

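Anderson (1958) proved that if the null hypothesis is not true, the statistic T^{2} follows a generalized T^{2} distribution (T^{2′}) with (q, n-q) degrees of freedom. He also derived the relationship between the generalized T^{2} distribution and the non-central F distribution (F′) to be

$$\frac{n-q}{q(n-1)}\,T^{2\prime} \sim F'_{q,\,n-q}(\tau), \qquad \tau = n\left(\mu-\mu_{0}\right)^{T}\Sigma^{-1}\left(\mu-\mu_{0}\right),$$

where τ is the noncentrality parameter and μ is the actual process mean vector.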

If τ is 0, the non-central F distribution is exactly the same as the F distribution.
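The theoretical ARL of the chart follows from this distributional result by a standard argument, sketched here under the assumption that successive subgroups are independent. The symbol π for the per-subgroup signal probability is introduced only for illustration, and F′_{q,n−q}(τ) denotes a non-central F variable with noncentrality parameter τ.

$$\pi(\tau) = P\left(F'_{q,\,n-q}(\tau) \ge F_{\alpha,\,q,\,n-q}\right), \qquad \mathrm{ARL} = \sum_{k=1}^{\infty} k\,\pi(\tau)\left(1-\pi(\tau)\right)^{k-1} = \frac{1}{\pi(\tau)}.$$

With τ = 0 the signal probability equals α, so the in-control ARL is 1/α.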

The proposed control scheme, referred to below as the BH scheme, proceeds as follows.

Step 1: For each subgroup, compute the T^{2} statistic.

Step 2: For each T^{2} statistic, compute the p-value.

Step 3: Apply the Benjamini and Hochberg procedure.

1) Specify the FDR level q. Here q denotes the FDR level, not the number of quality variables.

2) From the current testing point t back to the previous testing point t-r+1, sort the r p-values in increasing order. If the ordered p-values are p_{(1)}, p_{(2)}, …, p_{(r-1)}, p_{(r)}, the corresponding hypotheses are H_{(1)}, H_{(2)}, …, H_{(r-1)}, H_{(r)}.

3) If certain p_{(i)} (i = 1, …, r) values satisfy

$$p_{(i)} \le \frac{i}{r}\,q,$$

the current state is considered to be out-of-control and the hypothesis H_{(i)} is rejected.

**Table 1.** ARL of multivariate Shewhart control chart for two-dimensional quality variables
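Step 3 can be sketched as a small function applied to the window of the last r p-values. This is a minimal sketch, not the authors' code: the name `bh_signal` and the example p-values are hypothetical, and it follows the usual Benjamini and Hochberg step-up convention of reporting the largest rank whose ordered p-value meets its threshold.

```python
def bh_signal(p_window, fdr_level):
    """Apply the Benjamini-Hochberg step to a window of r p-values.

    Returns the largest rank i with p_(i) <= (i / r) * fdr_level,
    or 0 if no ordered p-value meets its threshold (no signal).
    """
    r = len(p_window)
    ordered = sorted(p_window)
    largest = 0
    for i, p in enumerate(ordered, start=1):
        if p <= i / r * fdr_level:
            largest = i
    return largest


# Hypothetical window of r = 4 p-values with FDR level 0.05.
window = [0.30, 0.02, 0.50, 0.02]
k = bh_signal(window, fdr_level=0.05)
print(k)  # 2: the second ordered p-value, 0.02, is below its threshold 0.025
```

In this example no single p-value is below 0.05 / 4 = 0.0125, so a rule that tests each point separately at that level would not signal, while the BH step does because two moderately small p-values occur in the same window.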

### 3. NUMERICAL EXPERIMENTS

In this section, numerical experiments are performed to compare the BH scheme with the multivariate Shewhart control chart. The theoretical average run length (ARL) of the multivariate Shewhart control chart is compared with the simulated ARL of the BH scheme. Since computing the theoretical ARL of the BH scheme is difficult, a Monte Carlo simulation approach is used. Two- and three-dimensional quality variables are used for the experiments.

#### 3.1 Two-Dimensional Quality Variables

First, for the conventional multivariate Shewhart control chart, the theoretical ARL is compared with the ARL simulated using the p-value approach. For two-dimensional quality variables, the mean vector (0, 0)^{T} and the covariance matrices (1 0.2; 0.2 1), (1 0.5; 0.5 1), and (1 0.8; 0.8 1) are used. For the theoretical ARL, Eqs. (9) and (10) are used. For the simulated ARL, 10,000 iterations replicated 20 times are performed to compute an average value. Various mean shift sizes (0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5) were used. If the mean shift size is 0.5, the shifted process has a mean vector of (0.5, 0.5)^{T}, and the covariance matrix is unchanged. The type I error α was set at 0.005. The results of this experiment are provided in Table 1, which indicates that the simulated results are almost exactly the same as the theoretical results. Therefore, the p-value approach is stable and appropriate for use with a multivariate Shewhart control chart. Also, for the same mean shift, the out-of-control ARLs differ among the covariance matrices. Table 2 shows the difference between the maximum and minimum eigenvalues of each two-dimensional covariance matrix. Tables 1 and 2 indicate that the out-of-control ARL drops sharply when the difference between the maximum and minimum eigenvalues is small. In other words, when the quality variables are nearly independent, the control chart detects the out-of-control signal quickly.
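A scaled-down version of this kind of simulation might look like the sketch below. It is an assumption-laden illustration rather than the experiment reported here: the subgroup size n = 5 is assumed (the text does not state n), only 300 run lengths are simulated instead of 10,000 iterations replicated 20 times, the function names are hypothetical, and the closed-form p-value is valid only for q = 2.

```python
import math
import random


def t2_q2(subgroup, mu0):
    """Hotelling T^2 for a subgroup of 2-dimensional observations."""
    n = len(subgroup)
    m1 = sum(x for x, _ in subgroup) / n
    m2 = sum(y for _, y in subgroup) / n
    s11 = sum((x - m1) ** 2 for x, _ in subgroup) / (n - 1)
    s22 = sum((y - m2) ** 2 for _, y in subgroup) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in subgroup) / (n - 1)
    d1, d2 = m1 - mu0[0], m2 - mu0[1]
    det = s11 * s22 - s12 * s12
    return n * (d1 * d1 * s22 - 2.0 * d1 * d2 * s12 + d2 * d2 * s11) / det


def run_length(shift, rho, n, alpha, rng):
    """Number of subgroups drawn until the p-value first falls to alpha or below."""
    c = math.sqrt(1.0 - rho * rho)
    t = 0
    while True:
        t += 1
        subgroup = []
        for _ in range(n):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            # Unit variances, correlation rho, both means shifted by `shift`.
            subgroup.append((shift + z1, shift + rho * z1 + c * z2))
        # Closed-form p-value for q = 2 (tail of the F distribution with (2, n-2) df).
        p = (1.0 + t2_q2(subgroup, (0.0, 0.0)) / (n - 1)) ** (-(n - 2) / 2.0)
        if p <= alpha:
            return t


def simulated_arl(shift, rho=0.5, n=5, alpha=0.005, runs=300, seed=1):
    """Average of `runs` simulated run lengths for a given mean shift."""
    rng = random.Random(seed)
    return sum(run_length(shift, rho, n, alpha, rng) for _ in range(runs)) / runs


arl0 = simulated_arl(0.0)  # in-control: expected to be near 1 / alpha = 200
arl1 = simulated_arl(2.0)  # shifted mean: expected to be much shorter
print(arl0, arl1)
```

With so few runs the in-control estimate is noisy; the point is only that the p-value rule reproduces an in-control ARL close to 1/α and a much shorter ARL under a mean shift.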

Next, to compare the ARLs of the BH scheme and the multivariate Shewhart control chart, the mean vector (0, 0)^{T} and only the covariance matrix (1 0.5; 0.5 1) are used. The span size for the BH scheme was r = 10, 20, 30. For the multivariate Shewhart control chart ARLs, theoretical values were used. For the BH scheme simulation, 10,000 iterations replicated 20 times were used to compute average values. Various mean shift sizes (0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5) were used, and the type I error α was set at 0.005. Given the same ARL_{0}, a shorter ARL_{1} corresponds to a better control chart. The ARL_{0} of the multivariate Shewhart control scheme (MSCS) is trivially 1/α. Some experiments show that the ARL_{0} of the BH scheme is approximately 1/α when the FDR level of the BH scheme is q = α × r. Therefore, in this experiment, the FDR level was set at α × r. The results of this experiment are listed in Table 3. The ARL_{1} of the BH scheme is smaller than that of the multivariate Shewhart control chart for all mean shift sizes. Therefore, the BH scheme performs better for two-dimensional quality variables. Also, in the case of the BH scheme, a larger span size results in better performance when the mean shift is small. These tendencies are the same for the other covariance matrices.
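For concreteness, the settings in this experiment work out as follows; this is plain arithmetic on the values stated in the text, not an additional result.

$$\mathrm{ARL}_{0} = \frac{1}{\alpha} = \frac{1}{0.005} = 200, \qquad q = \alpha \times r = 0.05,\ 0.10,\ 0.15 \quad \text{for } r = 10,\ 20,\ 30.$$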

**Table 2.** Maximum and minimum eigenvalue for two-dimensional covariance matrix
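The eigenvalues in question follow directly from the form of the covariance matrices, and a short derivation suggests why the out-of-control ARL differs among them. The derivation assumes that τ is the usual noncentrality parameter n(μ − μ_{0})^{T}Σ^{−1}(μ − μ_{0}) and that the shifted mean is (δ, δ)^{T}, as described in Section 3.1. The symbols ρ (the off-diagonal element) and δ (the shift size) are introduced here for illustration.

$$\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \;\Rightarrow\; \lambda_{\max} = 1+\rho, \quad \lambda_{\min} = 1-\rho, \quad \lambda_{\max}-\lambda_{\min} = 2\rho,$$

$$\tau = n\,(\delta,\ \delta)\,\Sigma^{-1}\,(\delta,\ \delta)^{T} = \frac{n\,\delta^{2}\,(2-2\rho)}{1-\rho^{2}} = \frac{2n\delta^{2}}{1+\rho}.$$

For ρ = 0.2, 0.5, and 0.8 the eigenvalue differences are 0.4, 1.0, and 1.6. Under these assumptions a smaller ρ gives a larger τ for the same δ, which is consistent with the faster detection observed when the eigenvalue difference is small.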

**Table 3.** ARLs of BH scheme and multivariate Shewhart control chart for two-dimensional quality variables

**Table 4.** ARLs of BH scheme and multivariate Shewhart control chart for three-dimensional quality variables

#### 3.2 Three-Dimensional Quality Variables

For three-dimensional quality variables, the same procedure is adopted to compare the performance of the BH scheme with that of the multivariate Shewhart control chart. The mean vector (0, 0, 0)^{T} and only the covariance matrix (1 0.5 0.5; 0.5 1 0.5; 0.5 0.5 1) are used. Other conditions, such as the number of iterations, the number of replications, the type I error α, and the mean shift sizes, were the same as in the two-dimensional analysis. Table 4 lists the results of this experiment. The interpretation of these results is the same as in the case of two-dimensional quality variables: the BH scheme performs better for three-dimensional quality variables, and a larger span size results in better performance when the mean shift is small.

### 4. CONCLUSION

This paper proposed a new multivariate process control scheme that controls the false discovery rate. The BH procedure is incorporated to control the FDR in the sense of multiple hypothesis testing. First, simulation studies showed that the use of p-values in the multivariate Shewhart control chart is appropriate. Then, it was shown that the proposed control scheme outperforms the conventional chart for two- and three-dimensional quality variables in terms of ARL.

### ACKNOWLEDGMENTS

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea from the Ministry of Education, Science and Technology (Project No. 2012-0001665).

### References

1.Anderson, T. W. (1958), An Introduction to Multivariate Statistical Analysis, John Wiley & Sons, New York, NY.

2.Benjamini, Y. and Hochberg, Y. (1995), Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society B Methodological, 57(1), 289-300.

3.Benjamini, Y. and Kling, Y. (1999), A look at statistical process control through the p-values, Research Paper: RP-SOR-99-08, Tel Aviv University, School of Mathematical Science, Israel.

4.Hotelling, H. (1947), Multivariate quality control, illustrated by the air testing of sample bombsights. In: Eisenhart, C. (ed.), Selected Techniques of Statistical Analysis for Scientific and Industrial Research, and Production and Management Engineering, McGraw-Hill Books, New York, NY.

5.Lee, S. H. and Jun, C. H. (2010), A new control scheme always better than X-bar chart, Communications in Statistics-Theory and Methods, 39(19), 3492-3503.

6.Lee, S. H. and Jun, C. H. (2012), A process monitoring scheme controlling false discovery rate, Communications in Statistics-Simulation and Computation, 41(10), 1912-1920.

7.Li, Z., Qiu, P., Chatterjee, S., and Wang, Z. (2012), Using p values to design statistical process control charts, Statistical Papers, 1-17.

8.MacGregor, J. F. and Kourti, T. (1995), Statistical process control of multivariate processes, Control Engineering Practice, 3(3), 403-414.

9.Miller, R. G. (1981), Simultaneous Statistical Inference, Springer-Verlag, New York, NY.

10.Montgomery, D. C. (2007), Introduction to Statistical Quality Control, Academic Internet Publishers, Ventura, CA.