ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.17 No.2 pp.184-192
DOI : https://doi.org/10.7232/iems.2018.17.2.184

Interval Estimation for Mean Response in a Simple Nested Error Regression Model in Gauge R & R Study

Dong Joon Park*
Department of Statistics Pukyong National University, Busan, Republic of Korea
Corresponding Author, djpark@pknu.ac.kr
Received January 31, 2017; Revised June 21, 2017; Accepted July 11, 2017

ABSTRACT

In a measurement system monitoring manufacturing processes, one often uses a gauge repeatability and reproducibility (R & R) experiment to determine the amount of variability due to parts, final products, operators, or the gauge. A measurement system that employs randomly chosen operators to conduct measurements on randomly selected parts can be modelled statistically. When a measurement is assumed to be linearly related to a predictor variable and a simple nested error regression model is applied to the measurement system, one might be interested in making inferences concerning the variability of the mean response in the model. In general, confidence intervals are more informative than hypothesis tests in making inferences about parameter values in statistical models. Confidence intervals for the mean response in a simple nested error regression model can be useful tools to determine whether the variability is appropriately managed in a manufacturing process. Several confidence intervals for the mean response in the model are proposed. The confidence intervals proposed in this article are based on a moderate large sample method. Computer simulation is performed to see whether the proposed intervals maintain the stated confidence level, and a numerical example is provided to illustrate the proposed intervals by calculating confidence intervals for the mean response in the model.

1. INTRODUCTION

A gauge is employed to collect replicated measurements on units by several different operators, setups, or time periods in a measurement system. The measurements in manufacturing processes include variability due to parts, operators, and gauges. The data collected during measuring processes are used to monitor and reduce the variability generated in production. When a measurement is linearly related to a predictor variable and a simple nested error regression model is applied to the measurement system, one might be interested in making inferences concerning the variability of the mean response in the model. The simple nested regression model considered in this article is a simple regression model with nested error structure where the response variable is linearly related to a predictor variable in manufacturing processes for gauge R & R study. Burdick et al. (2005) define repeatability as the gauge variability when the gauge is used to measure the same unit (with the same operator or the same setup or in the same time period). Reproducibility refers to the variability arising from different operators, setups, or time periods. Measurement system capability studies are often referred to as gauge repeatability and reproducibility (R & R) studies. The sources of variability in a gauge R & R study are expressed as variance components, so confidence intervals for the variance components are employed to determine the adequacy of the manufacturing process in a gauge R & R study.

This article focuses on confidence intervals for the mean response in a simple nested regression model that describes a manufacturing process. This simple nested regression model includes two error terms. One is associated with the first stage sampling unit, which is an operator effect, and the other with the second stage sampling unit, which is a measurement error. The two error terms are assumed to be independent and normally distributed with zero means and constant variances. This article derives several confidence intervals for the mean response in the simple nested regression model.

2. LITERATURE REVIEW

A simple nested error regression model applied to a manufacturing process in this article can be viewed as a statistical model for clustered observations where subjects are nested within groups in the field of educational or biological sciences. Nested data structures are easily found in two forms: repeated measures within individuals or students within schools. Such data structures commonly appear when multi-stage sampling procedures are applied to a target population. The data with this structure are described by nested error structure models, hierarchical linear models, or multilevel models. Observations within clusters tend to be more homogeneous than those between clusters. Applying standard ordinary least squares regression to clustered data therefore violates its independence assumption.

A great deal of research on statistical models with hierarchically structured data has been conducted by researchers (Goldstein, 2011; Raudenbush, 1995; Longford, 1995). They started with the simplest multilevel model with each group’s intercept, $\alpha_j$, which is at level 1. They then introduce level 2 random variables into the simplest multilevel model by replacing $\alpha_j$ by $\beta_{0j} = \beta_0 + O_{0j}$ and $\beta_j$ by $\beta_{1j} = \beta_1 + O_{1j}$. The level 1 variables are observations within clusters and the level 2 variables observations between clusters. Goldstein (2011) referred to this model as a 2-level model with hierarchically structured data. This model becomes the simple nested error regression model in this article by letting $\beta_0$, $O_{0j}$, and $O_{1j}$ be $\alpha$, $O_i$, and 0, respectively. We assume that the predictor variable $X_{ij}$ and the regression coefficients $\alpha$ and $\beta$ are fixed factors and $O_i$ and $E_{ij}$ are random factors where $O_i \sim N(0, \sigma_O^2)$ and $E_{ij} \sim N(0, \sigma_E^2)$ in the model.

Parameters $\alpha$ and $\beta$ and variance components $\sigma_O^2$ and $\sigma_E^2$ in multilevel models have been estimated by applying variance component models, random slopes models, ordinary regression models, etc. (Huang, 2016; McNeish and Stapleton, 2016; Maas and Hox, 2004; Aitkin and Longford, 1986). Researchers have also compared the results of estimation methods such as maximum likelihood estimation, restricted maximum likelihood estimation, ordinary least squares estimation, or robust estimation, depending on various situations and assumptions of multilevel models. Although maximum likelihood estimation and ordinary least squares estimation do not generalize to all situations, using ordinary least squares estimation is worth considering due to its parsimonious nature (Huang, 2017). Ordinary least squares regression that ignores the nested error structure can lead to substantially biased estimates of regression estimators and their standard errors (Goldstein et al., 1993; Longford, 1986). Their research on multilevel models has mostly concentrated on point estimation of the parameters, bias of standard errors, and statistical significance.

Park and Burdick (1993, 1994) provided confidence intervals for regression coefficients and variance components in a regression model with a one-fold nested error structure. Park (2013) and Park and Yoon (2016) conducted further study to propose confidence intervals for regression coefficients and variance components in a regression model with a two-fold nested error structure using various estimation methods. To our knowledge, inference concerning the expected mean response in the nested error regression model for a gauge R & R study has not been actively researched up until now. We therefore attempt to provide confidence intervals for the mean response in the model by extending the research of Park and Burdick (1993, 1994) and Park and Hwang (2002). We specifically aim to present alternative confidence intervals for the mean response by relaxing the condition for the mean value of the predictor variable, i.e., when an individual value of the predictor variable $X_{ij}$ is given rather than calculating the mean value of the predictor variable $\bar{X}_{i.}$.

3. MODEL STATEMENT

Consider a manufacturing process where parts are measured by operators and the response variable as a measurement is linearly related to a predictor variable. The simple nested error regression model is then written as

$Y_{ij} = \alpha + \beta X_{ij} + O_i + E_{ij}$
(1)

where $Y_{ij}$ is the jth measurement of a part measured by the ith operator as a response variable, $\alpha$ and $\beta$ are regression parameters, $X_{ij}$ is a fixed predictor variable related to the response variable $Y_{ij}$, $O_i$ is the ith randomly chosen operator effect, and $E_{ij}$ is the jth measurement error of a part measured by the ith operator. The operator effect $O_i$ is associated with the first stage sampling unit and the measurement error effect $E_{ij}$ with the second stage sampling unit. The two error terms $O_i$ and $E_{ij}$ are jointly independent normal random variables with zero means and variances $\sigma_O^2$ and $\sigma_E^2$, respectively. That is, $O_i \sim N(0, \sigma_O^2)$ and $E_{ij} \sim N(0, \sigma_E^2)$. Since $\alpha$, $\beta$, and $X_{ij}$ are fixed factors and $O_i$ and $E_{ij}$ are random factors, equation (1) is a mixed model.
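As a concrete sketch of model (1), the following Python fragment generates a balanced I × J data set in which each row shares one operator effect. All parameter values (α, β, σ_O, σ_E, and the predictor range) are illustrative choices, not values from the paper.

```python
import numpy as np

# Simulate model (1): Y_ij = alpha + beta * X_ij + O_i + E_ij
# with I operators and J repeated measurements per operator.
rng = np.random.default_rng(42)

I, J = 3, 5                       # operators, measurements per operator
alpha, beta = 10.0, 0.5           # fixed regression parameters (illustrative)
sigma_O, sigma_E = 0.6, 0.8       # standard deviations of O_i and E_ij

X = rng.uniform(80.0, 95.0, size=(I, J))   # fixed predictor values
O = rng.normal(0.0, sigma_O, size=I)       # operator effects, one per operator
E = rng.normal(0.0, sigma_E, size=(I, J))  # measurement errors
Y = alpha + beta * X + O[:, None] + E      # broadcast O_i across j = 1..J

print(Y.shape)
```

Measurements within a row are correlated through the shared $O_i$, which is exactly the nested error structure the model captures.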

The model (1) is written in matrix notation as

$y = X\beta + Zo + e$
(2)

where y is an IJ × 1 vector of measurements, X is an IJ × 2 matrix of predictor variables with a column of 1’s in the first column and a column of the $X_{ij}$’s in the second column, $\beta$ is a 2 × 1 vector of parameters with $\alpha$ and $\beta$ as elements, Z is an IJ × I design matrix of 0’s and 1’s, i.e. $Z = \oplus_{i=1}^{I} 1_J$ where $1_J$ is a J × 1 vector of 1’s, o is an I × 1 vector of operator effects, and e is an IJ × 1 vector of measurement errors. By the assumptions in model (1) the response variables have a multivariate normal distribution as follows:

$y ~ N ( X β , V )$
(3)

where $V = σ O 2 Z Z ′ + σ E 2 D I J$ and DIJ is an IJ × IJ identity matrix.

A possible partitioning for the source of variability of model (1) is given in Table 1. The sums of squares in Table 1 are defined as follows: $S_{xxw} = \Sigma_i \Sigma_j (X_{ij} - \bar{X}_{i.})^2$, $S_{xyw} = \Sigma_i \Sigma_j (X_{ij} - \bar{X}_{i.})(Y_{ij} - \bar{Y}_{i.})$, $S_{yyw} = \Sigma_i \Sigma_j (Y_{ij} - \bar{Y}_{i.})^2$, $S_{xxa} = J \Sigma_i (\bar{X}_{i.} - \bar{X}_{..})^2$, and $S_{xya} = J \Sigma_i (\bar{X}_{i.} - \bar{X}_{..})(\bar{Y}_{i.} - \bar{Y}_{..})$, where $\bar{X}_{i.} = \Sigma_j X_{ij}/J$, $\bar{Y}_{i.} = \Sigma_j Y_{ij}/J$, $\bar{X}_{..} = \Sigma_i \Sigma_j X_{ij}/IJ$, and $\bar{Y}_{..} = \Sigma_i \Sigma_j Y_{ij}/IJ$. Three estimators of the regression coefficient $\beta$ used in Table 1 are as follows: $\hat{\beta}_C = (S_{xya} + S_{xyw})/(S_{xxa} + S_{xxw})$, $\hat{\beta}_A = S_{xya}/S_{xxa}$, and $\hat{\beta}_W = S_{xyw}/S_{xxw}$.
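The Table 1 quantities are simple to compute. The sketch below assumes the standard balanced one-fold definitions stated above (in particular, the among-group sums carry the factor J, so that $S_{xxa} + S_{xxw}$ recovers the total sum of squares $\Sigma_i\Sigma_j(X_{ij}-\bar{X}_{..})^2$):

```python
import numpy as np

def table1_quantities(X, Y):
    """Within- and among-group sums of squares for a balanced I x J layout,
    plus the three slope estimators beta_C, beta_A, beta_W."""
    I, J = X.shape
    Xi = X.mean(axis=1)                                    # row means Xbar_i.
    Yi = Y.mean(axis=1)                                    # row means Ybar_i.
    Sxxw = ((X - Xi[:, None]) ** 2).sum()
    Sxyw = ((X - Xi[:, None]) * (Y - Yi[:, None])).sum()
    Sxxa = J * ((Xi - X.mean()) ** 2).sum()
    Sxya = J * ((Xi - X.mean()) * (Yi - Y.mean())).sum()
    beta_C = (Sxya + Sxyw) / (Sxxa + Sxxw)                 # total OLSE
    beta_A = Sxya / Sxxa                                   # among groups OLSE
    beta_W = Sxyw / Sxxw                                   # within groups OLSE
    return Sxxw, Sxxa, beta_C, beta_A, beta_W
```

When Y is an exact linear function of X, all three slope estimators coincide, which is a convenient sanity check on the decomposition.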

4. DISTRIBUTIONAL RESULTS OF ESTIMATORS OF MEAN RESPONSE AND SUMS OF SQUARES

In order to construct confidence intervals for the mean response E(Yij ) , the estimators of mean response and their distributional properties are derived. Three estimators of the mean response are presented. The independence between estimators of mean response and sums of squares in Table 1 is examined.

4.1. Within Group Estimator of Mean Response

The mean response of the simple nested error regression model (1) is defined as $E(Y_{ij}) = \alpha + \beta X_{ij}$, where $E(Y_{ij})$ means the expected value of the jth measurement of a part measured by the ith operator. Park and Burdick (1993) showed that the within group ordinary least squares estimators (OLSEs) $\tilde{\alpha}_W$ and $\hat{\beta}_W$ of the parameters in model (1) are obtained from regression of $Y_{ij}$ on the $X_{ij}$ and the grouping variables. In matrix notation, the estimators are the first two elements of the vector $(X_*'X_*)^G X_*' y$ where $X_* = [X \; Z]$ and $(X_*'X_*)^G$ is a generalized inverse of $X_*'X_*$. The within group slope OLSE is $\hat{\beta}_W = S_{xyw}/S_{xxw}$ and it is normally distributed, i.e. $\hat{\beta}_W \sim N(\beta, \sigma_E^2/S_{xxw})$. It can be shown by elementary algebra that the within group intercept OLSE is $\tilde{\alpha}_W = \bar{Y}_{..} - \hat{\beta}_W \bar{X}_{..}$. The within group OLSE of the mean response $E(Y_{ij})$, an unbiased estimator of $E(Y_{ij})$, is obtained by substituting the within group OLSEs $\tilde{\alpha}_W$ and $\hat{\beta}_W$ for the parameters in model (1), i.e. $\tilde{Y}_{ijW} = \tilde{\alpha}_W + \hat{\beta}_W X_{ij}$. It can be shown by the assumptions of model (1) that the within group OLSE of $E(Y_{ij})$ is normally distributed as $\tilde{Y}_{ijW} \sim N(E(Y_{ij}), (J\sigma_O^2 + \sigma_E^2)[1/IJ + (X_{ij} - \bar{X}_{..})^2 \phi / S_{xxw}])$ where $\phi = \sigma_E^2/(J\sigma_O^2 + \sigma_E^2)$.

4.2. Total Estimator of Mean Response

Total OLSEs $\hat{\alpha}_C$ and $\hat{\beta}_C$ of the parameters in model (1) are obtained from regression of $Y_{ij}$ on $X_{ij}$. Park and Burdick (1993) showed that the total OLSEs are obtained by the vector $(X'X)^{-1}X'y$. The total slope OLSE is written as $\hat{\beta}_C = (S_{xya} + S_{xyw})/(S_{xxa} + S_{xxw})$ and it is normally distributed, i.e. $\hat{\beta}_C \sim N(\beta, (r^2 J\sigma_O^2 + \sigma_E^2)/(S_{xxa} + S_{xxw}))$ where $r^2 = S_{xxa}/(S_{xxa} + S_{xxw})$. The total intercept OLSE is shown to be $\hat{\alpha}_C = \bar{Y}_{..} - \hat{\beta}_C \bar{X}_{..}$. The total OLSE of the mean response $E(Y_{ij})$ is therefore obtained by substituting the total OLSEs $\hat{\alpha}_C$ and $\hat{\beta}_C$ for the parameters in model (1), i.e. $\hat{Y}_{ijC} = \hat{\alpha}_C + \hat{\beta}_C X_{ij}$. It can be shown by the assumptions of model (1) that the total OLSE of $E(Y_{ij})$ is normally distributed as $\hat{Y}_{ijC} \sim N(E(Y_{ij}), (J\sigma_O^2 + \sigma_E^2)[1/IJ + (r^2 J\sigma_O^2 + \sigma_E^2)(X_{ij} - \bar{X}_{..})^2/\{(J\sigma_O^2 + \sigma_E^2)(S_{xxa} + S_{xxw})\}])$.

4.3. Best Linear Estimator of Mean Response

Park and Burdick (1994) derived the among groups OLSEs $\hat{\alpha}_A$ and $\hat{\beta}_A$ of the parameters in model (1) by regression of $\bar{Y}_{i.}$ on $\bar{X}_{i.}$, and the among groups OLSEs are obtained by the vector $(M_X' M_X)^{-1} M_X' M' y$ where $M_X = M'X$ and $M = (1/J)Z$. They showed that the best linear unbiased estimator (BLUE) $\hat{\beta}_B$ of the slope parameter in model (1) is a linear combination of $\hat{\beta}_A$ and $\hat{\beta}_W$. It can be shown by matrix algebra that the BLUEs $\hat{\alpha}_B$ and $\hat{\beta}_B$ of the two parameters in model (1) are calculated by the vector $(X'V^{-1}X)^{-1}X'V^{-1}y$ where $V = Var(y)$ in equation (3). The slope BLUE is written as $\hat{\beta}_B = (\hat{\phi} S_{xya} + S_{xyw})/(\hat{\phi} S_{xxa} + S_{xxw})$ where $\hat{\phi} = \hat{\sigma}_E^2/(J\hat{\sigma}_O^2 + \hat{\sigma}_E^2)$ and it is normally distributed, i.e. $\hat{\beta}_B \sim N(\beta, \sigma_E^2/(\phi S_{xxa} + S_{xxw}))$. The intercept BLUE is $\hat{\alpha}_B = \bar{Y}_{..} - \hat{\beta}_B \bar{X}_{..}$. The BLUE of the mean response $E(Y_{ij})$ is therefore obtained by substituting the BLUEs $\hat{\alpha}_B$ and $\hat{\beta}_B$ for the two parameters in model (1), i.e. $\hat{Y}_{ijB} = \hat{\alpha}_B + \hat{\beta}_B X_{ij}$. It can be shown by the assumptions of model (1) that the BLUE of $E(Y_{ij})$ is normally distributed as $\hat{Y}_{ijB} \sim N(E(Y_{ij}), (J\sigma_O^2 + \sigma_E^2)[1/IJ + (X_{ij} - \bar{X}_{..})^2 \phi/(\phi S_{xxa} + S_{xxw})])$.
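The slope BLUE is a φ̂-weighted blend of the among-group and within-group information. A minimal sketch, assuming the estimator $\hat{\phi} = S_W^2/S_A^2$ used later in Section 5:

```python
def slope_blue(Sxya, Sxyw, Sxxa, Sxxw, SA2, SW2):
    """beta_B = (phi_hat*Sxya + Sxyw) / (phi_hat*Sxxa + Sxxw),
    with phi_hat = SW2 / SA2. A convex combination of
    beta_A = Sxya/Sxxa and beta_W = Sxyw/Sxxw."""
    phi_hat = SW2 / SA2
    return (phi_hat * Sxya + Sxyw) / (phi_hat * Sxxa + Sxxw)
```

When $S_W^2 = S_A^2$ (so $\hat{\phi} = 1$) the BLUE reduces to the total OLSE $\hat{\beta}_C$, and it always lies between $\hat{\beta}_A$ and $\hat{\beta}_W$.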

4.4. Independence of Point Estimators and Sums of Squares

Park and Burdick (1994) showed that the sums of squares RA and RW are jointly independent chi-squared random variables with I − 2 and IJ − I − 1 degrees of freedom, i.e. $R_A/(J\sigma_O^2 + \sigma_E^2) \sim \chi^2_{I-2}$ and $R_W/\sigma_E^2 \sim \chi^2_{IJ-I-1}$. Thus the sum of squares $R_B/(J\sigma_O^2 + \sigma_E^2)$ is a chi-squared random variable with IJ − 3 degrees of freedom, $R_B/(J\sigma_O^2 + \sigma_E^2) \sim \chi^2_{IJ-3}$ where $R_B = R_A + R_W/\phi$, since $(R_W/\phi)/(J\sigma_O^2 + \sigma_E^2) = R_W/\sigma_E^2 \sim \chi^2_{IJ-I-1}$ and RB is the sum of two independent chi-squared random variables. In order to construct confidence intervals for the mean response $E(Y_{ij})$ we need to show the independence between the estimators $\tilde{Y}_{ijW}$, $\hat{Y}_{ijC}$, and $\hat{Y}_{ijB}$ and the sums of squares RA, RW, and RB.

Theorem 1. The within group estimator $\tilde{Y}_{ijW}$ and the sum of squares RA are independent.

Proof: In matrix notation the within group OLSE of the mean response is written $\tilde{Y}_{ijW} = x_1 (X_*'X_*)^G X_*' y$ where $x_1 = [J/I, X_{ij}, 0, 0, \cdots, 0]$, i.e., $x_1$ is a $1 \times (2+I)$ vector with J/I and $X_{ij}$ as its first two elements followed by I 0’s. The sum of squares RA in Table 1 is defined in matrix notation as $R_A = J y' A y$ where $A = MBM'$ and $B = [D_I - M_X(M_X'M_X)^{-1}M_X']$. The matrix definitions of $\tilde{Y}_{ijW}$ and RA are now utilized to show the independence between them. It can be shown by matrix manipulation that $x_1(X_*'X_*)^G X_*' \times (\sigma_O^2 ZZ' + \sigma_E^2 D_{IJ}) \times JA = 0$ since $ZZ'A = JA$ and $x_1(X_*'X_*)^G X_*' A = 0$. Thus $\tilde{Y}_{ijW}$ and RA are independent by Theorem 7.5 of Searle (1987).

Theorem 2. The within group estimator $Y ˜ i j w$ and the sum of squares RW are independent.

Proof: The sum of squares RW is defined RW = y′Wy where $W = D I J − X * ( X * ′ X * ) G X * ′ .$ In order to show independence between $Y ˜ i j w$ and RW matrix definitions are used. It can be shown by matrix manipulation that $x 1 ( X * ′ X * ) G X * ′ × ( σ O 2 Z Z ′ + σ E 2 D I J ) × W = 0$ since $Z Z ′ W = 0$ and $( X * ′ X * ) G X * ′ W = 0 .$ Thus $Y ˜ i j w$ and RW are independent by Theorem 7.5 of Searle (1987).

Theorem 3. The within group estimator $\tilde{Y}_{ijW}$ and the sum of squares RB are independent.

Proof: The sum of squares RB was defined as $R_B = R_A + R_W/\phi$, which is written in matrix notation as $R_B = J y'Ay + (1/\phi) y'Wy = y'[JA + (1/\phi)W]y$. The within group estimator $\tilde{Y}_{ijW}$ is therefore independent of the sum of squares RB by Theorems 1 and 2.

Theorem 4. The total estimator $\hat{Y}_{ijC}$ and the sum of squares RA are independent.

Proof: The total OLSE of the mean response is written $\hat{Y}_{ijC} = x_2 (X'X)^{-1} X' y$ where $x_2 = [1 \; X_{ij}]$ in matrix notation. It can be shown by using the matrix definitions of $\hat{Y}_{ijC}$ and RA that $x_2(X'X)^{-1}X' \times (\sigma_O^2 ZZ' + \sigma_E^2 D_{IJ}) \times JA = 0$ since $ZZ'A = JA$ and $X'A = 0$. Thus $\hat{Y}_{ijC}$ and RA are independent by Theorem 7.5 of Searle (1987).

Theorem 5. The total estimator $\hat{Y}_{ijC}$ and the sum of squares RW are independent.

Proof: It can be shown by using the matrix definitions of $\hat{Y}_{ijC}$ and RW that $x_2(X'X)^{-1}X' \times (\sigma_O^2 ZZ' + \sigma_E^2 D_{IJ}) \times W = 0$ using $ZZ'W = 0$ and $X'W = 0$. Thus $\hat{Y}_{ijC}$ and RW are independent by Theorem 7.5 of Searle (1987).

Theorem 6. The total estimator $\hat{Y}_{ijC}$ and the sum of squares RB are independent.

Proof: It follows that the total estimator $Y ^ i j C$ and the sum of squares RB are independent by Theorems 4 and 5.

Theorem 7. The best linear estimator $Y ^ i j B$ and the sum of squares RA are independent.

Proof: The matrix definitions of $\hat{Y}_{ijB}$ and RA are utilized to show the independence between them. In matrix notation the BLUE of $E(Y_{ij})$ is written as $\hat{Y}_{ijB} = x_2(X'V^{-1}X)^{-1}X'V^{-1}y$. It can be shown that $x_2(X'V^{-1}X)^{-1}X'V^{-1} \times V \times JA = J x_2(X'V^{-1}X)^{-1} X'A = 0$ since $V = (\sigma_O^2 ZZ' + \sigma_E^2 D_{IJ})$ and $X'A = 0$. Thus $\hat{Y}_{ijB}$ and RA are independent by Theorem 7.5 of Searle (1987).

Theorem 8. The best linear estimator $\hat{Y}_{ijB}$ and the sum of squares RW are independent.

Proof: It can be shown by using the matrix definitions of $\hat{Y}_{ijB}$ and RW that $x_2(X'V^{-1}X)^{-1}X'V^{-1} \times V \times W = x_2(X'V^{-1}X)^{-1}X'W = 0$ since $X'W = 0$. Thus $\hat{Y}_{ijB}$ and RW are independent by Theorem 7.5 of Searle (1987).

Theorem 9. The best linear estimator $\hat{Y}_{ijB}$ and the sum of squares RB are independent.

Proof: It follows that the best linear estimator $Y ^ i j B$ and the sum of squares RB are independent by Theorems 7 and 8.

We showed the independence between the three estimators of the mean response and the three sums of squares. In summary, the theorems say that $\tilde{Y}_{ijW}$, RA, RW, and RB are jointly independent; $\hat{Y}_{ijC}$, RA, RW, and RB are jointly independent; and $\hat{Y}_{ijB}$, RA, RW, and RB are jointly independent.

5. CONFIDENCE INTERVALS FOR MEAN RESPONSE

The confidence intervals for the mean response are constructed using the distributional properties of the OLSEs and BLUE of $E(Y_{ij})$ and the independence between the estimators and the sums of squares established in the theorems of Section 4. Since $\tilde{Y}_{ijW}$ is normally distributed and $\tilde{Y}_{ijW}$ and RA are independent by Theorem 1, it follows that an exact 100(1−α)% confidence interval for $E(Y_{ij})$ is

$\tilde{Y}_{ijW} \pm t(I-2;\alpha/2)\sqrt{S_A^2\left[\frac{1}{IJ} + \frac{(X_{ij}-\bar{X}_{..})^2\,\phi}{S_{xxw}}\right]}$
(4)

where $S_A^2$ is the error mean square among groups, defined as $S_A^2 = R_A/(I-2)$, and $t(\nu; \delta)$ is the t-value for $\nu$ degrees of freedom with $\delta$ area to the right. From the two independent chi-squared random variables $R_A/(J\sigma_O^2+\sigma_E^2) \sim \chi^2_{I-2}$ and $R_W/\sigma_E^2 \sim \chi^2_{IJ-I-1}$ in Section 4.4, the error mean squares and their expected mean squares are obtained as follows:

$E(S_A^2) = J\sigma_O^2 + \sigma_E^2$
(5)

$E(S_W^2) = \sigma_E^2$
(6)

where $S W 2$ is the error mean square within group and defined as $S W 2 = R W / ( I J − I − 1 ) .$ The unbiased estimators of the variances are respectively

$\hat{\sigma}_O^2 = (S_A^2 - S_W^2)/J$ and $\hat{\sigma}_E^2 = S_W^2$ from (5) and (6), and the estimator $\hat{\phi} = S_W^2/S_A^2$ is thus used to construct a confidence interval for $E(Y_{ij})$. It follows by substituting the estimator $\hat{\phi}$ for $\phi$ in interval (4) that an exact 100(1−α)% confidence interval for $E(Y_{ij})$ is

$\tilde{Y}_{ijW} \pm t(I-2;\alpha/2)\sqrt{S_A^2\left[\frac{1}{IJ} + \frac{(X_{ij}-\bar{X}_{..})^2\, S_W^2}{S_{xxw}\, S_A^2}\right]}.$
(7)
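This step is easy to make concrete. The sketch below computes the variance-component estimates from (5)-(6) and interval (7); all numeric inputs are illustrative, and the t-value is passed in by the caller rather than computed, to keep the sketch dependency-free. The ANOVA-type estimator $\hat{\sigma}_O^2 = (S_A^2 - S_W^2)/J$ is the one assumed here.

```python
import math

def wg_interval(Y_tilde, t_crit, SA2, SW2, Sxxw, dx2, I, J):
    """Interval (7): Y~_ijW +/- t(I-2; alpha/2) *
    sqrt(SA2 * [1/(IJ) + dx2 * SW2 / (Sxxw * SA2)]),
    where dx2 = (Xij - Xbar..)^2 and t_crit = t(I-2; alpha/2)."""
    half = t_crit * math.sqrt(SA2 * (1.0 / (I * J) + dx2 * SW2 / (Sxxw * SA2)))
    return Y_tilde - half, Y_tilde + half

# ANOVA-type variance component estimates from (5) and (6); note the
# sigma_O^2 estimate can be negative in samples (not truncated here)
SA2, SW2, J = 2.5, 0.8, 5
sigma_O2_hat = (SA2 - SW2) / J   # approx 0.34
sigma_E2_hat = SW2               # 0.8
phi_hat = SW2 / SA2              # 0.32

# illustrative call with I = 4, so t_crit is a t-value with I - 2 = 2 df
lo, hi = wg_interval(Y_tilde=12.0, t_crit=4.303, SA2=SA2, SW2=SW2,
                     Sxxw=10.0, dx2=4.0, I=4, J=5)
```

The interval is symmetric about $\tilde{Y}_{ijW}$ and widens as $(X_{ij}-\bar{X}_{..})^2$ grows, exactly as in an ordinary regression mean-response interval.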

In a similar manner, using the distributional property of $Y ^ i j C$ and independence of $Y ^ i j C$ and RA in Theorem 4, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

$\hat{Y}_{ijC} \pm t(I-2;\alpha/2)\sqrt{S_A^2\left[\frac{1}{IJ} + \frac{\{r^2 + (1-r^2)(S_W^2/S_A^2)\}(X_{ij}-\bar{X}_{..})^2}{S_{xxa}+S_{xxw}}\right]}.$
(8)

Using the distributional property of $Y ^ i j B$ and independence of $Y ^ i j B$ and RA in Theorem 7, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

$\hat{Y}_{ijB} \pm t(I-2;\alpha/2)\sqrt{\frac{S_A^2}{IJ} + \frac{S_A^2 S_W^2 (X_{ij}-\bar{X}_{..})^2}{S_W^2 S_{xxa} + S_A^2 S_{xxw}}}.$
(9)

On the other hand, one can construct confidence intervals for the mean response by using the independence between the OLSEs and BLUE of $E(Y_{ij})$ and the within group sum of squares RW established in the theorems of Section 4. Since $\tilde{Y}_{ijW}$ is normally distributed and $\tilde{Y}_{ijW}$ and RW are independent by Theorem 2, it follows that an exact 100(1−α)% confidence interval for $E(Y_{ij})$ is

(10)

This is referred to as the WG1 method.

Similarly, using the distributional property of $Y ^ i j C$ and independence of $Y ^ i j C$ and RW in Theorem 5, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

(11)

This is referred to as the T1 method.

Using the distributional property of $Y ^ i j B$ and independence of $Y ^ i j B$ and RW in Theorem 8, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

(12)

This is referred to as the BL1 method.

One can also construct confidence intervals for the mean response by using the independence between the OLSEs and BLUE of $E(Y_{ij})$ and the sum of squares RB established in the theorems of Section 4. Using the normality of $\tilde{Y}_{ijW}$ and the independence of $\tilde{Y}_{ijW}$ and RB in Theorem 3, it follows that an exact 100(1−α)% confidence interval for $E(Y_{ij})$ is

$\tilde{Y}_{ijW} \pm t(IJ-3;\alpha/2)\sqrt{S_B^2\left[\frac{1}{IJ} + \frac{(X_{ij}-\bar{X}_{..})^2\, S_W^2}{S_{xxw}\, S_A^2}\right]}$
(13)

where $S_B^2$ is the error mean square and defined as $S_B^2 = R_B/(IJ-3)$. This is referred to as the WG2 method.

Using the normality of $Y ^ i j C$ and independence of $Y ^ i j C$ and RB in Theorem 6, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

$\hat{Y}_{ijC} \pm t(IJ-3;\alpha/2)\sqrt{S_B^2\left[\frac{1}{IJ} + \frac{\{r^2 + (1-r^2)(S_W^2/S_A^2)\}(X_{ij}-\bar{X}_{..})^2}{S_{xxa}+S_{xxw}}\right]}.$
(14)

This is referred to as the T2 method.

Similarly, using the normality of $Y ^ i j B$ and independence of $Y ^ i j B$ and RB in Theorem 9, it follows that an exact 100(1−α )% confidence interval for E(Yij ) is

$\hat{Y}_{ijB} \pm t(IJ-3;\alpha/2)\sqrt{S_B^2\left[\frac{1}{IJ} + \frac{S_W^2(X_{ij}-\bar{X}_{..})^2}{S_W^2 S_{xxa} + S_A^2 S_{xxw}}\right]}.$
(15)

This is referred to as the BL2 method.
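To make the WG2, T2, and BL2 methods concrete, their half-widths can be computed from the mean squares alone. A sketch follows intervals (13)-(15); the caller supplies the t-value $t(IJ-3;\alpha/2)$, and the numeric inputs in any example are illustrative only.

```python
import math

def wg2_halfwidth(t, SA2, SW2, SB2, Sxxw, dx2, I, J):
    # interval (13); dx2 = (Xij - Xbar..)^2
    return t * math.sqrt(SB2 * (1.0 / (I * J) + dx2 * SW2 / (Sxxw * SA2)))

def t2_halfwidth(t, SA2, SW2, SB2, Sxxa, Sxxw, dx2, I, J):
    # interval (14); r2 = Sxxa / (Sxxa + Sxxw)
    r2 = Sxxa / (Sxxa + Sxxw)
    shrink = r2 + (1.0 - r2) * SW2 / SA2
    return t * math.sqrt(SB2 * (1.0 / (I * J) + shrink * dx2 / (Sxxa + Sxxw)))

def bl2_halfwidth(t, SA2, SW2, SB2, Sxxa, Sxxw, dx2, I, J):
    # interval (15)
    return t * math.sqrt(SB2 * (1.0 / (I * J) + SW2 * dx2 / (SW2 * Sxxa + SA2 * Sxxw)))
```

Note that the BL2 slope term has the largest denominator of the three whenever $S_{xxa} > 0$, which is consistent with the shorter BL2 intervals reported in the simulation study.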

We constructed nine exact confidence intervals for $E(Y_{ij})$ in this section. Confidence intervals (7), (8), and (9) use t-values with only I − 2 degrees of freedom, whereas confidence intervals (10) to (15) use more degrees of freedom. Thus, confidence intervals (10) to (15) produce narrower intervals than (7), (8), and (9), which is preferred in interval estimation.

6. SIMULATION STUDY

The performance of the confidence intervals proposed in Section 5 is examined by computer simulation. Twenty-five designs are formed by taking all combinations of I = 3, 5, 10, 15, 20 and J = 3, 5, 10, 15, 20. The values of $\sigma_O^2$ are selected from the set of values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) and the values of $\sigma_E^2$ are determined from $\sigma_O^2 + \sigma_E^2 = 1$ without loss of generality. Recall that the mean squares in Section 4 are chi-squared random variables. In particular, $S_A^2 \sim [(J\sigma_O^2+\sigma_E^2)/(I-2)]\chi^2_{I-2}$, $S_W^2 \sim [\sigma_E^2/(IJ-I-1)]\chi^2_{IJ-I-1}$, and $S_B^2 \sim [(J\sigma_O^2+\sigma_E^2)/(IJ-3)]\chi^2_{IJ-3}$. These mean squares are generated by the RANGAM function in SAS (Statistical Analysis System) by substituting the selected values of $\sigma_O^2$ and $\sigma_E^2$.

The values of $S_{xxa}$ are chosen from the set of values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) and the values of $S_{xxw}$ are determined from $S_{xxa} + S_{xxw} = 1$ without loss of generality. The OLSEs and BLUE of $E(Y_{ij})$ are generated by the distributional properties in Section 4. In particular, $\tilde{Y}_{ijW} \sim N(E(Y_{ij}), (J\sigma_O^2+\sigma_E^2)[1/IJ + (X_{ij}-\bar{X}_{..})^2\phi/S_{xxw}])$, $\hat{Y}_{ijC} \sim N(E(Y_{ij}), (J\sigma_O^2+\sigma_E^2)[1/IJ + (r^2 J\sigma_O^2+\sigma_E^2)(X_{ij}-\bar{X}_{..})^2/\{(J\sigma_O^2+\sigma_E^2)(S_{xxa}+S_{xxw})\}])$, and $\hat{Y}_{ijB} \sim N(E(Y_{ij}), (J\sigma_O^2+\sigma_E^2)[1/IJ + (X_{ij}-\bar{X}_{..})^2\phi/(\phi S_{xxa}+S_{xxw})])$. These estimators are generated by using the RANNOR function in SAS by substituting the specific values of $S_{xxa}$ and $S_{xxw}$. The simulated values of $S_A^2$, $S_W^2$, $S_B^2$, $\tilde{Y}_{ijW}$, $\hat{Y}_{ijC}$, and $\hat{Y}_{ijB}$ are substituted into confidence intervals (10) to (15). For each of the 25 designs with different combinations of I and J, 2000 iterations are simulated and two sided confidence intervals for $E(Y_{ij})$ are computed. Confidence coefficients are determined by counting the number of intervals that contain $E(Y_{ij})$. The average lengths of the two sided confidence intervals for $E(Y_{ij})$ are also computed.

Using the normal approximation to the binomial, if the true confidence coefficient is 0.90, there is less than a 2.5% chance that a simulated confidence coefficient based on 2000 replications will be less than 0.8868. The comparison criteria are: (i) the ability to maintain the stated confidence coefficient and (ii) the average length of two sided confidence intervals. Although narrower average interval lengths are preferable, it is necessary that an interval first maintain the stated confidence level.
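The simulation scheme just described can be sketched without SAS. The fragment below mirrors it for the WG2 method (13) using only Python's standard library: the mean squares are drawn as scaled chi-squared variates (via the gamma distribution, since $\chi^2_k$ is Gamma(k/2, scale 2), with $S_B^2$ derived from $R_B = R_A + R_W/\phi$), and the estimator is drawn from its Section 4 distribution. The t-quantile is approximated by the normal quantile, which is reasonable for the large degrees of freedom used here; all parameter values are illustrative.

```python
import math
import random
from statistics import NormalDist

random.seed(1)

I, J = 10, 10
sigma_O2, sigma_E2 = 0.3, 0.7           # variance components, summing to 1
Sxxw, dx2 = 0.5, 0.25                   # Sxxa + Sxxw = 1; dx2 = (Xij - Xbar..)^2
theta = J * sigma_O2 + sigma_E2         # J*sigma_O^2 + sigma_E^2
phi = sigma_E2 / theta
EY = 10.0                               # true mean response (illustrative)
z = NormalDist().inv_cdf(0.95)          # normal approx to t(IJ-3; 0.05); df = 97

def chi2(df):
    # chi-squared variate via the gamma distribution: chi2_k = Gamma(k/2, scale 2)
    return random.gammavariate(df / 2.0, 2.0)

cover, reps = 0, 2000
for _ in range(reps):
    SA2 = theta * chi2(I - 2) / (I - 2)
    SW2 = sigma_E2 * chi2(I * J - I - 1) / (I * J - I - 1)
    # R_B = R_A + R_W / phi, so S_B^2 follows from S_A^2 and S_W^2
    SB2 = ((I - 2) * SA2 + (I * J - I - 1) * SW2 / phi) / (I * J - 3)
    # within group estimator drawn from its Section 4 distribution
    Yw = random.gauss(EY, math.sqrt(theta * (1.0 / (I * J) + dx2 * phi / Sxxw)))
    # WG2 interval (13)
    half = z * math.sqrt(SB2 * (1.0 / (I * J) + dx2 * SW2 / (Sxxw * SA2)))
    cover += (Yw - half <= EY <= Yw + half)

print(cover / reps)   # close to the stated 0.90 level
```

Counting the fraction of intervals that cover the true $E(Y_{ij})$ is exactly how the simulated confidence coefficients in Table 2 are obtained.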

Table 2 reports the range of simulated confidence coefficients for stated 90% two sided confidence intervals on $E(Y_{ij})$ with I = 3, 10, 20 and J = 3, 10, 20 as $\sigma_O^2$ and $S_{xxa}$ range from 0.1 to 0.9, respectively. The WG1, T1, BL1, WG2, T2, and BL2 methods refer to the intervals (10), (11), (12), (13), (14), and (15), respectively. The WG2, T2, and BL2 methods generally maintain the stated confidence level of 0.9 across all combinations of I and J. However, the WG1, T1, and BL1 methods give confidence coefficients that are far below 0.8868 when I = 3. The WG1 method approximately maintains the stated confidence level when I is greater than or equal to 10, whereas the T1 method is too conservative since its simulated confidence coefficients are very close to 1.0. Thus the WG1, T1, and BL1 methods are not generally recommended, except the WG1 method with I ≥ 10. The simulated confidence coefficients that are less than 0.8868 or abnormally greater than the stated confidence level are shown in boldface in Table 2.

Table 3 reports the range of the average interval lengths for the WG1, WG2, T2, and BL2 methods with I = 3, 10, 20 and J = 3, 10, 20. The T1 and BL1 methods are excluded because they do not generally maintain the stated confidence level across all combinations of I and J in Table 2. The WG1 method is also excluded for the same reason when I = 3. The methods generally yield shorter average interval lengths as I and J increase since the degrees of freedom become large and the standard errors of the OLSEs and BLUE of $E(Y_{ij})$ become small. Although the average interval lengths vary considerably depending on the values of $\sigma_O^2$, $\sigma_E^2$, $S_{xxa}$, and $S_{xxw}$, the BL2 method generally produces shorter interval lengths than the T2 and WG2 methods in Table 3.

7. EXAMPLE APPLICATION

One of the manufacturing processes for integrated circuits in semiconductor technology is to connect individual transistors, capacitors, and other circuit elements with a conducting metallic material. These connections are typically built by first depositing a thin blanket layer of metal over the entire silicon wafer and then etching away the unnecessary portions. Czitrom and Spagon (1997) presented a data set from a designed experiment with seven factors and two blocks in 38 runs with 14 responses for this connection process. Reflectivity (%) as a response variable $Y_{ij}$ and resistivity (U-ohm cm) as a predictor variable $X_{ij}$ are selected from the data set and are shown in Table 4. Three operators (I = 3) are chosen and five measurements (J = 5) are repeatedly conducted, assuming the simple nested error regression model (1).

The data set in Table 4 is used to calculate confidence intervals for $E(Y_{ij})$. The overall means of reflectivity $\bar{Y}_{..}$ and resistivity $\bar{X}_{..}$ are computed as 12.047% and 87.333 U-ohm cm, respectively. Assume that a resistivity value $X_{ij}$ = 85 U-ohm cm is given to estimate the value of reflectivity $Y_{ij}$. In that case practitioners often consider the variability of $E(Y_{ij})$ to see if there is any change in the connection manufacturing process. We suggest confidence intervals for the mean response $E(Y_{ij})$ with a certain degree of confidence, which allow a bound on the error, rather than a single estimated value $\hat{Y}_{ij}$, when a resistivity value is given. The WG2, T2, and BL2 methods are applied when the resistivity value is 85 because they keep the stated confidence level when I = 3. The results are presented in Table 5. In order to choose an appropriate confidence interval, one generally prefers the shortest confidence interval that keeps the stated confidence level. The BL2 method yields the shortest interval length among the three methods. This result is consistent with the simulation study because Table 3 presents the same pattern of average interval lengths when I = 3.

8. CONCLUSIONS

This article presents statistical properties of the OLSEs and BLUE of the mean response $E(Y_{ij})$ and the independence between the estimators and the sums of squares that appear in a simple nested error regression model. A numerical example was given to compute confidence intervals for $E(Y_{ij})$ when a value of the predictor variable $X_{ij}$ is given. We first recommend applying the WG2, T2, and BL2 methods to compute a confidence interval for $E(Y_{ij})$, especially when I is less than 10. We then suggest choosing the shortest confidence interval.

This article extends the research of Park and Burdick (1993, 1994) and Park and Hwang (2002). It proposes several confidence intervals for the mean response in a simple nested error regression model when a measurement is linearly related to a predictor variable in manufacturing processes for gauge R & R study. We specifically present alternative confidence intervals for the mean response when an individual value of the predictor variable $X_{ij}$ is given rather than calculating the mean value of the predictor variable $\bar{X}_{i.}$. Future research includes inference concerning the expected mean response in a regression model with a two-fold nested error structure using various estimation methods.

ACKNOWLEDGMENT

The author would like to thank the anonymous referees for their valuable time and comments on an earlier version of this article. This work was supported by a Research Grant of Pukyong National University (2017 year).

Table

Table 1. A partition for source of variability of model (1)

Table 2. Range of simulated confidence coefficients for 90% two sided intervals for E(Yij)

Table 3. Range of simulated average interval lengths for 90% two sided intervals for E(Yij)

Table 4. A data set of reflectivity (Yij) and resistivity (Xij)

Table 5. 90% confidence intervals for mean response E(Yij) when Xij = 85

REFERENCES

1. M. Aitkin , N.T. Longford (1986) Statistical modeling issues in school effectiveness studies (with Discussion)., J. R. Stat. Soc. [Ser A], Vol.149 (1) ; pp.1-43
2. R.K. Burdick , C.M. Borror , D.C. Montgomery (2005) Design and Analysis of Gauge R & R Studies., SIAM,
3. V. Czitrom , P.D. Spagon (1997) Case Studies for Industrial Process Improvement, SIAM, Philadelphia, PA.
4. H. Goldstein , J. Rasbash , M. Yang , G. Woodhouse , H. Pan , D. Nuttall , S. Thomas (1993) A multilevel analysis of school examination results., Oxf. Rev. Educ., Vol.19 (4) ; pp.425-433
5. H. Goldstein (2011) Multilevel Statistical Models., John Wiley & Sons,
6. F.L. Huang (2016) Alternatives to multilevel modeling for the analysis of clustered data., J. Exp. Educ., Vol.84 (1) ; pp.175-196
7. F.L. Huang (2017) Multilevel modeling and ordinary least squares regression: How comparable are they?, J. Exp. Educ., published online Feb. 22 ; pp.265-281
8. N.T. Longford (1986) A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects., Biometrika, Vol.74 (4) ; pp.817-827
9. N.T. Longford (1995) Random Coefficient Models, in: Handbook of Statistical Modeling for the Social and Behavioral Sciences., Plenum Press ; pp.519-578
10. C.J.M. Maas , J.J. Hox (2004) The influence of violations of assumptions on multilevel parameter estimates and their standard errors., Comput. Stat. Data Anal., Vol.46 (3) ; pp.427-440
11. D.M. McNeish , L.M. Stapleton (2016) The effect of small sample size on two-level model estimates: A review and illustration., Educ. Psychol. Rev., Vol.28 (2) ; pp.295-314
12. D.J. Park , R.K. Burdick (1993) Confidence intervals on the among group variance component in a simple linear regression model with a balanced onefold nested error structure., Commun. Stat. Theory Methods, Vol.22 (12) ; pp.3435-3452
13. D.J. Park , R.K. Burdick (1994) Confidence intervals on the regression coefficient in a simple linear regression model with a balanced one-fold nested error structure., Commun. Stat. Simul. Comput., Vol.23 (1) ; pp.43-58
14. D.J. Park , H.M. Hwang (2002) Confidence intervals for the mean response in the simple linear regression model with balanced nested error structure., Commun. Stat. Theory Methods, Vol.31 (1) ; pp.107-117
15. D.J. Park (2013) Alternative confidence intervals on variance components in a simple regression model with a balanced two-fold nested error structure., Commun. Stat. Theory Methods, Vol.42 (13) ; pp.2281-2291
16. D.J. Park , M. Yoon (2016) Confidence intervals for the regression coefficient in a simple regression model with a balanced two-fold nested error structure., Commun. Stat. Theory Methods, Vol.45 (17) ; pp.5053-5065
17. S.W. Raudenbush (1995) Maximum likelihood estimation for unbalanced multilevel covariance structure models via the EM algorithm., Br. J. Math. Stat. Psychol., Vol.48 (2) ; pp.359-370
18. S.R. Searle (1987) Linear Models for Unbalanced Data., John Wiley & Sons, Inc.,