## 1. INTRODUCTION

In a measurement system, a gauge is employed to collect replicated measurements on units by several different operators, setups, or time periods. Measurements in manufacturing processes include variability due to parts, operators, and gauges. The data collected during the measuring process are used for monitoring to decrease the variability generated in production. When a measurement is linearly related to a predictor variable and a simple nested error regression model is applied to the measurement system, one might be interested in making inferences concerning the variability of the mean response in the model. The simple nested regression model considered in this article is a simple regression model with nested error structure where the response variable is linearly related to a predictor variable in manufacturing processes for a gauge R & R study. Burdick *et al*. (2005) define repeatability as the gauge variability observed when the gauge is used to measure the same unit (with the same operator or the same setup or in the same time period), and reproducibility as the variability arising from different operators, setups, or time periods. Measurement system capability studies are often referred to as gauge repeatability and reproducibility (R & R) studies. The sources of variability in a gauge R & R study are expressed as variance components, so confidence intervals for the variance components are employed to determine the adequacy of the manufacturing process in a gauge R & R study.

This article focuses on confidence intervals for the mean response in a simple nested regression model that describes a manufacturing process. This simple nested regression model includes two error terms: one is associated with the first stage sampling unit, an operator effect, and the other with the second stage sampling unit, a measurement error. The two error terms are assumed to be independent and normally distributed with zero means and constant variances. This article derives several confidence intervals for the mean response in this model.

## 2. LITERATURE REVIEW

The simple nested error regression model applied to a manufacturing process in this article can be viewed as a statistical model for clustered observations where subjects are nested within groups, as in the educational or biological sciences. Nested data structures are easily found in two forms: repeated measures within individuals or students within schools. These data structures commonly appear when multi-stage sampling procedures are applied to a target population. Data with this structure are described by nested error structure models, hierarchical linear models, or multilevel models. Observations within clusters tend to be more homogeneous than those between clusters. Applying standard ordinary least squares regression when the data are clustered therefore violates its assumptions.

A great deal of research on statistical models with hierarchically structured data has been conducted (Goldstein, 2011; Raudenbush, 1995; Longford, 1995). These researchers started with the simplest multilevel model with each group's own intercept and slope, ${Y}_{ij}={\alpha}_{j}+{\beta}_{j}{X}_{ij}+{E}_{ij}$, which is at level 1. They then introduced level 2 random variables into this model by replacing *α _{j}* by ${\beta}_{0j}={\beta}_{0}+{O}_{0j}$ and *β _{j}* by ${\beta}_{1j}={\beta}_{1}+{O}_{1j}$. The level 1 variables are observations within clusters and the level 2 variables observations between clusters. Goldstein (2011) referred to this model as a 2-level model with hierarchically structured data. This model becomes the simple nested error regression model in this article by letting ${\beta}_{0}$, ${O}_{0j}$, and ${O}_{1j}$ be *α*, *O _{i}*, and 0, respectively. We assume that the predictor variable *X _{ij}* and the regression coefficients *α* and *β* are fixed factors and *O _{i}* and *E _{ij}* are random factors where ${O}_{i}~N(0,\hspace{0.17em}{\sigma}_{O}^{2})$ and ${E}_{ij}~N(0,\hspace{0.17em}{\sigma}_{E}^{2})$ in the model.

The parameters *α* and *β* and the variance components ${\sigma}_{O}^{2}$ and ${\sigma}_{E}^{2}$ in multilevel models have been estimated by applying variance component models, random slopes models, ordinary regression models, etc. (Huang, 2016; McNeish and Stapleton, 2016; Maas and Hox, 2004; Aitkin and Longford, 1986). Researchers have also compared estimation methods such as maximum likelihood estimation, restricted maximum likelihood estimation, ordinary least squares estimation, and robust estimation, depending on various situations and assumptions of multilevel models. Although maximum likelihood estimation and ordinary least squares estimation do not generalize to all situations, ordinary least squares estimation is worth considering due to its parsimonious nature (Huang, 2017). Ordinary least squares regression that ignores the nested error structure can lead to substantially biased estimates of the regression estimators and their standard errors (Goldstein *et al*., 1993; Longford, 1986). Research on multilevel models has mostly concentrated on point estimation of the parameters, bias of standard errors, and statistical significance.

Park and Burdick (1993, 1994) provided confidence intervals for regression coefficients and variance components in the regression model with one-fold nested error structure. Park (2013) and Park and Yoon (2016) conducted further study to propose confidence intervals for regression coefficients and variance components in a regression model with two-fold nested error structure using various estimation methods. To our knowledge, inference concerning the expected mean response in the nested error regression model for a gauge R & R study has not been actively researched up until now. We therefore attempt to provide confidence intervals for the mean response in the model by extending the research of Park and Burdick (1993, 1994) and Park and Hwang (2002). We specifically aim to present alternative confidence intervals for the mean response by relaxing the condition on the mean value of the predictor variable, i.e., when an individual value of the predictor variable *X _{ij}* is given rather than the mean value of the predictor variable ${\overline{X}}_{i.}$.

## 3. MODEL STATEMENT

Consider a manufacturing process where parts are measured by operators and the response variable, a measurement, is linearly related to a predictor variable. The simple nested error regression model is then written as

${Y}_{ij}=\alpha +\beta {X}_{ij}+{O}_{i}+{E}_{ij},\hspace{1em}i=1,\hspace{0.17em}\cdots ,\hspace{0.17em}I;\hspace{0.5em}j=1,\hspace{0.17em}\cdots ,\hspace{0.17em}J\hspace{2em}(1)$

where *Y _{ij}* is the *j*th measurement of a part measured by the *i*th operator as a response variable, *α* and *β* are regression parameters, *X _{ij}* is a fixed predictor variable related to the response variable *Y _{ij}*, *O _{i}* is the *i*th randomly chosen operator effect, and *E _{ij}* is the *j*th measurement error of a part measured by the *i*th operator. The operator effect *O _{i}* is associated with the first stage sampling unit and the measurement error *E _{ij}* with the second stage sampling unit. The two error terms *O _{i}* and *E _{ij}* are jointly independent normal random variables with zero means and variances ${\sigma}_{O}^{2}$ and ${\sigma}_{E}^{2}$, respectively. That is, ${O}_{i}~N(0,\hspace{0.17em}{\sigma}_{O}^{2})$ and ${E}_{ij}~N(0,\hspace{0.17em}{\sigma}_{E}^{2})$. Since *α*, *β*, and *X _{ij}* are fixed factors and *O _{i}* and *E _{ij}* are random factors, equation (1) is a mixed model.

Model (1) is written in matrix notation as

$\mathbf{\text{y}}=\mathbf{\text{X}}\mathbf{\beta}+\mathbf{\text{Z}}\mathbf{\text{o}}+\mathbf{\text{e}}\hspace{2em}(2)$

where **y** is an *IJ* × 1 vector of measurements, **X** is an *IJ* × 2 matrix of predictor variables with a column of 1's in the first column and a column of *X _{ij}*'s in the second column, **β** is a 2 × 1 vector of parameters with *α* and *β* as elements, **Z** is an *IJ* × *I* design matrix of 0's and 1's, i.e. $\mathbf{\text{Z}}={\oplus}_{i=1}^{I}{\mathbf{1}}_{J}$ where ${\mathbf{1}}_{J}$ is a *J* × 1 vector of 1's, **o** is an *I* × 1 vector of operator effects, and **e** is an *IJ* × 1 vector of measurement errors. By the assumptions in model (1) the response variables have a multivariate normal distribution as follows:

$\mathbf{\text{y}}~N(\mathbf{\text{X}}\mathbf{\beta},\hspace{0.17em}\mathbf{\text{V}})\hspace{2em}(3)$

where $\mathbf{\text{V}}={\sigma}_{O}^{2}\mathbf{\text{Z}}{\mathbf{Z}}^{\prime}+{\sigma}_{E}^{2}\hspace{0.17em}{\mathbf{\text{D}}}_{IJ}$ and **D**_{IJ} is an *IJ* × *IJ* identity matrix.
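As a concrete illustration, model (1) can be simulated directly. The following is a minimal sketch (Python here stands in for whatever software a practitioner uses; the function name and the choice to pass the error standard deviations as arguments are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_model1(alpha, beta, X, sigma_O, sigma_E):
    """One realization of model (1): Y_ij = alpha + beta * X_ij + O_i + E_ij.

    X is an I x J array; row i holds the predictor values for operator i.
    The operator effect O_i is shared by all J measurements in row i.
    """
    I, J = X.shape
    O = rng.normal(0.0, sigma_O, size=(I, 1))  # operator effects, O_i ~ N(0, sigma_O^2)
    E = rng.normal(0.0, sigma_E, size=(I, J))  # measurement errors, E_ij ~ N(0, sigma_E^2)
    return alpha + beta * X + O + E
```

Broadcasting the *I* × 1 column of operator effects across the *J* columns mirrors the nested error structure: every measurement by operator *i* carries the same draw of *O _{i}*.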

A possible partitioning of the sources of variability for model (1) is given in Table 1. The sums of squares in Table 1 are defined as follows: ${R}_{A}=\text{}{S}_{yya}-{\widehat{\beta}}_{A}^{2}\text{}{S}_{xxa}$ and ${R}_{L}={\widehat{\beta}}_{A}^{2}\text{}{S}_{xxa}+{\widehat{\beta}}_{W}^{2}\text{}{S}_{xxw}-{R}_{1}$ where ${S}_{xxa}=J{\text{\Sigma}}_{i}{({\overline{X}}_{i.}-{\overline{X}}_{\mathrm{..}})}^{2},$
${S}_{yya}=J{\text{\Sigma}}_{i}{({\overline{Y}}_{i.}-{\overline{Y}}_{\mathrm{..}})}^{2},$
${S}_{xya}=J{\text{\Sigma}}_{i}\hspace{0.17em}({\overline{X}}_{i.}-{\overline{X}}_{\mathrm{..}})\hspace{0.17em}({\overline{Y}}_{i.}-{\overline{Y}}_{\mathrm{..}}),$
${S}_{xxw}{\text{=\Sigma}}_{i}{\text{\Sigma}}_{j}{({X}_{ij}-{\overline{X}}_{i.})}^{2},$
${S}_{yyw}={\text{\Sigma}}_{i}{\text{\Sigma}}_{j}\hspace{0.17em}{({Y}_{ij}-{\overline{Y}}_{i.})}^{2},$
${S}_{xyw}={\text{\Sigma}}_{i}{\text{\Sigma}}_{j}({X}_{ij}-{\overline{X}}_{i.})({Y}_{ij}-{\overline{Y}}_{i.}),$
${\overline{X}}_{i.}={\text{\Sigma}}_{j}\text{}{X}_{ij}/J,$
${\overline{Y}}_{i.}={\text{\Sigma}}_{j}\hspace{0.17em}{Y}_{ij}/J,$
${\overline{X}}_{\mathrm{..}}={\text{\Sigma}}_{i}{\text{\Sigma}}_{j}{X}_{ij}/IJ,$ and ${\overline{Y}}_{\mathrm{..}}={\text{\Sigma}}_{i}{\text{\Sigma}}_{j}{Y}_{ij}/IJ.$ Three estimators of the regression coefficient *β* used in Table 1 are as follows: ${\widehat{\beta}}_{C}=({S}_{xya}+{S}_{xyw})\hspace{0.17em}/\hspace{0.17em}({S}_{xxa}+{S}_{xxw}),$
${\widehat{\beta}}_{A}={S}_{xya}\hspace{0.17em}/\hspace{0.17em}{S}_{xxa},$ and ${\widehat{\beta}}_{W}={S}_{xyw}/{S}_{xxw}$.
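These definitions translate directly into code. The sketch below (helper names are mine, not the paper's) computes the sums of squares and the three slope estimators ${\widehat{\beta}}_{C}$, ${\widehat{\beta}}_{A}$, and ${\widehat{\beta}}_{W}$ from an *I* × *J* data layout:

```python
import numpy as np

def sums_of_squares(X, Y):
    """Among- and within-group sums of squares in the Table 1 notation."""
    I, J = X.shape
    Xbar_i = X.mean(axis=1, keepdims=True)  # group means X-bar_i.
    Ybar_i = Y.mean(axis=1, keepdims=True)  # group means Y-bar_i.
    Xbar, Ybar = X.mean(), Y.mean()         # overall means X-bar.., Y-bar..
    S_xxa = J * np.sum((Xbar_i - Xbar) ** 2)
    S_xya = J * np.sum((Xbar_i - Xbar) * (Ybar_i - Ybar))
    S_xxw = np.sum((X - Xbar_i) ** 2)
    S_xyw = np.sum((X - Xbar_i) * (Y - Ybar_i))
    return S_xxa, S_xya, S_xxw, S_xyw

def slope_estimators(X, Y):
    """Total, among groups, and within group slope estimators."""
    S_xxa, S_xya, S_xxw, S_xyw = sums_of_squares(X, Y)
    beta_C = (S_xya + S_xyw) / (S_xxa + S_xxw)  # total OLSE
    beta_A = S_xya / S_xxa                      # among groups OLSE
    beta_W = S_xyw / S_xxw                      # within group OLSE
    return beta_C, beta_A, beta_W
```

On noiseless data generated from a single line, all three estimators recover the same slope; they differ only when the operator effects and measurement errors pull the among- and within-group regressions apart.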

## 4. DISTRIBUTIONAL RESULTS OF ESTIMATORS OF MEAN RESPONSE AND SUMS OF SQUARES

In order to construct confidence intervals for the mean response *E*(*Y _{ij}*), the estimators of the mean response and their distributional properties are derived. Three estimators of the mean response are presented, and the independence between the estimators of the mean response and the sums of squares in Table 1 is examined.

### 4.1. Within Group Estimator of Mean Response

The mean response of the simple nested error regression model (1) is defined as $E({Y}_{ij})=\alpha +\beta {X}_{ij}$ where $E({Y}_{ij})$ denotes the expected value of the *j*th measurement of a part measured by the *i*th operator. Park and Burdick (1993) showed that the within group ordinary least squares estimators (OLSEs) ${\widehat{\alpha}}_{W}$ and ${\widehat{\beta}}_{W}$ of the parameters in model (1) are obtained from the regression of *Y _{ij}* on *X _{ij}* and the grouping variables. In matrix notation, the estimators are the first two elements of the vector ${({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}y$ where ${X}^{*}=[X\hspace{0.17em}Z]$ and ${({X}^{{*}^{\prime}}{X}^{*})}^{G}$ is a generalized inverse of ${X}^{{*}^{\prime}}{X}^{*}$. The within group slope OLSE is ${\widehat{\beta}}_{W}={S}_{xyw}/{S}_{xxw}$ and it is normally distributed, i.e. ${\widehat{\beta}}_{W}~N(\beta ,\hspace{0.17em}{\sigma}_{E}^{2}/{S}_{xxw})$. It can be shown by elementary algebra that the within group intercept OLSE is ${\tilde{\alpha}}_{W}=J{\widehat{\alpha}}_{W}/I={\overline{Y}}_{\mathrm{..}}-{\widehat{\beta}}_{W}{\overline{X}}_{\mathrm{..}}$. The within group OLSE of the mean response $E({Y}_{ij})$ is obtained by substituting the within group OLSEs ${\tilde{\alpha}}_{W}$ and ${\widehat{\beta}}_{W}$ for the parameters in model (1), i.e. ${\tilde{Y}}_{ijW}={\tilde{\alpha}}_{W}+{\widehat{\beta}}_{W}{X}_{ij}$. It can be shown from the assumptions of model (1) that the within group OLSE of $E({Y}_{ij})$ is normally distributed as ${\tilde{Y}}_{ijW}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})[1/IJ+{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}\varphi /{S}_{xxw}])$ where $\varphi ={\sigma}_{E}^{2}/(J\hspace{0.17em}{\sigma}_{O}^{2}+{\sigma}_{E}^{2})$.

### 4.2. Total Estimator of Mean Response

The total OLSEs ${\widehat{\alpha}}_{C}$ and ${\widehat{\beta}}_{C}$ of the parameters in model (1) are obtained from the regression of *Y _{ij}* on *X _{ij}*. Park and Burdick (1993) showed that the total OLSEs are given by the vector ${({X}^{\prime}\hspace{0.17em}X)}^{-1}{X}^{\prime}y$. The total slope OLSE is ${\widehat{\beta}}_{C}=({S}_{xya}+{S}_{xyw})/({S}_{xxa}+{S}_{xxw})$ and it is normally distributed, i.e. ${\widehat{\beta}}_{C}~N(\beta ,\hspace{0.17em}({r}^{2}J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})/({S}_{xxa}+{S}_{xxw}))$ where ${r}^{2}={S}_{xxa}/({S}_{xxa}+{S}_{xxw})$. The total intercept OLSE is ${\widehat{\alpha}}_{C}={\overline{Y}}_{\mathrm{..}}-{\widehat{\beta}}_{C}\hspace{0.17em}{\overline{X}}_{\mathrm{..}}$. The total OLSE of the mean response $E({Y}_{ij})$ is therefore obtained by substituting the total OLSEs ${\widehat{\alpha}}_{C}$ and ${\widehat{\beta}}_{C}$ for the parameters in model (1), i.e. ${\widehat{Y}}_{ijC}={\widehat{\alpha}}_{C}+{\widehat{\beta}}_{C}{X}_{ij}$. It can be shown from the assumptions of model (1) that the total OLSE of $E({Y}_{ij})$ is normally distributed as ${\widehat{Y}}_{ijC}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})[1/IJ+({r}^{2}J{\sigma}_{O}^{2}+{\sigma}_{E}^{2}){({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/\{(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})({S}_{xxa}+{S}_{xxw})\}])$.

### 4.3. Best Linear Estimator of Mean Response

Park and Burdick (1994) derived the among groups OLSEs ${\widehat{\alpha}}_{A}$ and ${\widehat{\beta}}_{A}$ of the parameters in model (1) by the regression of ${\overline{Y}}_{i.}$ on ${\overline{X}}_{i.}$, and the among groups OLSEs are given by the vector ${({{M}^{\prime}}_{X}\hspace{0.17em}{M}_{X})}^{-1}\hspace{0.17em}{{M}^{\prime}}_{X}\hspace{0.17em}y$ where ${M}_{X}=MX$ and $M=(1/J)Z$. They showed that the best linear unbiased estimator (BLUE) ${\widehat{\beta}}_{B}$ of the slope parameter in model (1) is a linear combination of ${\widehat{\beta}}_{A}$ and ${\widehat{\beta}}_{W}$. It can be shown by matrix algebra that the BLUEs ${\widehat{\alpha}}_{B}$ and ${\widehat{\beta}}_{B}$ of the two parameters in model (1) are calculated by the vector ${({X}^{\prime}{V}^{-1}X)}^{-1}{X}^{\prime}{V}^{-1}y$ where $V=Var(y)$ as in equation (3). The slope BLUE is ${\widehat{\beta}}_{B}=(\widehat{\varphi}{S}_{xya}+{S}_{xyw})/(\widehat{\varphi}{S}_{xxa}+{S}_{xxw})$ where $\widehat{\varphi}={\widehat{\sigma}}_{E}^{2}/(J{\widehat{\sigma}}_{O}^{2}+{\widehat{\sigma}}_{E}^{2})$ and it is normally distributed, i.e. ${\widehat{\beta}}_{B}~N(\beta ,\hspace{0.17em}{\sigma}_{E}^{2}/(\varphi {S}_{xxa}+{S}_{xxw}))$. The intercept BLUE is ${\widehat{\alpha}}_{B}={\overline{Y}}_{\mathrm{..}}-{\widehat{\beta}}_{B}{\overline{X}}_{\mathrm{..}}$. The BLUE of the mean response $E({Y}_{ij})$ is therefore obtained by substituting the BLUEs ${\widehat{\alpha}}_{B}$ and ${\widehat{\beta}}_{B}$ for the two parameters in model (1), i.e. ${\widehat{Y}}_{ijB}={\widehat{\alpha}}_{B}+{\widehat{\beta}}_{B}{X}_{ij}$. It can be shown from the assumptions of model (1) that the BLUE of $E({Y}_{ij})$ is normally distributed as ${\widehat{Y}}_{ijB}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})[1/IJ+{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}\varphi /(\varphi {S}_{xxa}+{S}_{xxw})])$.
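Setting the three point estimators side by side, a minimal sketch is given below; since the construction of $\widehat{\varphi}$ from the mean squares only appears in Section 5, it is passed in as an argument here (the function and variable names are mine, not the paper's):

```python
import numpy as np

def mean_response_estimators(X, Y, x_new, phi_hat):
    """Within group, total, and BLUE-type estimates of E(Y) at predictor x_new.

    phi_hat estimates sigma_E^2 / (J*sigma_O^2 + sigma_E^2); see Section 5.
    """
    J = X.shape[1]
    Xbar_i = X.mean(axis=1, keepdims=True)
    Ybar_i = Y.mean(axis=1, keepdims=True)
    Xbar, Ybar = X.mean(), Y.mean()
    S_xxa = J * np.sum((Xbar_i - Xbar) ** 2)
    S_xya = J * np.sum((Xbar_i - Xbar) * (Ybar_i - Ybar))
    S_xxw = np.sum((X - Xbar_i) ** 2)
    S_xyw = np.sum((X - Xbar_i) * (Y - Ybar_i))
    beta_W = S_xyw / S_xxw                                            # within group OLSE
    beta_C = (S_xya + S_xyw) / (S_xxa + S_xxw)                        # total OLSE
    beta_B = (phi_hat * S_xya + S_xyw) / (phi_hat * S_xxa + S_xxw)    # BLUE
    # each intercept has the common form Ybar.. - slope * Xbar..
    return tuple(Ybar - b * Xbar + b * x_new for b in (beta_W, beta_C, beta_B))
```

Each estimator is the fitted line evaluated at `x_new`; only the slope (and hence the intercept) differs among the three.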

### 4.4. Independence of Point Estimators and Sums of Squares

Park and Burdick (1994) showed that the sums of squares *R _{A}* and *R _{W}* are jointly independent chi-squared random variables with *I* − 2 and *IJ* − *I* − 1 degrees of freedom, i.e. ${R}_{A}/(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})~{\chi}_{I-2}^{2}$ and ${R}_{W}/{\sigma}_{E}^{2}~{\chi}_{IJ-I-1}^{2}$. Thus the sum of squares ${R}_{B}/(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})$ is a chi-squared random variable with *IJ* − 3 degrees of freedom, ${R}_{B}/(J\hspace{0.17em}{\sigma}_{O}^{2}+{\sigma}_{E}^{2})~{\chi}_{IJ-3}^{2}$ where ${R}_{B}={R}_{A}+{R}_{W}/\varphi $, since ${R}_{A}\sim (J{\sigma}_{O}^{2}+{\sigma}_{E}^{2}){\chi}_{I-2}^{2}$ and ${R}_{W}/\varphi ~(J\hspace{0.17em}{\sigma}_{O}^{2}+{\sigma}_{E}^{2}){\chi}_{IJ-I-1}^{2}$ and *R _{B}* is a sum of two independent scaled chi-squared random variables. In order to construct confidence intervals for the mean response $E({Y}_{ij})$ we need to show the independence between the estimators ${\tilde{Y}}_{ijW},\hspace{0.17em}{\widehat{Y}}_{ijC},$ and ${\widehat{Y}}_{ijB}$ and the sums of squares *R _{A}*, *R _{W}*, and *R _{B}*.

**Theorem 1.** *The within group estimator ${\tilde{Y}}_{ijW}$ and the sum of squares R _{A} are independent*.

**Proof:** In matrix notation the within group OLSE of the mean response is written ${\tilde{Y}}_{ijW}={x}_{1}{({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}y$ where ${x}_{1}=[J/I,\hspace{0.17em}{X}_{ij},\hspace{0.17em}0,\hspace{0.17em}0,\cdots ,\hspace{0.17em}0]$, i.e., **x**_{1} is a (2 + *I*) × 1 vector with *J* / *I*, *X _{ij}*, and *I* 0's as elements. The sum of squares *R _{A}* in Table 1 is written as ${R}_{A}=J{y}^{\prime}Ay$ where $A=[{D}_{I}-{M}_{X}{({{M}^{\prime}}_{X}{M}_{X})}^{-1}{{M}^{\prime}}_{X}]$ in matrix notation. The matrix definitions of ${\tilde{Y}}_{ijW}$ and *R _{A}* are now utilized to show the independence between them. It can be shown by matrix manipulation that ${x}_{1}{({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}\times ({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\hspace{0.17em}{D}_{IJ})\times JA=0$ since $Z{Z}^{\prime}A=JA$ and ${x}_{1}{({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}A=0$. Thus ${\tilde{Y}}_{ijW}$ and *R _{A}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 2.** *The within group estimator ${\tilde{Y}}_{ijW}$ and the sum of squares R _{W} are independent*.

**Proof:** The sum of squares *R _{W}* is defined as ${R}_{W}={y}^{\prime}Wy$ where $W={D}_{IJ}-{X}^{*}{({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}$. In order to show the independence between ${\tilde{Y}}_{ijW}$ and *R _{W}* the matrix definitions are used. It can be shown by matrix manipulation that ${x}_{1}{({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}\times ({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\hspace{0.17em}{D}_{IJ})\times W=0$ since $Z{Z}^{\prime}W=0$ and ${({X}^{{*}^{\prime}}{X}^{*})}^{G}{X}^{{*}^{\prime}}W=0$. Thus ${\tilde{Y}}_{ijW}$ and *R _{W}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 3.** *The within group estimator ${\tilde{Y}}_{ijW}$ and the sum of squares R _{B} are independent*.

**Proof:** The sum of squares *R _{B}* is written in matrix notation as ${R}_{B}=J\hspace{0.17em}{y}^{\prime}A\hspace{0.17em}y+(1/\varphi ){y}^{\prime}W\hspace{0.17em}y={y}^{\prime}[J\hspace{0.17em}A+(1/\varphi )W]\hspace{0.17em}y$. The within group estimator ${\tilde{Y}}_{ijW}$ is therefore independent of the sum of squares *R _{B}* by Theorems 1 and 2.

**Theorem 4.** *The total estimator ${\widehat{Y}}_{ijC}$ and the sum of squares R _{A} are independent*.

**Proof:** The total OLSE of the mean response is written ${\widehat{Y}}_{ijC}={x}_{2}{({X}^{\prime}\text{\hspace{0.05em}}X)}^{-1}{X}^{\prime}y$ where ${x}_{2}=[1\hspace{0.17em}{X}_{ij}]$ in matrix notation. It can be shown by using the matrix definitions of ${\widehat{Y}}_{ijC}$ and *R _{A}* that ${x}_{2}{({X}^{\prime}\hspace{0.17em}X)}^{-1}{X}^{\prime}\times ({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\hspace{0.17em}{D}_{IJ})\times J\hspace{0.17em}A=0$ since ${R}_{A}=J{y}^{\prime}Ay,\hspace{0.17em}Z{Z}^{\prime}A=JA,\hspace{0.17em}\text{and}\hspace{0.17em}{X}^{\prime}A=0$. Thus ${\widehat{Y}}_{ijC}$ and *R _{A}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 5.** *The total estimator ${\widehat{Y}}_{ijC}$ and the sum of squares R _{W} are independent*.

**Proof:** It can be shown by using the matrix definitions of ${\widehat{Y}}_{ijC}$ and *R _{W}* that ${x}_{2}{({X}^{\prime}\hspace{0.17em}X)}^{-1}{X}^{\prime}\times ({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\text{\hspace{0.05em}}{D}_{IJ})\times W=0$ using $Z{Z}^{\prime}W=0$ and ${X}^{\prime}W=0$. Thus ${\widehat{Y}}_{ijC}$ and *R _{W}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 6.** *The total estimator ${\widehat{Y}}_{ijC}$ and the sum of squares R _{B} are independent*.

**Proof:** It follows that the total estimator ${\widehat{Y}}_{ijC}$ and the sum of squares *R _{B}* are independent by Theorems 4 and 5.

**Theorem 7.** *The best linear estimator ${\widehat{Y}}_{ijB}$ and the sum of squares R _{A} are independent*.

**Proof:** The matrix definitions of ${\widehat{Y}}_{ijB}$ and *R _{A}* are utilized to show the independence between them. In matrix notation the BLUE of $E({Y}_{ij})$ is written as ${\widehat{Y}}_{ijB}={x}_{2}{({X}^{\prime}{V}^{-1}X)}^{-1}{X}^{\prime}{V}^{-1}y$. It can be shown that ${x}_{2}{({X}^{\prime}{V}^{-1}X)}^{-1}{X}^{\prime}{V}^{-1}\times V\times JA=0$ since $V=({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\text{\hspace{0.05em}}{D}_{IJ})$ and ${X}^{\prime}A=0$. Thus ${\widehat{Y}}_{ijB}$ and *R _{A}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 8.** *The best linear estimator ${\widehat{Y}}_{ijB}$ and the sum of squares R _{W} are independent*.

**Proof:** It can be shown by using the matrix definitions of ${\widehat{Y}}_{ijB}$ and *R _{W}* that ${x}_{2}{({X}^{\prime}{V}^{-1}X)}^{-1}{X}^{\prime}{V}^{-1}\times V\times W=0$ since $V=({\sigma}_{O}^{2}Z{Z}^{\prime}+{\sigma}_{E}^{2}\text{\hspace{0.05em}}{D}_{IJ})$ and ${X}^{\prime}W=0$. Thus ${\widehat{Y}}_{ijB}$ and *R _{W}* are independent by Theorem 7.5 of Searle (1987).

**Theorem 9.** *The best linear estimator ${\widehat{Y}}_{ijB}$ and the sum of squares R _{B} are independent*.

**Proof:** It follows that the best linear estimator ${\widehat{Y}}_{ijB}$ and the sum of squares *R _{B}* are independent by Theorems 7 and 8.

We showed the independence between the three estimators of the mean response and the three sums of squares. In summary, the theorems say that ${\tilde{Y}}_{ijW}$, *R _{A}*, *R _{W}*, and *R _{B}* are jointly independent; ${\widehat{Y}}_{ijC}$, *R _{A}*, *R _{W}*, and *R _{B}* are jointly independent; and ${\widehat{Y}}_{ijB}$, *R _{A}*, *R _{W}*, and *R _{B}* are jointly independent.

## 5. CONFIDENCE INTERVALS FOR MEAN RESPONSE

The confidence intervals for the mean response are constructed using the distributional properties of the OLSEs and BLUE of $E({Y}_{ij})$ and the independence between the estimators and the sums of squares described in the theorems of Section 4. Since ${\tilde{Y}}_{ijW}$ is normally distributed and ${\tilde{Y}}_{ijW}$ and *R _{A}* are independent by Theorem 1, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\tilde{Y}}_{ijW}\pm {t}_{(I-2;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}[1/IJ+{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}\varphi /{S}_{xxw}]}\hspace{2em}(4)$

where ${S}_{A}^{2}$ is the error mean square among groups defined as ${S}_{A}^{2}={R}_{A}/(I-2)$ and ${t}_{(\upsilon ;\delta )}$ is the *t*-value for *ν* degrees of freedom with *δ* area to the right. From the two independent chi-squared random variables ${R}_{A}/(J\hspace{0.17em}{\sigma}_{O}^{2}+{\sigma}_{E}^{2})~{\chi}_{I-2}^{2}$ and ${R}_{W}/{\sigma}_{E}^{2}~{\chi}_{IJ-I-1}^{2}$ in Section 4.4, the error mean squares and their expected mean squares are obtained as follows:

$E({S}_{A}^{2})=J{\sigma}_{O}^{2}+{\sigma}_{E}^{2}\hspace{2em}(5)$

$E({S}_{W}^{2})={\sigma}_{E}^{2}\hspace{2em}(6)$

where ${S}_{W}^{2}$ is the error mean square within groups defined as ${S}_{W}^{2}={R}_{W}/(IJ-I-1)$. The unbiased estimators of the variance components are respectively

${\widehat{\sigma}}_{O}^{2}=({S}_{A}^{2}-{S}_{W}^{2})/J$

and ${\widehat{\sigma}}_{E}^{2}={S}_{W}^{2}$ from (5) and (6), and the estimator $\widehat{\varphi}={S}_{W}^{2}/{S}_{A}^{2}$ is thus used to construct a confidence interval for $E({Y}_{ij})$. It follows by substituting the estimator $\widehat{\varphi}$ for $\varphi $ in interval (4) that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\tilde{Y}}_{ijW}\pm {t}_{(I-2;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+{S}_{W}^{2}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/{S}_{xxw}}\hspace{2em}(7)$
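The mean squares and the estimators built from them in (5) and (6) are simple arithmetic; a sketch (a hypothetical helper, with the caveat that $({S}_{A}^{2}-{S}_{W}^{2})/J$ can be negative in a given sample) is:

```python
def variance_component_estimates(R_A, R_W, I, J):
    """Mean squares and variance component estimators of Section 5."""
    S_A2 = R_A / (I - 2)              # error mean square among groups
    S_W2 = R_W / (I * J - I - 1)      # error mean square within groups
    sigma_O2_hat = (S_A2 - S_W2) / J  # unbiased for sigma_O^2 (may be negative)
    sigma_E2_hat = S_W2               # unbiased for sigma_E^2
    phi_hat = S_W2 / S_A2             # estimator of phi
    return S_A2, S_W2, sigma_O2_hat, sigma_E2_hat, phi_hat
```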

In a similar manner, using the distributional property of ${\widehat{Y}}_{ijC}$ and the independence of ${\widehat{Y}}_{ijC}$ and *R _{A}* in Theorem 4, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijC}\pm {t}_{(I-2;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+[{r}^{2}{S}_{A}^{2}+(1-{r}^{2}){S}_{W}^{2}]{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/({S}_{xxa}+{S}_{xxw})}\hspace{2em}(8)$

Using the distributional property of ${\widehat{Y}}_{ijB}$ and the independence of ${\widehat{Y}}_{ijB}$ and *R _{A}* in Theorem 7, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijB}\pm {t}_{(I-2;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+{S}_{W}^{2}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/(\widehat{\varphi}{S}_{xxa}+{S}_{xxw})}\hspace{2em}(9)$

On the other hand, one can construct confidence intervals for the mean response by using the independence between the OLSEs and BLUE of $E({Y}_{ij})$ and the sum of squares within groups *R _{W}* described in the theorems of Section 4. Since ${\tilde{Y}}_{ijW}$ is normally distributed and ${\tilde{Y}}_{ijW}$ and *R _{W}* are independent by Theorem 2, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\tilde{Y}}_{ijW}\pm {t}_{(IJ-I-1;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+{S}_{W}^{2}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/{S}_{xxw}}\hspace{2em}(10)$

This is referred to as the WG1 method.

Similarly, using the distributional property of ${\widehat{Y}}_{ijC}$ and the independence of ${\widehat{Y}}_{ijC}$ and *R _{W}* in Theorem 5, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijC}\pm {t}_{(IJ-I-1;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+[{r}^{2}{S}_{A}^{2}+(1-{r}^{2}){S}_{W}^{2}]{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/({S}_{xxa}+{S}_{xxw})}\hspace{2em}(11)$

This is referred to as the T1 method.

Using the distributional property of ${\widehat{Y}}_{ijB}$ and the independence of ${\widehat{Y}}_{ijB}$ and *R _{W}* in Theorem 8, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijB}\pm {t}_{(IJ-I-1;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{A}^{2}/IJ+{S}_{W}^{2}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/(\widehat{\varphi}{S}_{xxa}+{S}_{xxw})}\hspace{2em}(12)$

This is referred to as the BL1 method.

One can also construct confidence intervals for the mean response by using the independence between the OLSEs and BLUE of $E({Y}_{ij})$ and the sum of squares *R _{B}* described in the theorems of Section 4. Using the normality of ${\tilde{Y}}_{ijW}$ and the independence of ${\tilde{Y}}_{ijW}$ and *R _{B}* in Theorem 3, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\tilde{Y}}_{ijW}\pm {t}_{(IJ-3;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{B}^{2}[1/IJ+\widehat{\varphi}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/{S}_{xxw}]}\hspace{2em}(13)$

where ${S}_{B}^{2}$ is the error mean square defined as ${S}_{B}^{2}={R}_{B}/(IJ-3)$. This is referred to as the WG2 method.

Using the normality of ${\widehat{Y}}_{ijC}$ and the independence of ${\widehat{Y}}_{ijC}$ and *R _{B}* in Theorem 6, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijC}\pm {t}_{(IJ-3;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{B}^{2}[1/IJ+({r}^{2}+(1-{r}^{2})\widehat{\varphi}){({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/({S}_{xxa}+{S}_{xxw})]}\hspace{2em}(14)$

This is referred to as the T2 method.

Similarly, using the normality of ${\widehat{Y}}_{ijB}$ and the independence of ${\widehat{Y}}_{ijB}$ and *R _{B}* in Theorem 9, it follows that an exact 100(1 − *α*)% confidence interval for $E({Y}_{ij})$ is

${\widehat{Y}}_{ijB}\pm {t}_{(IJ-3;\hspace{0.17em}\alpha /2)}\sqrt{{S}_{B}^{2}[1/IJ+\widehat{\varphi}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/(\widehat{\varphi}{S}_{xxa}+{S}_{xxw})]}\hspace{2em}(15)$

This is referred to as the BL2 method.

We constructed nine exact confidence intervals for $E({Y}_{ij})$ in this section. Confidence intervals (7), (8), and (9) use *t*-values with only *I* − 2 degrees of freedom, whereas confidence intervals (10) to (15) use more degrees of freedom. Thus, confidence intervals (10) to (15) produce narrower interval lengths than (7), (8), and (9), which makes them preferable in interval estimation.

## 6. SIMULATION STUDY

The performance of the confidence intervals proposed in Section 5 is examined by computer simulation. Twenty-five designs are formed by taking all combinations of *I* = 3, 5, 10, 15, 20 and *J* = 3, 5, 10, 15, 20. The values of ${\sigma}_{O}^{2}$ are selected from the set of values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) and the values of ${\sigma}_{E}^{2}$ are determined from ${\sigma}_{O}^{2}+{\sigma}_{E}^{2}=1$ without loss of generality. Recall that the mean squares in Section 4 are chi-squared random variables. In particular, ${S}_{A}^{2}~[(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})/(I-2)]{\chi}_{(I-2)}^{2},$
${S}_{W}^{2}~[{\sigma}_{E}^{2}/(IJ-I-1)]\hspace{0.17em}{\chi}_{(IJ-I-1)}^{2},$ and ${S}_{B}^{2}~[(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})/(IJ-3)]\hspace{0.17em}{\chi}_{(IJ-3)}^{2}.$ These mean squares are generated by the RANGAM function in SAS (Statistical Analysis System) by substituting the selected values of ${\sigma}_{O}^{2}$ and ${\sigma}_{E}^{2}$, respectively.

The values of *S _{xxa}* are chosen from the set of values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) and the values of *S _{xxw}* are determined from ${S}_{xxa}+{S}_{xxw}=1$ without loss of generality. The OLSEs and BLUE of $E({Y}_{ij})$ are generated by the distributional properties in Section 4. In particular, ${\tilde{Y}}_{ijW}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})[1/IJ+{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}\varphi /{S}_{xxw}]),$ ${\widehat{Y}}_{ijC}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})[1/IJ+({r}^{2}J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})\hspace{0.17em}{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}/\{(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})\hspace{0.17em}({S}_{xxa}+{S}_{xxw})\}]),$ and ${\widehat{Y}}_{ijB}~N(E({Y}_{ij}),\hspace{0.17em}(J{\sigma}_{O}^{2}+{\sigma}_{E}^{2})\hspace{0.17em}[1/IJ+{({X}_{ij}-{\overline{X}}_{\mathrm{..}})}^{2}\varphi /(\varphi {S}_{xxa}+{S}_{xxw})]).$ These estimators are generated by the RANNOR function in SAS by substituting the selected values of *S _{xxa}* and *S _{xxw}*. The simulated values of ${S}_{A}^{2},\hspace{0.17em}{S}_{W}^{2},\hspace{0.17em}{S}_{B}^{2},\hspace{0.17em}{\tilde{Y}}_{ijW},\hspace{0.17em}{\widehat{Y}}_{ijC},\hspace{0.17em}\text{and}\hspace{0.17em}{\widehat{Y}}_{ijB}$ are substituted into confidence intervals (10) to (15). For each of the 25 designs with different combinations of *I* and *J*, 2000 iterations are simulated and two-sided confidence intervals for $E({Y}_{ij})$ are computed. Confidence coefficients are determined by counting the number of intervals that contain $E({Y}_{ij})$. The average lengths of the two-sided confidence intervals for $E({Y}_{ij})$ are also computed.

_{ij}Using the normal approximation to the binomial, if the true confidence coefficient is 0.90, there is less than a 2.5% chance that a simulated confidence coefficient based on 2000 replications will be less than 0.8868. The comparison criteria are: (i) the ability to maintain the stated confidence coefficient and (ii) the average length of two sided confidence intervals. Although narrower average interval lengths are preferable, it is necessary that an interval first maintain the stated confidence level.

Table 2 reports the range of simulated confidence coefficients for stated 90% two-sided confidence intervals on $E({Y}_{ij})$ with *I* = 3, 10, 20 and *J* = 3, 10, 20 as ${\sigma}_{O}^{2}$ and *S _{xxa}* range from 0.1 to 0.9, respectively. The WG1, T1, BL1, WG2, T2, and BL2 methods refer to the intervals (10), (11), (12), (13), (14), and (15), respectively. The WG2, T2, and BL2 methods generally maintain the stated confidence level 0.9 across all combinations of *I* and *J*. However, the WG1, T1, and BL1 methods give confidence coefficients that are much below 0.8868 when *I* = 3. The WG1 method marginally maintains the stated confidence level when *I* becomes greater than or equal to 10, whereas the T1 method is too conservative since its simulated confidence coefficients are very close to 1.0. Thus the WG1, T1, and BL1 methods are not generally recommended, except the WG1 method with *I* ≥ 10. The simulated confidence coefficients that are less than 0.8868 or abnormally greater than the stated confidence level are shown in boldface in Table 2.

Table 3 reports the range of the average interval lengths for the WG1, WG2, T2, and BL2 methods with *I* = 3, 10, 20 and *J* = 3, 10, 20. The T1 and BL1 methods are excluded because they do not generally maintain the simulated confidence level across all combinations of *I* and *J* in Table 2. The WG1 method is also excluded for the same reason when *I* = 3. The methods generally yield shorter average interval lengths as *I* and *J* increase since the degrees of freedom become large and the standard errors of the OLSEs and BLUE of $E({Y}_{ij})$ become small. Although the average interval lengths vary considerably depending on the values of ${\sigma}_{O}^{2},\hspace{0.17em}{\sigma}_{E}^{2},\hspace{0.17em}{S}_{xxa},\hspace{0.17em}\text{and}\hspace{0.17em}{S}_{xxw}$, the BL2 method generally produces shorter interval lengths than the T2 and WG2 methods in Table 3.

## 7. EXAMPLE APPLICATION

One of the manufacturing processes for integrated circuits in semiconductor technology is to connect individual transistors, capacitors, and other circuit elements with a conducting metallic material. These connections are typically built by first depositing a thin blanket layer of metal over the entire silicon wafer and then etching away the unnecessary portions. Czitrom and Spagon (1997) presented a data set of a designed experiment in seven factors and two blocks in 38 runs with 14 responses for this connection process. Reflectivity (%) as a response variable *Y _{ij}* and resistivity (μ-ohm cm) as a predictor variable *X _{ij}* are selected from the data set and are shown in Table 4. Three operators (*I* = 3) are chosen and five measurements (*J* = 5) are repeatedly conducted assuming the simple nested error regression model (1).

The data set in Table 4 is used to calculate confidence intervals for $E({Y}_{ij})$. The overall means of reflectivity ${\overline{Y}}_{\mathrm{..}}$ and resistivity ${\overline{X}}_{\mathrm{..}}$ are computed to be 12.047% and 87.333 μ-ohm cm, respectively. Assume that a resistivity value ${X}_{ij}=85$ μ-ohm cm is given to estimate the value of reflectivity *Y _{ij}*. In that case practitioners often consider the variability of $E({Y}_{ij})$ to see if there is any change in the connection manufacturing process. We suggest confidence intervals for the mean response $E({Y}_{ij})$ with a certain degree of confidence, which allow for a bound on the error, rather than a single estimated value ${\widehat{Y}}_{ij}$ when a resistivity value is given. The WG2, T2, and BL2 methods are suggested when the resistivity value is 85 because they keep the stated confidence level when *I* = 3. The results are presented in Table 5. In order to choose an appropriate confidence interval, one generally prefers the shortest confidence interval that keeps the stated confidence level. The BL2 method yields the shortest interval length among the three methods. This result is consistent with the simulation study because Table 3 presents the same pattern of average interval lengths when *I* = 3.

## 8. CONCLUSIONS

This article presents statistical properties of the OLSEs and BLUE of the mean response $E({Y}_{ij})$ and the independence of the estimators and the sums of squares appearing in a simple nested error regression model. A numerical example was illustrated to compute confidence intervals for $E({Y}_{ij})$ when a value of the predictor variable *X _{ij}* is given. We first recommend applying the WG2, T2, and BL2 methods to compute a confidence interval for $E({Y}_{ij})$ when *I* is less than 10. We then suggest choosing the shortest confidence interval.

This article extends the research of Park and Burdick (1993, 1994) and Park and Hwang (2002). It proposes several confidence intervals for the mean response in a simple nested error regression model when a measurement is linearly related to a predictor variable in manufacturing processes for a gauge R & R study. We specifically present alternative confidence intervals for the mean response when an individual value of the predictor variable *X _{ij}* is given rather than the mean value of the predictor variable ${\overline{X}}_{i.}$. Future research includes inference concerning the expected mean response in a regression model with two-fold nested error structure using various estimation methods.