ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.18 No.2 pp.252-259
DOI : https://doi.org/10.7232/iems.2019.18.2.252

# Wavelet-Based Dimensionality Reduction for Multiple Sets of Complicated Functional Data

Young-Seon Jeong*
Department of Industrial Engineering, Chonnam National University, Gwangju, Republic of Korea
Corresponding Author, E-mail: young.jeong@jnu.ac.kr
Received January 11, 2019; Revised May 3, 2019; Accepted June 11, 2019

## ABSTRACT

Multiple sets of complicated functional data with sharp changes appear in many engineering studies, for purposes such as monitoring quality and detecting faults in manufacturing processes. Some of the data curves in these studies exhibit large variations in local regions. This paper presents a wavelet-based data reduction procedure to reduce high-dimensional functional data from manufacturing processes. The proposed method can characterize the variations of multiple curves in certain local regions. In addition, unlike existing methods, which are based on a single curve, the method can handle multiple curves together for the reduction of high-dimensional data having distinct structures. Evaluation with real-life data sets shows that the proposed procedure performs better than several techniques extended from single-curve-based data reduction methods.

## 1. INTRODUCTION

In diverse applications, the advancement of sensor technology makes it possible to collect large amounts of functional data for detecting out-of-control conditions when a process changes. Moreover, in precision manufacturing, new automatic control rules based on detection of changes in means or variances are being developed with complicated functional data. Unlike the linear functional data studied in several applications, this paper focuses on complicated functional data, which contain many sharp drops, spikes, and fluctuations, as shown in Figure 1. For example, Ko et al. (2010) and Jeong et al. (2013) used optical emission spectroscopy (OES) signals for the fault detection and classification of plasma etchers. Park et al. (2012) utilized high-dimensional near-infrared (NIR) spectral data to predict biomass chemical composition by using data mining techniques. It is very important in the bio-based industry to predict the contents of carbohydrates, lignin, and ash in biomass. However, direct approaches such as wet chemistry methods have several disadvantages that make them impractical, or even impossible, to apply to online monitoring in the industry. Therefore, in the bio-based industry, the analysis of NIR spectral data can save much time and expense, predicting chemical composition much faster and more simply than direct analysis methods. Jin and Shi (1999) used functional data to detect types of stamping faults in an automobile sheet-metal stamping process. Their tonnage signal shows that a sharp drop in a signal at a specific location indicates an “excessive-snap,” and a larger signal variation in one region indicates a “worn-gib” problem. Examples of high-dimensional functional data used in applications other than engineering studies include Morris et al.’s (2003) investigation of a biomarker in early colon carcinogenesis, and De Castro et al.’s (2005) modeling of functional sulfur dioxide samples for environmental monitoring.
In addition, in Nortel’s wireless antenna manufacturing process, the shape of the lobes shown in Figure 1 needs to follow certain structures. There are specifications for peak height, lobe size, and the difference between the peaks and valleys. See Jeong et al. (2007) for a more detailed description of the process.

However, nonlinear high-dimensional functional data have too many variables to be monitored, which weakens the power of detecting out-of-control processes. Therefore, the development of data reduction methods has long been a significant research topic; the reduced-size data should still represent the characteristics of the functional data when the process is out of control. To overcome this problem, wavelet techniques have been widely used in the literature. The popularity of wavelets in recent engineering applications stems from the availability of a fast algorithm for the discrete wavelet transform (DWT). The computational efficiency of the DWT is better than that of other transforms. For example, principal component analysis (PCA) requires solving an eigenvalue system, an expensive $O(N^3)$ operation, where N is the data size. The fast Fourier transform (FFT) requires $O(N \log N)$ operations, but a fast wavelet transform requires only $O(N)$ operations. Wavelets also support several data compression and data reduction procedures. For example, Ko et al. (2010) used engineering knowledge to segment data in order to isolate fault types in semiconductor processes. The wavelet coefficients were selected using a vertical-energy-thresholding (VET) rule. Both segmentation and thresholding reduce the amount of data for efficient further decision making. In addition, in monitoring and controlling a nano-machining process, Ganesan et al. (2003) pointed out that the speed of analyzing data and making process control decisions must be faster than the material removal speed (100-800 nanometers per minute); otherwise, useful material on an ultra-thin wafer will be removed. In that example, the DWT was used as a filter to reduce the dimension of the data and to handle complicated signals. Then, fault monitoring and variance change-point detection procedures (Das et al., 2005) were developed to meet the manufacturing requirements.
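To make the $O(N)$ claim concrete: the pyramid algorithm touches each sample a constant number of times per level, and the level sizes halve, so the total work is $N + N/2 + N/4 + \cdots < 2N$. A minimal numpy sketch for the Haar case (the function name `haar_dwt` is ours, not a library API):

```python
import numpy as np

def haar_dwt(y):
    """Full Haar DWT via the O(N) pyramid algorithm.

    Returns (approx, details), where `approx` holds the single coarsest
    scaling coefficient and `details` lists the detail-coefficient arrays
    from the coarsest to the finest level.
    """
    approx = np.asarray(y, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)         # next coarser approximation
    return approx, details[::-1]  # reorder coarsest-to-finest
```

Because the transform is orthonormal, the N output coefficients carry exactly the energy of the N input samples, which is what makes thresholding in the wavelet domain meaningful.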
In addition, wavelet-based models can effectively extract a few wavelet coefficients with small data reconstruction error to deal with irregularities such as sharp jumps and drops. Recently, several data-reduction procedures have been proposed based on wavelets, PCA, and independent component analysis (ICA). Artoni et al. (2018) developed a dimension reduction method to analyze EEG data by combining PCA with ICA.

In addition, for multiple sets of complicated functional data, Lada et al. (2002) developed a model to detect fault patterns in semiconductor manufacturing with multiple functional data collected by sensors. Paynabar and Jin (2011) developed a wavelet-based mixed-effect model to characterize within- and between-profile (nonlinear) variations; the profiles of pressing force signals obtained from a valve seat assembly procedure were used to illustrate the work. In cancer research, Morris and Carroll (2006) used Bayesian wavelet-based functional mixed models to characterize complicated nonlinear functional data.

The main goal of this paper is to employ a data reduction method to extract representative wavelet variables for multiple sets of complicated functional data. Unlike the existing literature, this paper deals with a multiple-curve wavelet thresholding and data reduction procedure. The case study shows that the proposed wavelet model can reduce data to a smaller scale than standard procedures and also demonstrates that process fault detection tools developed from the reduced-size data can efficiently identify process problems.

This paper is organized as follows: Section 2 briefly reviews the wavelet background and its properties, and Section 3 presents a data reduction model based on wavelet transforms for multiple curves. The performance comparison with real-life examples is given in Section 4, and conclusions and suggestions for possible future work are offered in Section 5.

## 2. WAVELET TRANSFORMS

Denote by $y_i = [ y_{i1}, y_{i2}, \cdots, y_{iN} ]^T$ a vector of N equally spaced data points (a curve) for data curve i obtained from a manufacturing process, where $N = 2^J$ for some positive integer J and i = 1, 2, ⋯, M for independently replicated curves. The superscript T represents the transpose operator. Let $Y = [ y_1^T, y_2^T, \cdots, y_M^T ]^T$. When a DWT W is applied to the data Y, the vector of wavelet coefficients obtained from this transformation is D = YW, where $D = [ d_1^T, d_2^T, \cdots, d_M^T ]^T$, $d_i = [ d_{i1}, d_{i2}, \cdots, d_{iN} ]^T$, $d_{im}$ is the wavelet coefficient at the m th wavelet position for the i th data curve, and $W = [ h_{ij} ]$, for i, j = 1, 2, ⋯, N, is the orthonormal N × N wavelet-transform matrix.

The original observations Y can be reconstructed using the inverse DWT. That is, through $Y = D W^T$, the original data can be expressed as a linear sum of products of wavelet coefficients ($c_{L,k}$ and $d_{j,k}$) and their corresponding wavelet-basis functions ($\phi_{L,k}(t)$ and $\psi_{j,k}(t)$) as follows:

$\tilde{f}(t) = \sum_{k=0}^{2^L - 1} c_{L,k} \, \phi_{L,k}(t) + \sum_{j=L}^{J-1} \sum_{k=0}^{2^j - 1} d_{j,k} \, \psi_{j,k}(t)$

where the finest resolution level J is greater than the coarsest level L. To keep the notation simple, all wavelet coefficients, $[ c_{L,0}, \cdots, c_{L,2^L-1}, d_{L,0}, \cdots, d_{J-1,2^{J-1}-1} ]$, are represented by a single vector $d_i = [ d_{i1}, d_{i2}, \cdots, d_{iN} ]^T$. Note that if there are N data points, there will be N wavelet coefficients. With these N wavelet coefficients, the original data curve can be “reconstructed” from the above expansion without any deviation.
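The perfect-reconstruction claim can be checked at the level of a single analysis/synthesis step: each Haar step is an orthogonal rotation of sample pairs and is inverted exactly. A minimal numpy sketch (the helper names are ours):

```python
import numpy as np

def haar_step(x):
    """One Haar analysis step: pairwise sums and differences, scaled
    so the step is orthonormal."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def inv_haar_step(approx, detail):
    """One Haar synthesis step: recover and interleave the even/odd
    samples from the approximation and detail coefficients."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x
```

Composing `haar_step` down to one coefficient and then applying `inv_haar_step` in reverse reproduces the N data points from the N coefficients, with no deviation beyond floating-point error.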

To capture the between-curve variations, this paper presents a wavelet-based local random-effect model to capture the characteristics of nonlinear functional data. To elaborate, the three subfigures in Figure 2 are generated from our model based on Haar wavelets using different sizes of supports covering local between-curve variations. In each subfigure, 20 curves of 256 data points in the time domain are generated from a wavelet random-effect model in the Haar family with variance equal to four. In Figure 2(a), one wavelet coefficient ($c_{4,7}$) at a coarser level, the fourth resolution level, is assumed to be a random effect. The support of this coefficient covers $t_{97}$ to $t_{112}$. Figure 2(b) shows a wider support area $[t_{65}, t_{96}]$ of a coarser-level wavelet coefficient ($c_{3,3}$) in the third resolution level. Figure 2(c) shows a much wider support area $[t_{65}, t_{128}]$ of a coarser wavelet coefficient ($c_{2,2}$) in the second resolution level. Section 3 shows that the proposed model allows one to zoom into a certain local area to understand why the process has more variation at that specific location and to investigate the potential causes. Thus, motivated by the data compression and data reduction literature above, our goal is to present a data reduction procedure for extracting representative variance parameters for multiple functional data.
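To illustrate how a single random-effect coefficient confines between-curve variation to its support, the following sketch generates 20 curves whose only curve-to-curve variation lies on a 16-point support, analogous to Figure 2(a). The baseline and helper function are hypothetical stand-ins for illustration, not the paper's data:

```python
import numpy as np

def haar_support_wavelet(N, start, length):
    """Unit-norm Haar-type wavelet supported on points start..start+length-1:
    +1 on the first half of the support, -1 on the second half (our own
    illustrative helper)."""
    psi = np.zeros(N)
    half = length // 2
    psi[start:start + half] = 1.0
    psi[start + half:start + length] = -1.0
    return psi / np.linalg.norm(psi)

def simulate_curves(M=20, N=256, start=96, length=16, var=4.0, seed=1):
    """M curves = shared smooth baseline + (random coefficient) * (one local
    wavelet), so between-curve variation appears only on the support."""
    rng = np.random.default_rng(seed)
    baseline = np.sin(2 * np.pi * np.arange(N) / N)
    psi = haar_support_wavelet(N, start, length)
    coeffs = rng.normal(0.0, np.sqrt(var), size=(M, 1))  # random effects
    return baseline + coeffs * psi
```

Outside the support the 20 curves coincide exactly; inside it, the spread across curves is governed by the random-effect variance, which is the behavior Figure 2 depicts.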

## 3. WAVELET-BASED DATA REDUCTION METHOD

To extract significant variables in the wavelet domain for multiple curves, we adopt the wavelet mean variance thresholding (WMVT) procedure proposed by Jeong et al. (2018). Denote by $\theta_{ij}$ the j th true wavelet coefficient for the i th curve and by $d_{ij}$ its sample version. The $d_{ij}$’s are independent and normally distributed, $d_{ij} \sim N( \theta_{ij}, \sigma^2 )$, where the $\theta_{ij}$’s and $\sigma^2$ are unknown parameters to be estimated. The $\theta_{ij}$’s are modeled as random effects, $\theta_{ij} \sim N( \theta_{\cdot j}, \tau_j^2 )$, where $\theta_{\cdot j}$ measures the average value of the wavelet coefficients in the j th position and $\tau_j^2$ is the wavelet-position-dependent variance, with the convention that $\tau_j^2 = 0$ implies a fixed-effect model with $\theta_{1j} = \ldots = \theta_{Mj} \equiv \theta_{\cdot j}$. Thus, the local random-effect model can be expressed as follows:

$D = Θ + Z$

where $D = [ d_{ij} ]$ is the $M \times N$ matrix of all DWT-transformed wavelet coefficients, $\Theta = [ \theta_1^T, \ldots, \theta_M^T ]^T$, $\theta_i = [ \theta_{i1}, \theta_{i2}, \ldots, \theta_{iN} ]^T$, $Z = [ z_1^T, \ldots, z_M^T ]^T$, and $z_i$ is a $1 \times N$ vector of random errors whose j th entry follows the normal distribution $N( 0, \sigma^2 + \tau_j^2 )$.

Assume that all coefficients are random and that the random-effect coefficients are independent (Guo, 2002; Morris and Carroll, 2006). Then $d_{ij}$ follows a normal distribution with mean $\theta_{\cdot j}$ and variance $\sigma^2 + \tau_j^2$, and the likelihood function is given as follows: $L(D) = ( 2 \pi )^{ - \frac{NM}{2} } \prod_{j=1}^{N} ( \sigma^2 + \tau_j^2 )^{ - \frac{M}{2} } \exp \left( - \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{ ( d_{ij} - \theta_{\cdot j} )^2 }{ \sigma^2 + \tau_j^2 } \right)$. Maximizing $L(D)$ is equivalent to minimizing the negative log-likelihood function:

$- 2 \ln L(D) = K + M \sum_{j=1}^{N} \ln ( \sigma^2 + \tau_j^2 ) + \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{ ( d_{ij} - \theta_{\cdot j} )^2 }{ \sigma^2 + \tau_j^2 }$

where K is a constant independent of mean and variance parameters. Thus, estimation of the mean and variance parameters can be achieved by minimizing the above negative log-likelihood, i.e.,

$M \sum_{j=1}^{N} \ln ( \sigma^2 + \tau_j^2 ) + \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{ ( d_{ij} - \theta_{\cdot j} )^2 }{ \sigma^2 + \tau_j^2 }$

Note that $\tau_j^2 \geq 0$, with $\tau_j^2 = 0$ corresponding to a fixed effect at position j. To encourage sparsity among the $\theta_{\cdot j}$’s and $\tau_j^2$’s and keep the number of retained coefficients small for the purpose of data reduction, we add two penalty terms (with parameters $\lambda_1$ and $\lambda_2$) to the negative log-likelihood function:

$M \sum_{j=1}^{N} \ln ( \sigma^2 + \tau_j^2 ) + \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{ ( d_{ij} - \theta_{\cdot j} )^2 }{ \sigma^2 + \tau_j^2 } + \lambda_1 \sum_{j=1}^{N} | \theta_{\cdot j} | + \lambda_2 \sum_{j=1}^{N} \tau_j^2$
(1)

By using an iterative procedure, the estimate of each parameter is obtained as follows (see Jeong et al., 2018 for details):

### 3.1 Initialize an Estimate of σ2 :

The initial estimate of $\sigma^2$ can be obtained from the following pooled variance. The common variance $\sigma^2$ for the M curves can be estimated by averaging the robust estimates of Donoho and Johnstone (1994) over the curves: $\hat{\sigma} = M^{-1} \sum_{i=1}^{M} 0.6745^{-1} \operatorname{median} ( | d_{im} | : N/2 + 1 \leq m \leq N )$, where the index m runs over the wavelet coefficients at the finest level.

### 3.2 Initialize an Estimate of τj ’s:

• (i) If the sample variance of the $d_{\cdot j}$’s is larger than the current estimate of σ2, estimate $τ_j^2$ by the difference between the two; that is, this position of wavelet coefficients has a random effect.

• (ii) Otherwise, estimate $τ j 2$ by zero.
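Steps 3.1 and 3.2 can be transcribed directly from the formulas above, assuming the M × N coefficient matrix stores the finest-level coefficients in positions N/2+1, …, N (the function names are ours):

```python
import numpy as np

def init_sigma(D):
    """Step 3.1: pooled robust noise estimate. For each curve, take the
    median absolute value of its finest-level coefficients divided by
    0.6745 (Donoho-Johnstone), then average over the M curves."""
    M, N = D.shape
    finest = np.abs(D[:, N // 2:])  # finest-level wavelet positions
    return np.mean(np.median(finest, axis=1) / 0.6745)

def init_tau2(D, sigma2):
    """Step 3.2: position-wise sample variance minus sigma^2, floored at
    zero (zero means no random effect at that position)."""
    return np.maximum(D.var(axis=0, ddof=1) - sigma2, 0.0)
```

The median-based estimator is robust because the finest-level coefficients of a smooth signal are dominated by noise, so a few large signal coefficients do not inflate the estimate.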

### 3.3 Update θ⋅j ’s by Minimizing Eq. (1) with Respect to θ⋅j ’s:

By minimizing the penalized log-likelihood function with respect to the θ⋅j ’s, we obtain the following closed-form solution for the estimates of the θ⋅j ’s:

$\hat{\theta}_{\cdot j} = \operatorname{sign} ( \bar{d}_{\cdot j} ) \left( | \bar{d}_{\cdot j} | - \lambda_1 ( \sigma^2 + \tau_j^2 ) / ( 2 M ) \right)_{+},$

where $( x )_{+} = \max ( x, 0 )$ and $\bar{d}_{\cdot j} = ( d_{1j} + \ldots + d_{Mj} ) / M$.
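This update is the familiar soft-thresholding rule applied to the column means of the coefficient matrix; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def update_theta(D, sigma2, tau2, lam1):
    """Step 3.3: soft-threshold the column means of the M x N coefficient
    matrix D. Positions whose mean falls below the (position-dependent)
    threshold are set exactly to zero, which is the sparsity mechanism."""
    M = D.shape[0]
    dbar = D.mean(axis=0)                         # \bar{d}_{.j}
    thresh = lam1 * (sigma2 + tau2) / (2.0 * M)   # per-position threshold
    return np.sign(dbar) * np.maximum(np.abs(dbar) - thresh, 0.0)
```

Note that the threshold grows with $\sigma^2 + \tau_j^2$: noisier or more variable positions need stronger evidence (a larger mean) to keep a non-zero coefficient.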

### 3.4 Update $τ j 2$ by Minimizing Eq. (1) with Respect to τj ’s:

Similarly, by minimizing the penalized log-likelihood function with respect to the $\tau_j^2$’s and defining $s_j^2 = \sum_{i=1}^{M} ( d_{ij} - \theta_{\cdot j} )^2 / M$, we can also obtain a closed-form solution for the estimate of $\tau_j^2$ as follows (see Appendix for its derivation):

$\hat{\tau}_j^2 = \left( \frac{ - 1 + \sqrt{ 1 + 4 s_j^2 \lambda_2 / M } }{ 2 \lambda_2 / M } - \sigma^2 \right)_{+}$
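A direct transcription of this closed form (the function name is ours); as a sanity check, when $\lambda_2 \to 0$ the first term tends to $s_j^2$, so the estimate approaches $( s_j^2 - \sigma^2 )_{+}$, matching the unpenalized initialization of Step 3.2:

```python
import numpy as np

def update_tau2(D, theta, sigma2, lam2):
    """Step 3.4: closed-form update of the random-effect variances tau_j^2
    from the within-position mean squared deviations s_j^2."""
    M = D.shape[0]
    s2 = np.mean((D - theta) ** 2, axis=0)  # s_j^2, per wavelet position
    root = (-1.0 + np.sqrt(1.0 + 4.0 * s2 * lam2 / M)) / (2.0 * lam2 / M)
    return np.maximum(root - sigma2, 0.0)   # floor at zero: fixed effect
```

The zero floor is what decides which positions carry a random effect at all: positions whose spread is explained by noise alone get $\hat{\tau}_j^2 = 0$.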

### 3.5 Update σ2 by Minimizing Eq. (1) with Respect to σ2 :

We can solve the following equation to obtain the updated estimate of σ2 :

$\sum_{j=1}^{N} \frac{ \sigma^2 + \tau_j^2 - s_j^2 }{ ( \sigma^2 + \tau_j^2 )^2 } = 0$
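This scalar equation generally has no closed form, but any one-dimensional root finder applies. A bisection sketch, under the assumption that the bracket [lo, hi] contains a sign change (the function name and bracketing choice are ours, not from the paper):

```python
import numpy as np

def update_sigma2(s2, tau2, lo=1e-10, hi=None, iters=200):
    """Step 3.5: solve sum_j (sigma^2 + tau_j^2 - s_j^2)/(sigma^2 + tau_j^2)^2 = 0
    for sigma^2 by bisection. With small tau_j^2, g is negative when sigma^2
    is small and positive once sigma^2 + tau_j^2 exceeds every s_j^2."""
    def g(sig2):
        v = sig2 + tau2
        return float(np.sum((v - s2) / v ** 2))
    if hi is None:
        hi = float(np.max(s2)) + 1.0  # g(hi) > 0 since every term is positive
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

When all $\tau_j^2 = 0$, the solution reduces to $\hat{\sigma}^2 = \bar{s^2}$, the average of the $s_j^2$'s, which is a convenient correctness check.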

## 4. COMPARISON STUDIES USING REAL-LIFE EXAMPLES

Intuitively, wavelet thresholding and data reduction procedures based on single curves (Donoho and Johnstone, 1994) can be extended to multiple curves. However, unless all of the curves are considered together, the wavelet positions selected from individual curves usually differ. This complicates the decision of which positions represent all M curves. One approach is to use the union method to include positions selected from any curve (Mörchen, 2003). However, this method can be expected to use many positions and thus does not function well for data reduction. An alternative is to take only the intersection of positions, i.e., include a position only when it is selected in all curves (Weng and Young, 2017). Although this approach might yield good data reduction, its modeling performance is often poor. Other procedures such as majority voting (e.g., if three or more coefficients from six curves are selected at one position, the position is included) can also be used. However, all of these approaches are ad hoc and lack a solid justification based on models for M curves. More importantly, unlike our wavelet local-random-effect model, these procedures do not model local variations explicitly.

In attempting to use information from all curves to decide “representative” wavelet positions, Jung et al. (2006) developed a vertical-energy-thresholding (VET) procedure. The VET considers the following energy measure based on the squared L2 norm of coefficients from all curves in one wavelet position j:

$\| d_{vj} \|^2 = d_{1j}^2 + d_{2j}^2 + \ldots + d_{Mj}^2, \quad j = 1, 2, \ldots, N$

Then, extending the data reduction idea studied in Jeong et al. (2006), the VET uses the objective function below to balance the modeling accuracy and data reduction goals and chooses λ as a threshold value for the energy measures at all positions:

$O_{VET} ( \lambda ) = \frac{ \sum_{j=1}^{N} E \left[ \| d_{vj} ( 1 - I ( \| d_{vj} \|^2 > \lambda ) ) \|^2 \right] }{ \sum_{j=1}^{N} E \left[ \| d_{vj} \|^2 \right] } + \xi \, \frac{ \sum_{j=1}^{N} E \left[ I ( \| d_{vj} \|^2 > \lambda ) \right] }{ N }$

When the energy of the j th position, $\| d_{vj} \|^2$, is larger than the threshold value, all estimates of coefficients $( \hat{\theta}_{1j}, \ldots, \hat{\theta}_{Mj} ) = ( d_{1j}, d_{2j}, \ldots, d_{Mj} )$ at that wavelet position j (for j = 1, 2, …, N) are retained. Then, the inverse DWT can be applied to the retained wavelet coefficients to reconstruct the original multiple curves. Chang and Vidakovic (2002) assumed that the true model in the wavelet domain is $d_{ij} = \theta_{\cdot j} + z_{ij}$, where the $z_{ij}$ are random variates from the $N( 0, \sigma^2 )$ distribution, and used a Bayesian formulation to develop a “Stein-type” shrinkage method called “VertiShrink” to estimate the wavelet coefficients $\theta_{\cdot j}$ by maximizing a predictive density of the wavelet coefficients across all curves at a particular position. The following block-vertical estimate was proposed:

$\hat{\theta}_{\cdot j} = \left( 1 - \frac{ M \hat{\sigma}^2 }{ \| d_{vj} \|^2 } \right)_{+} \bar{d}_{\cdot j}, \quad j = 1, 2, \cdots, N$
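As a concrete sketch, the VET energy measure and the VertiShrink estimate are both column-wise computations over the M × N coefficient matrix (a minimal numpy illustration; the function name is ours):

```python
import numpy as np

def vertishrink(D, sigma2):
    """Stein-type vertical block shrinkage (Chang and Vidakovic, 2002):
    shrink each column mean by (1 - M*sigma2 / ||d_vj||^2), floored at zero.
    Columns whose VET energy is at or below M*sigma2 are zeroed out."""
    M = D.shape[0]
    energy = np.sum(D ** 2, axis=0)  # ||d_vj||^2, the VET energy measure
    with np.errstate(divide="ignore", invalid="ignore"):
        shrink = np.where(energy > 0.0, 1.0 - M * sigma2 / energy, 0.0)
    return np.maximum(shrink, 0.0) * D.mean(axis=0)
```

Low-energy positions (noise-level across all M curves) are annihilated, while high-energy positions are only mildly shrunk toward zero, which is why VertiShrink recovers a baseline curve but, as noted above, cannot represent between-curve local variation.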

In this section, two real-life data sets are used to compare these procedures with our WMVT method. The evaluation of their performance uses the following criteria, commonly seen in the wavelet thresholding and signal compression literature. Note that there are a total of M × N data points from M curves with N wavelet positions.

• (1) K1 : number of non-zero mean wavelet coefficients;

• (2) K2 : number of positions with wavelet random effects;

• (3) K3 : number of wavelet coefficients used to reconstruct multiple curves; for example, when the VET selects KVET positions, it uses M × KVET coefficients to reconstruct the data curves;

• (4) Data reduction ratio (DRR): DRR = (K1 + K2 + 1)/(MN) for the WMVT procedure or (K3 + 1) /(MN) for others;

• (5) Relative error: RelError $= \sum_{i=1}^{M} \| y_i - \hat{f}_i \| \big/ \sum_{i=1}^{M} \| y_i \|$, which is the L2 error normalized by the signal energy;

• (6) Root mean square error (RMSE): RMSE $= \sqrt{ \frac{1}{M} \sum_{i=1}^{M} \| y_i - \hat{f}_i \|^2 }$
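Criteria (5) and (6) can be transcribed as follows, treating the M curves as the rows of an M × N array (the function names are ours):

```python
import numpy as np

def rel_error(Y, F):
    """Criterion (5): sum of per-curve L2 reconstruction errors,
    normalized by the summed per-curve signal norms."""
    Y, F = np.asarray(Y, float), np.asarray(F, float)
    return float(np.sum(np.linalg.norm(Y - F, axis=1))
                 / np.sum(np.linalg.norm(Y, axis=1)))

def rmse(Y, F):
    """Criterion (6): square root of the mean (over the M curves) of the
    squared L2 reconstruction error."""
    Y, F = np.asarray(Y, float), np.asarray(F, float)
    return float(np.sqrt(np.mean(np.sum((Y - F) ** 2, axis=1))))
```

RelError is scale-free (a value of 1 means the reconstruction is no better than predicting zero), whereas RMSE retains the units of the signal, which is why the tables report both.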

Experiment 1-Antenna Signals: The popularity of wireless communications has increased the need for high quality, technically sophisticated antennae. We collected data sets to develop procedures to monitor antenna manufacturing quality and detect process problems. Equipment used in such testing receives antenna signals at different degrees of elevation and azimuth (Jeong et al., 2007). This study focuses on the zero-azimuth cut data curves generated from 20 antenna data sets under normal conditions. The typical pattern of antenna data under normal conditions is shown in Figure 1. The antenna quality is evaluated according to various regulations regarding the signal patterns.

Figure 3 shows the reconstructed multiple curves based on different procedures. In particular, Figure 3(a) shows the reconstructed curves from the WMVT procedure with λ1 = 150 and λ2 = 600. Figure 3(b) indicates that the VertiShrink (designed for estimating the baseline curve) does not capture the local variations. Figures 3(c)-(e) show that the VET, Visu-Union, and Visu-Intersection have larger modeling errors around the center locations and fail to capture the local variations along the two sides. Table 1 summarizes comparisons of the five procedures for antenna signals. The WMVT procedure uses a total of 86 coefficients (K1 + K2 = 86 and one from σ2 ) to model the curves. Among the five procedures, all but the VertiShrink - the VET, Visu-Union, and Visu-Intersection - use a much larger number of coefficients. The VET and Visu-Intersection have larger relative errors, and the Visu-Union has a relative error similar to that of the WMVT. However, because the Visu-Union uses too many coefficients, its DRR is larger than the others. The WMVT has both the smallest DRR and the smallest RMSE.

Experiment 2 - Tonnage Signals: Tonnage signals were used for monitoring and diagnosis of a stamping process (Jin and Shi, 1999). Tonnage signals contain process information relating to the deformation stage. Figure 4(a) shows 24 sets of tonnage signals under normal working conditions; the data size of each curve is 256. Figure 4(b) shows only the center area and indicates that all tonnage signals have similar characteristics, but the center area has a larger local between-curve variation. The local variations result from randomness in the distribution of lubricants and in material uniformity.

Figure 5 shows the reconstructed multiple curves for tonnage signals from different procedures. In particular, Figure 5(a) shows the reconstructed tonnage curves from the WMVT procedure with λ1 = 400 and λ2 = 4000. The WMVT procedure uses 22 non-zero wavelet coefficients for mean modeling and three wavelet random effects for variance modeling. All other procedures except VertiShrink perform well in capturing the local variations around the center, but they use too many wavelet coefficients. Table 2 summarizes the numerical comparison results. As in the antenna example, the Visu-Union has the smallest relative error, but its DRR is the largest because it uses too many coefficients. The VET has a smaller DRR, but it is still much larger than that of the WMVT. Compared with the data reduction and relative errors for the antenna curves, the WMVT, VertiShrink, and VET procedures perform better for tonnage curves. Neither the Visu-Union nor the Visu-Intersection procedure works well with tonnage curves, even though these curves are much smoother and their local variations are smaller.

## 5. CONCLUSION AND FUTURE RESEARCH

This paper presented a wavelet-based data reduction model to reduce a large volume of functional data into a representative data set by considering the characteristics of the data. Unlike existing data reduction methods for single curves, the presented WMVT procedure considers all of the curves together when deciding which positions represent all M curves. By using the WMVT method, the original functional data can be reconstructed with limited reconstruction error. This capability is significant for dealing with the huge amounts of high-dimensional data coming from various data-collection instruments in diverse applications. Based on real-life data analyses, we found that the WMVT model adequately describes local variations and uses fewer model parameters to reduce the data dimension. Our future research will concentrate on root-cause diagnosis with the selected wavelet coefficients, where the decision rules derived from the reduced-size data are satisfactory. In addition, the proposed method can be extended to more sophisticated functional data from diverse applications.

## ACKNOWLEDGEMENTS

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2017S1A5A8018897).

## Figure

Antenna signals.

Local wavelet random-effect models based on HAAR wavelet family.

Reconstructed multiple curves for antenna signals.

Tonnage curves.

Reconstructed multiple curves for tonnage signals.

## Table

Comparison results for antenna curves

Comparison results for tonnage curves

## REFERENCES

1. Artoni, F. , Delorme, A. , and Makeig, S. (2018), Applying dimension reduction to EEG data by principal component analysis reduces the quality of its subsequent independent component decomposition, NeuroImage, 175, 176-187.
2. De Castro, B. F. , Guillas, S. , and Manteiga, W. G. (2005), Functional samples and bootstrap for predicting sulfur dioxide levels, Technometrics, 47(2), 212-222.
3. Chang, W. and Vidakovic, B. (2002), Wavelet estimation of a base-line signal from repeated measurements by vertical block shrinkage, Computational Statistics and Data Analysis, 40(2), 317-328.
4. Das, T. K. , Ganesan, R. , Sikdar, A. , and Kumar, A. (2005), Online end point detection in CMP using SPRT of wavelet decomposed sensor data, IEEE Transactions on Semiconductor Manufacturing, 18(3), 440-447.
5. Donoho, D. L. and Johnstone, I. M. (1994), Ideal spatial adaptation by wavelet shrinkage, Biometrika, 81(3), 425-455.
6. Ganesan, R. , Das, T. K. , Sikder, A. K. , and Kumar, A. (2003), Wavelet-based identification of delamination defect in CMP (Cu-low k) using nonstationary acoustic emission signal, IEEE Transactions on Semiconductor Manufacturing, 16(4), 677-685.
7. Guo, W. (2002), Functional mixed effects models, Biometrics, 58(1), 121-128.
8. Jeong, M. K. , Lu, J. C. , Huo, X. , Vidakovic, B. , and Chen, D. (2006), Wavelet-based data reduction techniques for process fault detection, Technometrics, 48(1), 26-40.
9. Jeong, M. K. , Lu, J. C. , Zhou, W. , and Ghosh S. K. (2007), Data-reduction method for spatial data using a structured wavelet model, International Journal of Production Research, 45(10), 2295-2311.
10. Jeong, Y. S. , Jeong, M. K. , Lu, J. C. , Yuan, M. , and Jin, J. (2018), Statistical process control procedures for functional data with systematic local variations, IISE Transactions, 50(4), 448-462.
11. Jeong, Y. S. , Kim, B. , and Ko, Y. D. (2013), Exponentially weighted moving average-based procedure with adaptive thresholding for monitoring nonlinear profiles: Monitoring of plasma etch process in semiconductor manufacturing, Expert Systems with Applications, 40(14), 5688-5693.
12. Jin, J. and Shi, J. (1999), Feature-preserving data compression of stamping tonnage information using wavelets, Technometrics, 41(4), 327-339.
13. Jung, U. , Jeong, M. K. , and Lu, J. C. (2006), A Vertical energy thresholding procedure for data reduction with multiple complex curves, IEEE Transactions on Systems, Man, Cybernetics, Part B, 36(5), 1128-1138.
14. Ko, Y. D. , Jeong, Y. S. , Jeong, M. K. , Garcia-Diaz, A. , and Kim, B. (2010), Functional kernel-based modeling of wavelet compressed optical emission spectral data: Prediction of plasma etch process, IEEE Sensors Journal, 10(3), 746-754.
15. Lada, E. K. , Lu, J. C. , and Wilson, J. R. (2002), A wavelet- based procedure for process fault detection, IEEE Transactions on Semiconductor Manufacturing, 15(1), 79-90.
16. Mörchen, F. (2003), Time series feature extraction for data mining using DWT and DFT, Technical Report 33, Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany.
17. Morris, J. S. and Carroll, R. J. (2006), Wavelet-based functional mixed models, Journal of the Royal Statistical Society: Series B, 68(2), 179-199.
18. Morris, J. S. , Vannucci, M. , Brown, P. J. , and Carroll, R. J. (2003), Wavelet-based nonparametric modeling of hierarchical functions in colon carcinogenesis, Journal of the American Statistical Association, 98(463), 573-583.
19. Park, J. I. , Liu, U. , Ye, X. P. , and Jeong, M. K. (2012), Improved prediction of biomass composition for switchgrass using reproducing kernel methods with wavelet compressed FT-NIR spectra, Expert Systems with Applications, 39(1), 1555-1564.
20. Paynabar, K. and Jin, J. (2011), Characterization of nonlinear profiles variations using mixed-effect models and wavelets, IIE Transactions, 43(4), 275-290.
21. Weng, J. and Young, D. S. (2017), Some dimension reduction strategies for the analysis of survey data, Journal of Big Data, 43(4), https://doi.org/10.1186/s40537-017-0103-6.