1. INTRODUCTION
A sizing system (or sizing chart) for multiple-size products (e.g., helmets and clothes) is established using anthropometric data on a design target population. The sizing system not only tells designers how many sizes, and which ones, need to be designed, but also helps users select a product of the proper size (Zhang et al., 2011; Mpampa et al., 2010; Chung et al., 2007; Gupta and Gangadhar, 2004). Therefore, the sizing system needs to anthropometrically accommodate the diverse body sizes of the design target population (Lee et al., 2015; Kwon et al., 2009; Hsu and Wang, 2005). To achieve this, existing studies have employed anthropometric data to create sizing systems. For example, Widyanti et al. (2017) and Skals et al. (2016) applied anthropometric data to construct sizing systems for garments and helmets, respectively.
A sizing system for multiple-size products can be created in two steps: (1) selection of key body dimensions and (2) formation of size categories. In the first step, a few key body dimensions (e.g., one to three) are selected from the body dimensions related to the anthropometric design of a target product (Gordon and Friedl, 1994; Hidson, 1991; Rosenblad-Wallin, 1987; Zheng et al., 2007). The key body dimensions are generally chosen by analyzing statistical relationships among the body dimensions (Jung et al., 2010). In the second step, size categories are placed over the distribution of the key dimensions to cover as much of the design target population as feasible (Jung et al., 2009). The size of each category (or size interval) is generally decided as a compromise among various aspects, including fitness level, production economy, and material properties (Kasambala et al., 2016; Hsu and Wang, 2005; McCulloch et al., 1998).
The methods of generating a sizing system can be classified into three types according to the technique used in the size category formation: (1) heuristic method, (2) cluster method, and (3) optimization method. The heuristic method forms the size categories from the distribution centroid of the key dimensions toward the distribution boundary until a designated percentage (e.g., 95%) of the target population is covered within the categories (Zheng et al., 2007; Kwon et al., 2009; Rosenblad-Wallin, 1987; Robinette and Annis, 1986). The cluster method classifies target users into groups with homogeneous body sizes in the key dimensions using a clustering technique such as K-means clustering (Laing et al., 1999). Lastly, the optimization method mathematically locates the size categories over the body size distribution of the target population in the key dimensions using an optimization algorithm such as Nelder-Mead (McCulloch et al., 1998).
The existing optimization methods have mainly focused on loss score (or fitness penalty) in finding the optimal locations of size categories. Trypos (1986) determined the optimal locations of categories using an integer programming approach that minimized loss score, i.e., the extent of discrepancy between size categories and target users in the key dimensions. Similarly, McCulloch et al. (1998) and Ashdown (1998) utilized nonlinear optimization algorithms to search for the category locations that minimized loss score in the key dimensions. More recently, a few studies (Xia and Istook, 2017; Esfandarani and Shahrabi, 2012; Ding and Xu, 2008; Gupta et al., 2006) have also employed loss score to find the optimal locations of categories.
Two additional factors (accommodation percentage and overlap area) need to be considered along with loss score in the mathematical optimization to establish a better sizing system. A good sizing system not only minimizes its loss score, but also maximizes its accommodation (coverage) percentage of the target population and minimizes the overlap area of its categories. The accommodation percentage is the percentage of the target population covered by the size categories. The overlap area is the area shared among the size categories, which indicates the inefficiency of a sizing system. Although there is room for improvement by simultaneously considering the three criteria of loss score, accommodation percentage, and overlap area in a mathematical optimization, the existing studies have not considered all three criteria at the same time.
The present study proposes a new desirability method to optimize a sizing system for multiple-size products. The proposed desirability method simultaneously optimizes loss score, accommodation percentage, and overlap area by adopting a desirability function approach. To validate the proposed desirability method, this study applied it to the sizing system design for men’s long sleeves and compared its performance with that of a representative existing optimization method in terms of loss score, accommodation percentage, and overlap area. The desirability method can be utilized to determine an optimal sizing system for better size fitness, anthropometric coverage, and sizing system efficiency.
2. PROPOSED DESIRABILITY METHOD
This study developed a desirability method to establish an optimal sizing system using anthropometric data for a design target population. The desirability method can simultaneously optimize multiple criteria (or responses) by aggregating them into an overall desirability score (D). Following previous desirability approaches (Jeong and Kim, 2009; Derringer, 1994), the overall desirability score is computed as the geometric mean of the individual desirability scores (d) on the three performance criteria (loss score, accommodation percentage, and overlap area), as shown in Equation 1. Both the overall and individual desirability scores range between 0 (completely undesirable) and 1 (completely desirable).
D = (d_{l} × d_{a} × d_{o})^{1/3}    (1)

where:

D = overall desirability score,

d_{l} = individual desirability score on loss,

d_{a} = individual desirability score on accommodation percentage, and

d_{o} = individual desirability score on overlap area.
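The aggregation described above can be illustrated with a minimal Python sketch. It assumes only what the text states, namely that D is the geometric mean of three scores that each lie in [0, 1]; the function name and the example values are hypothetical and are not taken from the study.

```python
import math


def overall_desirability(d_loss, d_accommodation, d_overlap):
    """Geometric mean of three individual desirability scores in [0, 1]."""
    scores = (d_loss, d_accommodation, d_overlap)
    if any(not 0.0 <= d <= 1.0 for d in scores):
        raise ValueError("each individual desirability score must lie in [0, 1]")
    return math.prod(scores) ** (1.0 / len(scores))


# Hypothetical scores, chosen only to illustrate the aggregation.
example = overall_desirability(0.8, 0.9, 0.7)
# A single completely undesirable criterion drives the overall score to 0.
zero_case = overall_desirability(0.0, 1.0, 1.0)
```

A property of the geometric mean worth noting is that any criterion scored 0 makes the whole sizing system completely undesirable, regardless of the other two scores.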
Desirability functions are used to convert each performance criterion into a corresponding individual desirability score. Depending on the characteristics of the performance criterion, different desirability functions need to be employed (Jeong and Kim, 2009; Candioti et al., 2014; Derringer, 1994; Derringer and Suich, 1980). If a performance criterion is to be maximized or minimized, a larger-the-best (LTB) function or a smaller-the-best (STB) function is used, respectively; if a performance criterion has a target value, a nominal-the-best (NTB) function is employed. This study quantified the individual desirability scores on loss score, accommodation percentage, and overlap area using STB, LTB, and STB functions, respectively.
The individual desirability score on loss (d_{l}) was calculated in two steps. In the first step, loss score (l) was quantified by summing the log absolute differences between users’ body sizes (b) and their closest size category (s), as shown in Equation 2. The loss score is zero when the absolute difference is less than or equal to half a size interval and increases when the absolute difference exceeds half a size interval. In the second step, the individual desirability score on loss was calculated using the STB function, as shown in Equation 3. If the loss score exceeds a user-defined maximally acceptable loss score (L_{M}), the individual desirability score becomes 0; otherwise, the individual desirability score increases linearly as the loss score decreases below the maximally acceptable loss score.
where:

l = loss score,

n = number of target users,

m = category number,

k = number of key dimensions,

b_{ij} = body size of key dimension j of user i,

s_{mj} = value of key dimension j of size category m, and

R = size interval.
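The loss computation can be sketched in Python as follows. This is only one possible reading of the verbal description: the closest category is chosen by Euclidean distance, the logarithm is taken to base 10, and the term `log10(1 + excess)` is used so that a mismatch within half a size interval contributes exactly zero. All three choices, as well as the function name and sample data, are assumptions of this sketch rather than the study's exact Equation 2.

```python
import math


def loss_score(users, categories, interval):
    """Sum of log-scaled size mismatches between users and their closest category.

    users: list of body-size tuples, one value per key dimension
    categories: list of size-category centers with the same dimensionality
    interval: size interval R; a mismatch within R/2 incurs no loss
    """
    half = interval / 2.0
    total = 0.0
    for body in users:
        # Closest category by Euclidean distance (an assumption of this sketch).
        closest = min(categories, key=lambda c: math.dist(body, c))
        for b, s in zip(body, closest):
            # Mismatch beyond the half interval covered by the category.
            excess = max(abs(b - s) - half, 0.0)
            # log10(1 + excess) is zero whenever the user fits the category.
            total += math.log10(1.0 + excess)
    return total


# Hypothetical users (sleeve length, chest girth) in cm and one 5 cm category.
users = [(80.0, 100.0), (86.5, 100.0)]
categories = [(80.0, 100.0)]
total_loss = loss_score(users, categories, interval=5.0)
```

In this toy case the first user fits exactly and the second exceeds the half interval by 4 cm in one dimension, so only the second user contributes to the loss.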
where:

d_{l} = individual desirability score on loss,

l = loss score, and

L_{M} = user-defined maximally acceptable loss score.
The individual desirability score on accommodation percentage (d_{a}) was quantified using the LTB function, as shown in Equation 4. First, the accommodation percentage (p), which represents the percentage of the target population covered by the size categories, was computed by referring to Jung et al. (2009). If the accommodation percentage reaches or exceeds a user-defined ideal percentage (P_{I}), the individual desirability score becomes 1; otherwise, the individual desirability score increases linearly until the accommodation percentage reaches the ideal percentage.
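A minimal Python sketch of a linear STB function consistent with this description is given below. It assumes that the score equals 1 at a loss of zero and falls linearly to 0 at L_{M}; the lower anchor at zero and the function name are assumptions of this sketch, not details confirmed by the text.

```python
def stb_desirability(value, max_acceptable):
    """Smaller-the-best score: 1 at value 0, falling linearly to 0 at max_acceptable."""
    if max_acceptable <= 0:
        raise ValueError("max_acceptable must be positive")
    if value >= max_acceptable:
        return 0.0
    # Negative inputs are clamped to 0 so the score never exceeds 1.
    return (max_acceptable - max(value, 0.0)) / max_acceptable


# Hypothetical loss scores evaluated against a maximally acceptable loss of 30.
d_l_mid = stb_desirability(15.0, 30.0)
d_l_over = stb_desirability(45.0, 30.0)
```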
where:

d_{a} = individual desirability score for accommodation percentage,

p = accommodation percentage, and

P_{I} = user-defined ideal accommodation percentage.
The individual desirability score for the overlap area (d_{o}) was calculated using the STB function, as shown in Equation 5. First, the overlap area among sizing grids (o), which represents the inefficiency of the sizing system, was computed. If the overlap area exceeds a user-defined maximally acceptable overlap area (O_{M}), the individual desirability score becomes 0; otherwise, the individual desirability score increases linearly as the overlap area decreases below the maximally acceptable overlap area.
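The accommodation step can be sketched in Python under explicit assumptions. Here a user counts as accommodated when every key dimension lies within half a size interval of some category center, p is expressed as a fraction (consistent with an ideal value of 1), and the LTB score rises linearly from 0 at p = 0. The coverage rule, the lower anchor, and all names are assumptions of this sketch.

```python
def accommodation_percentage(users, categories, interval):
    """Fraction of users within half an interval of some category in every dimension."""
    half = interval / 2.0
    covered = sum(
        1
        for body in users
        if any(
            all(abs(b - s) <= half for b, s in zip(body, category))
            for category in categories
        )
    )
    return covered / len(users)


def ltb_desirability(value, ideal):
    """Larger-the-best score: rises linearly from 0 and saturates at 1 once value >= ideal."""
    if ideal <= 0:
        raise ValueError("ideal must be positive")
    return min(max(value, 0.0) / ideal, 1.0)


# Hypothetical users (sleeve length, chest girth) in cm and two 5 cm categories.
users = [(80.0, 100.0), (82.0, 101.0), (90.0, 110.0), (70.0, 90.0)]
categories = [(80.0, 100.0), (90.0, 110.0)]
p = accommodation_percentage(users, categories, interval=5.0)
d_a = ltb_desirability(p, ideal=1.0)
```

Three of the four hypothetical users fall inside a category, so both p and d_a equal 0.75 when the ideal accommodation is 1.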
where:

d_{o} = individual desirability score for overlap area,

o = overlap area of a sizing system, and

O_{M} = user-defined maximally acceptable overlap area.
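One way to compute an overlap area for two key dimensions is sketched below in Python. It assumes that each size category is a square grid whose side equals the size interval and that is centered on the category values, and it sums the overlap of every pair of grids. Regions shared by three or more grids are therefore counted more than once; whether the study treats such regions this way is not stated, so this is an assumption, as are the function name and sample data.

```python
from itertools import combinations


def overlap_area(categories, interval):
    """Sum of pairwise overlap areas between square size grids of side `interval`."""
    total = 0.0
    for (x1, y1), (x2, y2) in combinations(categories, 2):
        # Overlap along each key dimension shrinks as the centers move apart.
        width = max(interval - abs(x1 - x2), 0.0)
        height = max(interval - abs(y1 - y2), 0.0)
        total += width * height
    return total


# Hypothetical category centers (sleeve length, chest girth) in cm.
categories = [(80.0, 100.0), (83.0, 100.0), (90.0, 110.0)]
o = overlap_area(categories, interval=5.0)
```

Only the first two hypothetical grids intersect (2 cm by 5 cm), so the total overlap is 10 cm^{2}.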
3. PERFORMANCE EVALUATION
3.1 Method and Materials
3.1.1 Body Dimensions and Anthropometric Data
Nine body dimensions were selected for the anthropometric design of men’s long sleeves by referring to existing studies (Beshah et al., 2014; Chun, 2012; Mpampa et al., 2009) and industrial practices (Modern Tailor, 2017; Tailor Store, 2017; Calvin Klein, 2017; Hugh and Crye, 2017; Macys, 2017; Nordstrom, 2017; Wrangler, 2017; Haggar, 2017; Russel Europe, 2017). The body dimensions consisted of three length dimensions, two breadth dimensions, and four girth dimensions. The anthropometric data for the selected body dimensions were prepared from the 1988 US Army data of 1,774 men (Gordon et al., 1988), and their summary statistics are given in Table 1.
3.1.2 Key Body Dimensions
The optimal key dimensions for the sizing system design of men’s long sleeves were selected through two separate statistical analyses (regression analysis and factor analysis) and a survey of industrial practices. In the regression analysis, the maximum average adjusted coefficient of determination (R^{2}) was analyzed by multiple regression for different numbers of key dimension candidates, as shown in Figure 1, by referring to Jung et al. (2009, 2010). The regression analysis was performed using a program coded in Matlab (MathWorks, Inc., Natick, MA, USA). The 9 anthropometric dimensions were used to generate sets of key dimension candidates consisting of 1 to 8 body dimensions. Then, multiple regression analysis was conducted using each set of key dimension candidates as regressors and the non-key anthropometric dimensions as dependent variables. For a single regressor, simple regression analysis was used, whereas stepwise regression analysis was performed for 2 to 8 regressors using p_{enter} = 0.05 and p_{remove} = 0.10 as default tolerance levels. The stepwise regression systematically adds and removes terms from a regression model according to their statistical significance, based on the p-value of an F-statistic. Lastly, the maximum of the average adjusted R^{2} values was identified for each number of key dimension candidates.
For illustration, the identification process of the maximum of the average adjusted R^{2} values for single key dimension candidates is shown in Table 2. For each individual key dimension candidate from BD1 to BD9, the adjusted R^{2} values of the regression equations were obtained for the other anthropometric dimensions and their average adjusted R^{2} value was calculated. As a result, the maximum of the adjusted R^{2} values was found to be 0.36 at BD8 for single key dimensions.
The adjusted R^{2} of the regression analyses indicates how much body size variability can be statistically explained using the selected key dimensions. This study set the number of key dimensions to two because the average adjusted R^{2} increased substantially (48%) from one dimension (0.22) to two dimensions (0.33), whereas the percentage increase dropped sharply beyond two dimensions (e.g., a 6% increment from two dimensions (0.33) to three dimensions (0.35)). Based on the average and maximum adjusted R^{2} analysis results, two anthropometric dimensions (sleeve length, BD1, and chest girth, BD8) were chosen as the key dimensions, which showed the highest adjusted R^{2} (0.43) with the rest of the body dimensions. To estimate the other anthropometric body dimensions, these key dimensions (BD1 and BD8) were used as regressors in the regression equations summarized in Table 3.
Factor analysis was intended to reduce the number of anthropometric dimensions to a smaller number of key dimensions, called factors, based on statistical relationships among the anthropometric dimensions. The appropriate number of factors is determined based on the percentage of variance explained and the eigenvalue. The factor analysis of this study selected the same set of key dimensions as the regression analysis. Factor analysis was conducted using Minitab 14.0 (Minitab Inc., USA) with principal components as the extraction method and Varimax rotation by referring to previous studies (Zheng et al., 2007; Chung et al., 2007; Bittner, 2000). The factor analysis on the nine body dimensions showed that two factors could explain 65% of the total variation, as shown in Table 4. In addition, the eigenvalues of the two factors satisfied the selection criterion (greater than 1), as shown in the scree plot in Figure 2 (Jung, 2009; Hsu, 2009).
In the factor analysis, factor loadings were calculated to determine the correlation coefficients between the two factors and the anthropometric body dimensions. Factor loadings of over 0.5 were observed to be grouped in the first and second factors (Table 4). The first factor was related to the girth and breadth dimensions, including chest girth, waist girth, neck girth, wrist girth, hip breadth, and back armpit distance, and it explained 43.3% of the total variance; the second factor was related to length dimensions, except shoulder breadth, and it accounted for 21.7% of the total variance. Accordingly, Factor 1 was called the girth and breadth factor, and Factor 2 was called the length factor.
This study selected, from each factor group, the one key dimension with the highest loading (or correlation coefficient) by referring to previous works (Liu et al., 2016; Bagherzadeh et al., 2010; Hsu, 2009; Chung et al., 2007). The selected key dimensions were sleeve length (BD1) and chest girth (BD8).
The two key dimensions selected in this study are also widely used for the sizing system design of men’s long sleeves in the apparel industry. The clothing industry uses one or both of sleeve length and chest girth as key dimension(s) to determine a sizing system for men’s long sleeves (Modern Tailor, 2017; Tailor Store, 2017; Calvin Klein, 2017; Hugh and Crye, 2017; Macys, 2017; Nordstrom, 2017; Wrangler, 2017; Haggar, 2017; Russel Europe, 2017).
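The single-candidate procedure can be sketched in Python as follows. The sketch regresses every other dimension on one candidate by simple linear regression, averages the adjusted R^{2} values, and picks the candidate with the highest average. The dimension names and measurements are hypothetical toy data, not the US Army data, and the helper names are assumptions of this sketch.

```python
def adjusted_r2_simple(x, y):
    """Adjusted R^2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r2 = (sxy * sxy) / (sxx * syy)
    # One regressor, so the residual degrees of freedom are n - 2.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


def best_single_key_dimension(data):
    """data maps a dimension name to its measurements; returns (name, average adjusted R^2)."""
    averages = {}
    for candidate, x in data.items():
        values = [adjusted_r2_simple(x, y) for name, y in data.items() if name != candidate]
        averages[candidate] = sum(values) / len(values)
    best = max(averages, key=averages.get)
    return best, averages[best]


# Toy measurements for three hypothetical dimensions.
toy_data = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 3.9, 6.2, 7.8, 10.1],
    "C": [5.0, 3.0, 6.0, 2.0, 4.0],
}
best_name, best_average = best_single_key_dimension(toy_data)
```

For sets of two or more candidates the study used stepwise regression, which this sketch does not attempt to reproduce.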
3.1.3 Performance Evaluation and Analysis Protocol
In predictive model development, existing studies suggest that it is necessary to verify that the fitted model generalizes to future data (Hawkins et al., 2003). To address this issue and to avoid any intentional evaluation bias, the US Army anthropometric data were randomly divided into learning and testing sets based on the holdout validation (or cross-evaluation) method (Jung et al., 2010). In the holdout method, data are randomly selected from the original data to form the learning set, and the remaining data are retained as the testing set. In general, one-third of the original data is used for the testing set. In this study, the learning set (n = 1,000) was selected from the US Army data and used for sizing system generation. The testing set (n = 774, about one-third of the US Army data) consisted of the users not selected for the learning set. The testing set was used to evaluate the performance of the sizing systems created using the learning set.
The performance of the proposed desirability method was evaluated for different numbers of size categories (5, 10, and 15) and compared with that of a representative existing optimization method (hereafter, reference optimization method). A sizing interval of 5 cm was applied to the selected key dimensions by referring to industry practices (Hugh and Crye, 2017). The evaluation was repeated 10 times to comprehensively validate the performance of the proposed desirability method.
An evaluation program was coded using Matlab (MathWorks, Inc., Natick, MA, USA) for efficient evaluation. The proposed desirability method was implemented with the parameters required to calculate the individual desirability scores (30 for the maximally acceptable loss score, 1 for the ideal accommodation percentage, and 300 for the maximally acceptable overlap area). These parameter values were determined as thresholds by considering the target product design, the number of size categories, the number and types of key dimensions, the sizing interval, and the accommodation target percentage of this study. The maximally acceptable loss score was set to 30 (= log(2 cm) × 1000 × 0.1) to allow a size mismatch of up to 2 cm for the users unaccommodated by the sizing system (assumed to be 10% of the 1,000 users in the learning set). The ideal accommodation percentage was set to 1, which means perfect accommodation. Lastly, the maximally acceptable overlap area was set to 300 (= 30 × 10 grids) to allow a size overlap of up to 30 cm^{2} between each sizing grid and the other grids. This implies that there may be about 3 cm^{2} of overlap between adjacent grids. For fair comparison, the parameters used in the proposed desirability method were also applied to the reference optimization method (Xia and Istook, 2017; Esfandarani and Shahrabi, 2012; Ding and Xu, 2008; Chung et al., 2007; Gupta et al., 2006; McCulloch et al., 1998; Ashdown, 1998). The program automatically generated an optimal sizing system using the learning set and quantified the three performance measures (loss score, accommodation percentage, and overlap area) of the sizing systems using the testing set.
A two-factor (generation method and category number) analysis of variance (ANOVA) was conducted on the results using Minitab 14.0 (Minitab Inc., USA) at α = 0.05. The independent variables were generation method (2 levels: desirability method and reference optimization method) and size category number (3 levels: 5, 10, and 15). The dependent variables were loss score, accommodation percentage, and overlap area. The Tukey-Kramer test and simple effects test were employed as post-hoc analyses on the significant independent variables and interactions at the same significance level.
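The holdout split can be sketched in Python as follows. The sketch shuffles user indices and cuts them into a learning set of 1,000 and a testing set of the remaining 774, matching the set sizes reported above; the seed, the function name, and the use of indices in place of actual records are assumptions of this sketch.

```python
import random


def holdout_split(n_total, n_learning, seed=None):
    """Randomly partition indices 0..n_total-1 into learning and testing sets."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    # The first n_learning shuffled indices learn; the rest are held out for testing.
    return sorted(indices[:n_learning]), sorted(indices[n_learning:])


learning, testing = holdout_split(1774, 1000, seed=0)
```

Repeating the evaluation with different seeds would give a different random partition each time.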
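The two threshold values can be checked with a short worked calculation. The logarithm is taken here to be base 10, which is an inference: a base-10 logarithm reproduces the stated value of about 30, whereas a natural logarithm would give about 69.

```latex
\begin{align*}
L_{M} &= \log_{10}(2) \times 1000 \times 0.1 \approx 0.301 \times 100 \approx 30,\\
O_{M} &= 30\ \text{cm}^{2} \times 10\ \text{grids} = 300\ \text{cm}^{2}.
\end{align*}
```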
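The degrees of freedom of this design can be worked out as a consistency check. The calculation assumes that each of the 10 repetitions contributes one observation per combination of method and category number and that the model includes the interaction term; under these assumptions the error degrees of freedom agree with the F(1, 54) and F(2, 54) statistics reported in Section 3.2.

```latex
\begin{align*}
N &= 2 \times 3 \times 10 = 60,\\
df_{\text{method}} &= 2 - 1 = 1, \qquad df_{\text{category}} = 3 - 1 = 2, \qquad
df_{\text{interaction}} = 1 \times 2 = 2,\\
df_{\text{error}} &= N - 2 \times 3 = 60 - 6 = 54.
\end{align*}
```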
3.2 Results
The proposed desirability method generated sizing systems that simultaneously optimize loss score, accommodation percentage, and overlap area. The overall distributions of the size categories generated by both methods were quite similar for the category numbers 5, 10, and 15, as shown in Table 5. However, the desirability method showed slightly better results in all performance measures except loss score, as summarized in Table 6.
Accommodation percentages were significantly different according to the generation method (F(1,54) = 35.57, p < 0.001) and category number (F(2,54) = 1037.44, p < 0.001), as shown in Figure 3. Tukey tests on the generation method showed that accommodation percentages of the proposed desirability method were about 4% greater than those of the reference optimization method. In addition, Tukey tests on the category number revealed that accommodation percentages increased dramatically as the category number increased from 5 to 15 for both methods.
Overlap areas were statistically different according to the category number (F(2,54) = 43.36, p < 0.001), as shown in Figure 5. Overlap areas for the proposed desirability method were similar to those of the reference optimization method at the smaller category numbers 5 and 10; however, for the largest category number 15, the overlap area of the proposed desirability method was 5.17 cm^{2} smaller than that of the reference optimization method.
4. DISCUSSION
The present study developed a new desirability method to optimize a sizing system using anthropometric data on a target population. Previous studies employed loss score to find the optimal locations of size categories over the body size distribution of a design target population (Esfandarani and Shahrabi, 2012; Ding and Xu, 2008; Gupta et al., 2006; Ashdown, 1998; McCulloch et al., 1998; Trypos, 1986); however, they did not simultaneously consider loss score (size fitness penalty), accommodation percentage (anthropometric coverage), and overlap area (sizing system inefficiency). The new desirability method proposed in this study can establish a sizing system that optimizes all three criteria at the same time.
The accommodation percentages of the proposed desirability method were consistently higher than those of the reference optimization method. These results indicate that the desirability method can better locate the size categories to cover a design target population. In addition, the benefit in accommodation percentage for the desirability method was maintained across the different numbers of size categories. Lastly, a significant linear increment in accommodation percentage was observed for both methods as the number of categories increased. This is expected, since more categories cover more target users.
A good sizing system should minimize loss score (Chung et al., 2007; McCulloch et al., 1998; Ashdown, 1998). The loss scores of the reference optimization method were consistently lower than those of the proposed desirability method. This is because the reference optimization method allocates size categories solely to reduce loss score. However, the differences in loss score between the two methods dropped (1.15 for 5 grids, 0.61 for 10 grids, and 0.25 for 15 grids) as the category number increased. This result implies that the advantage of the reference optimization method over the desirability method diminishes as the category number increases.
The overlap area of the proposed desirability method was 5.17 cm^{2} smaller than that of the reference optimization method for the largest category number 15; however, the overlap areas of the proposed desirability method were similar (difference = 0.02 cm^{2} – 0.7 cm^{2}) to those of the reference optimization method for fewer categories 5 and 10. For fewer categories, both methods can successfully minimize overlap area since the desirability method explicitly minimizes overlap area and the reference optimization method implicitly minimizes overlap area by scattering the categories to minimize loss score. However, for the largest category number considered in this study, the reference optimization method seems to fail to implicitly minimize overlap area while minimizing loss score.
Weights can be introduced when aggregating the three individual desirability scores into an overall desirability score. This study did not consider weights among the individual desirability scores. However, the introduction of weights can generalize the desirability method, as shown in Equation 6. For example, the weight on loss score (w_{l}) in the objective function can be increased to emphasize loss score over the other two criteria (w_{a}, w_{o}) in the optimization. The sizing system formed with a higher weight on loss score may be closer to that of the reference optimization method, which mainly considers loss score in optimization.
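where:

w_{l} = weight on loss score,

w_{a} = weight on accommodation percentage, and

w_{o} = weight on overlap area.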
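A weighted aggregation of this kind can be sketched in Python as a weighted geometric mean, in which each individual score is raised to its weight and the product is normalized by the sum of the weights. This form is an assumption based on common desirability-function practice and is not necessarily the exact form of Equation 6; the function name and example values are hypothetical.

```python
def weighted_desirability(scores, weights):
    """Weighted geometric mean of desirability scores, normalized by the weight total."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    product = 1.0
    for d, w in zip(scores, weights):
        product *= d ** w
    return product ** (1.0 / sum(weights))


# Hypothetical scores in the order (d_l, d_a, d_o); loss is the weakest criterion.
scores = (0.5, 0.9, 0.8)
equal = weighted_desirability(scores, (1.0, 1.0, 1.0))
loss_heavy = weighted_desirability(scores, (3.0, 1.0, 1.0))
```

With equal weights the function reduces to the unweighted geometric mean; raising the weight on loss pulls the overall score toward d_l, so a sizing system with a poor loss score is penalized more heavily.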
To generalize the results of this study, more comprehensive evaluations are needed in the context of various sizing system designs. For the large category number 15, the proposed desirability method outperformed the reference optimization method in accommodation percentage and overlap area, with a minor weakness in loss score. On the other hand, for the smaller category numbers 5 and 10, the desirability method showed a strength in accommodation percentage but a weakness in loss score relative to the reference optimization method. Thus, the desirability method is recommended when the number of size categories is relatively large. However, this conclusion requires more comprehensive evaluations, since the performance of a sizing system can be affected by various factors such as sizing interval and key body dimensions.
5. CONCLUSION
This study developed a new desirability method to simultaneously optimize a sizing system with respect to loss score, accommodation percentage, and overlap area using anthropometric data. The performance of the desirability method was compared with that of the reference optimization method in a case study. The case study demonstrated that the desirability method performed better than the reference optimization method when the number of size categories was relatively large. The desirability method proposed in this study can be applied to construct an optimal sizing system for multiple-size products for better size fitness, anthropometric coverage, and sizing system efficiency.