ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.20 No.1 pp.35-47
DOI : https://doi.org/10.7232/iems.2021.20.1.35

# A Latent Class Analysis for Item Demand Based on Temperature Difference and Store Characteristics

Yuto Seko*, Ryotaro Shimizu, Gendo Kumoi, Tomohiro Yoshikai, Masayuki Goto
Graduate School of Creative Science and Engineering, Waseda University, Tokyo, Japan
Japan Weather Association, Tokyo, Japan
Department of Industrial and Management Systems Engineering, Waseda University, Tokyo, Japan
*Corresponding Author, E-mail: yutoseko@ruri.waseda.jp
March 17, 2020 September 21, 2020 January 30, 2021

## ABSTRACT

In retail stores, there is an increasing need for predicting item demand using accumulated purchase history data to cope with the fluctuating consumer demands. These fluctuations in item demand are influenced by external factors and consumer preferences. Among these, store characteristics and weather conditions, which are closely related to consumer behavior, have strong effects on item demand. For this reason, it is very important to quantitatively grasp demand fluctuations of items that are influenced by changes in weather conditions for each store by using an integrated analysis of the purchase history data of many stores and weather conditions. In this research, we focus on the temperature difference, which is the average temperature difference from the previous day, as a weather condition affecting item sales. Because consumer feeling about a temperature is dependent on the temperature difference from the previous day, it is meaningful to construct a prediction model using this information. In this research, we propose a latent class model to express the relationship between weather conditions, store characteristics, and item demand fluctuation. Also, through an analysis experiment using an actual data set, we show the usefulness of the proposed model by extracting items that are influenced by weather conditions.

## 1. INTRODUCTION

In retail stores such as supermarkets, predicting fluctuations in consumer demand is an important task for coping with changes in daily life. In recent years, as large amounts of purchase history data have been accumulated, there has been a growing need to predict daily changes in demand and to acquire information that leads to marketing strategies (Sagawa and Hirooka, 2003; Tsukasa et al., 2011a; Abe and Kondo, 2005). Conventionally, questionnaire surveys have often been used for analysis in the marketing field (for example, see Mohammadi and Sohrabi, 2018; Jermsittiparsert et al., 2019; Normalini et al., 2019). Although questionnaire surveys remain useful, accumulated purchase history data has also become an important information source for marketing analysis. For example, by analyzing POS (point of sales) data, which records when, where, which items, and how many items are sold, it is possible to predict consumer demand, leading to efficient strategies (Nakayama, 2003; Goto et al., 2012).

However, such demand fluctuation is thought to be affected by changes in consumer preferences and external factors. In particular, store characteristics and weather conditions, which are closely related to consumer behavior, are thought to have strong effects on item demands. For this reason, it is very important to quantitatively grasp demand fluctuations for items that are influenced by changes in weather conditions (what we call “weather sensitive items”) for each store using an integrated analysis of purchase history data from many stores and weather conditions. However, in the case of a retailer chain that has many stores, construction of prediction models for every store is problematic in that it leads to very complicated models, and there is a possibility that there will not be a sufficient amount of data to ensure the creation of each model.

On the other hand, the latent class model (Green et al., 1976; Hoffman and Puzicha, 1999; Swait and Adamowicz, 2001; Bhatnagar and Ghose, 2004; Bishop, 2006; Goto and Kobayashi, 2014; Goto et al., 2015) is well known as an effective model for analyzing marketing data, which are a collection of records with different characteristics. In the latent class model, latent variables are assumed between observed variables, which makes it possible to model the relations within heterogeneous data. When the latent class model is applied to purchase history data with latent variables assumed between users and items, it is possible to model heterogeneous groups of users and items. In the marketing field, it is usually reasonable to assume that the market consists of several user segments with similar preferences and several item groups with similar characteristics. Therefore, the latent class model is well suited to market analysis. It also makes it possible to understand a huge number of users and items in terms of latent variables (Hoffman and Puzicha, 1999; Ishigaki, 2011). Since the influence of fluctuations in weather conditions on item demand can differ depending on the characteristics of stores, such as regional characteristics and the main types of customers, an analytical model that takes this issue into account is necessary.

In this research, we propose a latent class model to express the relationship between weather conditions, store characteristics, and item demand fluctuation. This makes it possible to analyze the co-occurrence relationships of these variables. In addition, through an analysis experiment using an actual data set, we show the usefulness of the proposed model by extracting weather sensitive items. As a result, it is possible to quantitatively understand fluctuations in demand, which will help marketing strategies, including item management and inventory control, in real stores.

## 2. PREPARATION

In this section, we discuss weather conditions, store characteristics, item categories, and the analysis period as a preparation for modelling.

### 2.1 Weather Conditions

In this research, we focus on temperature difference, which is the difference in average temperature from the previous day, as a weather condition affecting item sales. Even if the average temperature on one day is the same as another day, consumer feelings may differ from one day to another. This is because their feelings about the weather depend on its difference from the previous day. Consumers tend to feel “cold” if it was warm on the previous day; conversely, they feel “warm” if it was cold. As mentioned above, an analysis focusing only on average temperature cannot account for the effects of temperature change.

Therefore, temperature difference can be a useful index for a marketing strategy. This is because the reference point, the average temperature on the previous day, is an observed variable, so we can focus on the rise or fall from that observed temperature. In fact, temperature difference is a weather condition that can be easily analyzed in real stores.

### 2.2 Store Characteristics

In order to extract store characteristics, we make use of the customer purchase data for each store. Normally, because the characteristics of the main customers differ between stores and depend on location conditions, the items demanded also differ. Therefore, we take a store’s characteristics into account in constructing a model of item demand.

In addition, analyzing the relation between weather conditions and item demand in each store leads to the discovery of other stores with similar trends and weather conditions. This analysis may contribute to identifying stores with similar potential demand trends. For example, let us assume that the sales trends and the effect of weather conditions at Store A are similar to those of Store B. In this case, items that are selling well in Store B are likely to have potential demand at Store A as well. In this way, potential demand trends can be extracted by finding similar stores.

### 2.3 Item Categories and Analysis Periods

In this research, we focus on perishables as an item category to be studied. The demand for perishables is considered to be affected by weather conditions, and inventory control for these items is important because long-term preservation is difficult. Several aspects of the processing and packaging methods of perishables provided to consumers can be changed at the operation level, such as in each store. For this reason, these items can flexibly respond to changes in demand, which is useful information that directly leads to merchandising strategy.

As the analysis period in this study, a short period within a year is targeted, such as a month. Figure 1 shows the yearly changes in the average temperature in Japan.

As Figure 1 shows, Japan has four seasons, and there is a considerable difference in temperature depending on the season. The sales of seasonal items therefore have a strong correlation with temperature over a long period, such as the twelve months of a year. For this reason, it is thought to be difficult to extract items that are not affected by season from an analysis of sales amounts over a long period. Though the analysis of seasonal products is also important, this research focuses on the demand fluctuation of perishables, which should be managed at the operation level.

### 2.4 Related Work

We focus on the problem of how to analyze the relationship between weather conditions, store characteristics, and item demand fluctuation. The demand forecasting problem for individual products has been discussed for a long time, and many studies have been conducted. Linear regression with shop information and weather conditions as explanatory variables is the conventional prediction model with the simplest structure. Regression models based on various machine learning methods can also be applied to the same problem. In addition, time series models such as the AR (autoregressive), ARMA (autoregressive moving average), and ARIMA (autoregressive integrated moving average) models can be applied. All of these models construct a demand forecasting model for each item and shop independently. However, such modeling is not effective in practice because there are many chain stores and many items to be managed. It is difficult for field-level retail staff to make use of a vast number of predictive models during their hectic daily tasks.

On the other hand, Okayama et al. (2019) proposed an analytical model based on Nonnegative Tensor Factorization, in which the tensor is a three-dimensional array of ‘dates’, ‘items’, and ‘stores’. This method provides a straightforward model representing the relationship between weather conditions, store location, and items. Their model makes it possible to identify important items for each shop cluster and allows a multifaceted analysis. However, because their model is based on tensor decomposition, it is not a generative model. This paper proposes a latent class model that represents the relationship between items, temperature differences, and stores and that allows a probabilistic interpretation.

A latent class model (Hagenaars and McCutcheon, 2002; Magidson and Vermunt, 2002; Bishop, 2006) is an effective tool for analyzing complex real-world problems, such as those in which heterogeneous data are mixed. It assumes that the data are generated from a mixture of several different probability distributions. Many previous studies have shown the latent class model to be useful for the analysis of text data (Hofmann, 1999; Blei et al., 2003; Ueda, 2004; Xue et al., 2008; Yamamoto et al., 2017), collaborative filtering (Jin et al., 2003; Si and Jin, 2003; Hofmann, 2004; Jin et al., 2006; Suzuki et al., 2014; Adalbjörnsson et al., 2016), and the analysis of marketing data (Green et al., 1976; Swait and Adamowicz, 2001; Iwata et al., 2009; Train, 2009; Goto et al., 2015; Nagamori et al., 2019).

## 3. PROPOSED ANALYSIS METHOD

In this research, we propose a latent class model (Bishop, 2006; Goto and Kobayashi, 2014) that enables an analysis of the co-occurrence of items, temperature differences, and store characteristics from accumulated purchase history and weather data. To analyze the overall sales trend across all stores, a method that simultaneously clusters stores and items with similar weather sensitivity is very useful. Although there are many methods for hard clustering, the latent class model is well suited to soft clustering. To cluster stores and items at the same time, it is effective to construct a statistical model with appropriate probability distributions for both. This is why the latent class model is applied to this problem.

Latent class models have been applied not only in the information science field, such as purchase history analysis (Sagawa and Hirooka, 2003), document search problems (Hofmann, 1999), and image recognition (Bosch et al., 2006), but also in various other fields, such as predicting the number of customers visiting a store (Tsukasa et al., 2011b), fashion coordination recommendations (Iwata et al., 2011), and metrological sociology (Fujihara et al., 2012). Building on Probabilistic Latent Semantic Analysis (Hofmann, 1999), which is a well-known latent class model for analyzing purchase history data, we extend the model to the problem addressed in this research.

### 3.1 Conception of Proposal

Because retail stores develop shops in multiple regions, demand trends and weather conditions are different in each store. Therefore, the effects of the weather conditions on demand may be different depending on the combination of stores and items. Here, let $J$ be the number of items and $X = \{ x_j : 1 \le j \le J \}$ be the set of items, and let $I$ be the number of stores and $S = \{ s_i : 1 \le i \le I \}$ be the set of stores. The objective problem is to analyze the demand changes caused by the weather condition, taking into account the characteristics of these items and stores.

In the analysis of items, store characteristics, and weather conditions, the variables affect one another, because item demand fluctuations are affected by store characteristics and weather conditions. However, because a huge number of items are sold in a store, individual regression analyses focusing on each item cannot maintain estimation accuracy from the viewpoint of sample size. The same is true for stores: for a retailer that operates many stores, it is not practical to treat the stores separately and build a demand model for each one. Stores normally have similarities, so constructing a common demand model for similar stores increases the number of data samples available for model estimation and should improve prediction accuracy. Therefore, we focus on an analysis that includes a clustering function. This analysis clusters stores and items for which weather conditions and demand fluctuations are similar.

From these points of view, we propose a latent class model to analyze the co-occurrence of items, store characteristics, and weather conditions by supposing the latent class behind them. This makes it possible to analyze stores with similar demand fluctuations for items due to weather conditions, to extract items with high weather sensitivity in similar store groups and to express potential demand fluctuations for items due to changes in weather conditions.

This might directly lead to merchandising strategies such as methods of displaying items and systems for processing items. Also, if we can grasp beforehand which changes in weather conditions are related to the sales of item A, it is possible to properly manage the inventory of item A based on the weather forecast. In addition, if we clarify ahead of time items with a high weather sensitivity at other stores that are similar to a target store, we can identify an item whose sales can potentially change even at the target store. That is, by adding information on other stores, the model can be useful for the merchandising of a target store.

### 3.2 Preparation for Modeling

Probabilistic Latent Semantic Analysis (PLSA) is well known as a representative latent class model (Hofmann, 1999). Applied to purchase history data, it supposes a latent class behind items and users, which enables an analysis of the co-occurrence of items and users. Let $H$ be the number of users and $Y = \{ y_h : 1 \le h \le H \}$ be the set of users, and let $K$ be the number of latent classes and $Z = \{ z_k : 1 \le k \le K \}$ be the set of latent classes. Figure 2 shows the graphical model of PLSA when it is applied to purchase history data.

The model is based on equation (1), and we estimate the parameters $p(z_k)$, $p(x_j | z_k)$, and $p(y_h | z_k)$ in the learning step of the model. Because the latent variable $z_k$ cannot be observed, the parameters are estimated by using the EM algorithm (Miyakawa, 1987).

$p(x_j, y_h, z_k) = p(z_k) \, p(x_j | z_k) \, p(y_h | z_k)$
(1)

### 3.3 Modeling

This research proposes a latent class model to express the co-occurrence relationship between temperature differences, store characteristics, and items by extending the PLSA model. This model enables the analysis of the kind of area and temperature differences that will affect item demand.

Here, let $t$ be the temperature difference. The proposed probabilistic model is defined by equation (2), shown below.

$p(x_j, s_i, t, z_k) = p(z_k) \, p(x_j | z_k) \, p(s_i | z_k) \, p(t | z_k)$
(2)

This model represents the probability of the event in which item $x_j$ is purchased at store $s_i$ on a day whose temperature difference is $t$. That is, the occurrence of the data $(x_j, s_i, t, z_k)$ means that this event happens. A graphical model for the proposed model is shown in Figure 3.

Here, we suppose multinomial distributions for items, $p(x_j | z_k)$, and for store characteristics, $p(s_i | z_k)$. The multinomial distribution is the most flexible probabilistic model for qualitative random variables, so it is a reasonable assumption for both. We also suppose a normal distribution, shown in equation (3), for the temperature difference $t$.

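$p(t | z_k) = \frac{1}{\sqrt{2 \pi \sigma_k^2}} \exp\left( -\frac{(t - \mu_k)^2}{2 \sigma_k^2} \right)$
(3)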

In equation (3), $\mu_k$ and $\sigma_k^2$ are the mean and variance of the $k$-th normal distribution, respectively. The joint probability distribution of item $x_j$, store $s_i$, and temperature difference $t$ is given by

$p(x_j, s_i, t) = \sum_{k=1}^{K} p(z_k) \, p(x_j | z_k) \, p(s_i | z_k) \, p(t | z_k).$
(4)
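As a minimal sketch of how the mixture in equation (4) is evaluated, the following Python snippet computes the joint probability for a toy model with two latent classes. The function names `normal_pdf` and `joint_prob` and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the experiment.

```python
import math

def normal_pdf(t, mu, var):
    """Normal density with mean mu and variance var, used for p(t | z_k)."""
    return math.exp(-(t - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def joint_prob(j, i, t, p_z, p_x_z, p_s_z, mu, var):
    """Equation (4): sum over classes of p(z) p(x_j|z) p(s_i|z) p(t|z)."""
    return sum(
        p_z[k] * p_x_z[k][j] * p_s_z[k][i] * normal_pdf(t, mu[k], var[k])
        for k in range(len(p_z))
    )

# Hypothetical toy parameters: 2 classes, 3 items, 2 stores.
p_z = [0.6, 0.4]
p_x_z = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]  # rows: classes, columns: items
p_s_z = [[0.8, 0.2], [0.3, 0.7]]            # rows: classes, columns: stores
mu = [0.0, 2.0]                              # class means of the temperature difference
var = [4.0, 1.0]                             # class variances

# Joint probability (a density in t) of item 2 being sold at store 1 when t = +2.
value = joint_prob(2, 1, 2.0, p_z, p_x_z, p_s_z, mu, var)
```

Because the item and store distributions each sum to one within a class, summing `joint_prob` over all items and stores leaves only the mixture of normal densities for $t$, which is a convenient sanity check for an implementation.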

### 3.3 Studying Parameters

Let $N$ be the total number of purchases. For the $n$-th purchase, let $a_n \in X$ be the purchased item, $b_n \in S$ be the store where the purchase was made, and $c_n$ be the temperature difference on that day. The $n$-th purchase is given by the co-occurrence $(a_n, b_n, c_n)$, and the log-likelihood function for parameter estimation is given by

$LL = \sum_{n=1}^{N} \log p(a_n, b_n, c_n)$
(5)

which is calculated by using equation (4). The parameters for the proposed method that locally maximize the log-likelihood (5) are estimated by using the following EM algorithm.

(E-step)

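$p(z_k | a_n, b_n, c_n) = \frac{p(z_k) \, p(a_n | z_k) \, p(b_n | z_k) \, p(c_n | z_k)}{\sum_{k'=1}^{K} p(z_{k'}) \, p(a_n | z_{k'}) \, p(b_n | z_{k'}) \, p(c_n | z_{k'})}$
(6)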

(M-step)

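$p(z_k) = \frac{1}{N} \sum_{n=1}^{N} p(z_k | a_n, b_n, c_n)$
(7)

$p(x_j | z_k) = \frac{\sum_{n=1}^{N} \delta(a_n, x_j) \, p(z_k | a_n, b_n, c_n)}{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n)}$
(8)

$p(s_i | z_k) = \frac{\sum_{n=1}^{N} \delta(b_n, s_i) \, p(z_k | a_n, b_n, c_n)}{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n)}$
(9)

$\mu_k = \frac{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n) \, c_n}{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n)}$
(10)

$\sigma_k^2 = \frac{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n) \, (c_n - \mu_k)^2}{\sum_{n=1}^{N} p(z_k | a_n, b_n, c_n)}$
(11)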

In equations (8) and (9), $\delta(x, y)$ is an indicator function that takes 1 if $x = y$ and 0 otherwise. For the derivation of each formula used in the EM algorithm, see Appendix A. The procedure of parameter estimation based on the EM algorithm is as follows:

[The procedure of parameter estimation based on the EM algorithm]

• Step 1: Randomly initialize the M-step parameters $p(z_k)$, $p(x_j | z_k)$, $p(s_i | z_k)$, $\mu_k$, and $\sigma_k^2$.

• Step 2: Calculate $p(z_k | a_n, b_n, c_n)$ via equation (6) as the E-step.

• Step 3: Calculate $p(z_k)$, $p(x_j | z_k)$, $p(s_i | z_k)$, $\mu_k$, and $\sigma_k^2$ via equations (7) to (11) as the M-step.

• Step 4: Calculate the log-likelihood by equation (5).

• Step 5: If the log-likelihood shown in equation (5) converges, that is, the change rate of LL becomes sufficiently small, finish the procedure and output the parameters. Otherwise, return to Step 2.
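The five steps above can be sketched in plain Python as follows. This is a small illustrative implementation under stated assumptions, not the program used in the experiment: the function names, the synthetic data, the initialisation scheme, the relative-change stopping rule with `tol`, and the variance floor are all choices made for this example. For brevity, the log-likelihood of Step 4 is accumulated in the same pass as the E-step, so it refers to the parameters that enter each iteration.

```python
import math
import random

def normal_pdf(t, mu, var):
    """Normal density with mean mu and variance var, used for p(t | z_k)."""
    return math.exp(-(t - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def random_distribution(rng, size):
    """Random probability vector used for the initialisation in Step 1."""
    weights = [rng.random() + 0.5 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def fit_em(data, n_items, n_stores, n_classes, max_iter=200, tol=1e-8, seed=0):
    """Fit the model to a list of (item index, store index, temperature difference)."""
    rng = random.Random(seed)
    n = len(data)
    # Step 1: random initialisation of the M-step parameters.
    p_z = random_distribution(rng, n_classes)
    p_x = [random_distribution(rng, n_items) for _ in range(n_classes)]
    p_s = [random_distribution(rng, n_stores) for _ in range(n_classes)]
    mu = [rng.choice(data)[2] for _ in range(n_classes)]
    var = [1.0] * n_classes
    history = []
    for _ in range(max_iter):
        # Step 2 (E-step): posterior class probabilities for every purchase.
        # Step 4: the log-likelihood is accumulated in the same pass.
        resp, ll = [], 0.0
        for a, b, c in data:
            w = [p_z[k] * p_x[k][a] * p_s[k][b] * normal_pdf(c, mu[k], var[k])
                 for k in range(n_classes)]
            total = sum(w)
            ll += math.log(total)
            resp.append([v / total for v in w])
        history.append(ll)
        # Step 5: stop when the relative change of the log-likelihood is small.
        if len(history) > 1 and abs(ll - history[-2]) <= tol * abs(history[-2]):
            break
        # Step 3 (M-step): re-estimate all parameters from the posteriors.
        for k in range(n_classes):
            r_k = sum(r[k] for r in resp)
            p_z[k] = r_k / n
            p_x[k] = [0.0] * n_items
            p_s[k] = [0.0] * n_stores
            for (a, b, _), r in zip(data, resp):
                p_x[k][a] += r[k] / r_k
                p_s[k][b] += r[k] / r_k
            mu[k] = sum(r[k] * c for (_, _, c), r in zip(data, resp)) / r_k
            spread = sum(r[k] * (c - mu[k]) ** 2 for (_, _, c), r in zip(data, resp)) / r_k
            var[k] = max(spread, 1e-6)  # variance floor: a safeguard assumed here
    return p_z, p_x, p_s, mu, var, history

# Synthetic data (hypothetical): two item/store groups with opposite temperature shifts.
gen = random.Random(1)
data = [(gen.choice([0, 1]), gen.choice([0, 1]), gen.gauss(-3.0, 1.0)) for _ in range(200)]
data += [(gen.choice([2, 3]), gen.choice([2, 3]), gen.gauss(3.0, 1.0)) for _ in range(200)]
p_z, p_x, p_s, mu, var, history = fit_em(data, n_items=4, n_stores=4, n_classes=2)
```

A useful check when implementing the update formulas is that the values in `history` never decrease, since each EM iteration cannot lower the log-likelihood. For a data set of the size used in the experiment, a vectorised implementation would be needed; the loops here are kept explicit only to mirror the equations.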

### 3.4 The Analysis Process with the Proposed Model

This section describes the analytical process with the proposed model from a general point of view, so that marketing managers can better understand how to make use of the model. In recent years, most retail companies operating many chain stores have introduced point of sales (POS) systems and have accumulated the purchase history data of each customer at the level of individual receipts. In this research, it is assumed that this POS data can be utilized for the analysis. The general process of the analysis is as follows:

• Step 1: The sales records of all sold items at each retail store are acquired from the raw POS data.

• Step 2: Meteorological data is obtained independently, and the average temperature difference from the previous day is calculated from the daily average temperature data.

• Step 3: From the sales date information of each sold item, the information of the temperature difference is attached to the event that an item was sold at a store. Thus, the combinations of an item, a store, and the temperature difference are made for all sold items.

• Step 4: The proposed model is trained on the created combination data of items, stores, and temperature differences, and the trained latent classes are acquired.

• Step 5: Each latent class is interpreted through the estimated conditional probabilities of items and stores given the latent class.

• Step 6: The relationship between items, stores, and temperature differences is analyzed by using the probabilities and the meanings of each latent class.

For specific analysis viewpoints and methods, please refer to the analysis experiment shown in the next chapter.
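Steps 1 to 3 of this process can be sketched in a few lines of Python. The record layout (tuples of date, store, and item), the dictionary of daily mean temperatures per store, and the names `temperature_differences` and `build_triplets` are assumptions made for illustration; the actual formats of the POS and weather data are not described here.

```python
from datetime import date, timedelta

def temperature_differences(daily_mean):
    """Map each date to its mean temperature minus the previous day's mean."""
    diffs = {}
    for day, temp in daily_mean.items():
        prev = day - timedelta(days=1)
        if prev in daily_mean:  # the first observed day has no reference value
            diffs[day] = temp - daily_mean[prev]
    return diffs

def build_triplets(sales, daily_mean_by_store):
    """Attach the temperature difference of the sales date to every sold item."""
    diffs = {s: temperature_differences(m) for s, m in daily_mean_by_store.items()}
    return [
        (item, store, diffs[store][day])
        for day, store, item in sales
        if day in diffs[store]
    ]

# Hypothetical toy inputs.
daily_mean_by_store = {
    "store_A": {date(2013, 3, 1): 8.0, date(2013, 3, 2): 11.5, date(2013, 3, 3): 9.0},
}
sales = [
    (date(2013, 3, 1), "store_A", "spinach"),   # dropped: no previous day available
    (date(2013, 3, 2), "store_A", "spinach"),
    (date(2013, 3, 3), "store_A", "broccoli"),
]
triplets = build_triplets(sales, daily_mean_by_store)
# triplets == [("spinach", "store_A", 3.5), ("broccoli", "store_A", -2.5)]
```

In this sketch, sales on a day without a temperature reading for the previous day are simply dropped; in practice one would include the last day before the observation period in the weather data so that no sales records are lost.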

## 4. ANALYSIS EXPERIMENT

To confirm the effectiveness of the proposed method, we perform an analysis experiment with the proposed model by using weather data and the real purchase history data of a Japanese retailer that operates many stores. This retailer is a typical supermarket company operating many grocery stores, mainly in the Chubu region of Japan.

Since the proposed model is an analytical model based on the latent class model, which has a soft clustering function, a direct comparison with other methods is difficult. Although there are various hard clustering methods, most of them assume data in vector format, so the raw data must first be converted into vectors. Because we cluster combinations of a store and an item while considering the temperature difference, conventional clustering methods that assume vector-format data cannot be applied directly; applying them would require constructing an appropriate vector space, which is not an easy task. It is therefore difficult to present a clear comparison with other methods. In this chapter, we show the results of applying the proposed model to real purchase history data.

### 4.1 Experimental Condition

The observation period for the purchase history data is from March 1 to March 31, 2013. The total number of stores is I = 174, the total number of items is J = 3,745, and the total number of purchases is N = 37,166,162. The purchase history data used in the experiment is POS data from grocery stores, which sell the perishables generally found in a supermarket. Unlike general POS data, customer attributes such as gender or age are not provided; only the date and time of each purchase and the names of the items are available. Regarding weather data, hourly temperature data are provided for the location of each shop. We derive the daily average temperature by averaging them and obtain the temperature difference by subtracting the previous day's average from it. From these two data sources, we construct a dataset as described above and use it to train the model. For the parameter estimation in the learning phase, we implemented a program based on the derived update formulas. The training time depends on machine performance; on our single-threaded CPU machine, it took a few hours. For the number of latent classes, we searched thoroughly for a value that gives highly interpretable latent class characteristics and set it to K = 8. High interpretability of the demand model is most important for the target retailer to be able to introduce the model into its operations. In fact, most retailers find it challenging to adopt models that are complex and difficult for shopkeepers to interpret. Therefore, in this research, the number of latent classes was decided from the viewpoint of the interpretability of the model. Through repeated preliminary experiments with different numbers of latent classes, we interpreted the latent classes for every setting and selected the best setting.1

### 4.2 Evaluation Method

The purpose of this research is to extract items that are especially influenced by weather conditions by analysing the co-occurrence of items, stores, and temperature difference. A preliminary analysis, however, confirmed the existence of items that are purchased in a certain amount on a daily basis regardless of the temperature difference. Therefore, under the hypothesis that “there are classes that are not affected by temperature difference,” we extract highly weather sensitive items by comparison with the items in classes that are not sensitive to temperature difference. That is, we define weather sensitive items as items that appear in classes affected by temperature difference and do not appear in classes that are not affected by temperature difference.

### 4.3 Experiment Results and Considerations

The probability that each data sample belongs to each latent class can be calculated by using the estimated latent class model. The characteristics of each latent class are therefore clarified by averages based on the belonging probabilities of all data samples. Table 1 shows the results for temperature difference and store characteristics in each latent class. Here, according to the average $\mu_k$ and variance $\sigma_k^2$ of each class, the temperature difference is expressed as + when the temperature ascends, - when it descends, and ± when both could occur. It is also illustrated as a bar chart in the table. As for store location, the top 15 stores in terms of belonging probability for each class are plotted on the map.
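One way to obtain a belonging probability per store is to average the posterior class probabilities $p(z_k | a_n, b_n, c_n)$ over the purchases made at that store and then rank the stores within each class. The sketch below assumes this averaging scheme; the function name `store_membership` and the toy posteriors are hypothetical.

```python
def store_membership(data, resp, n_stores, n_classes):
    """Average the posterior class probabilities over the purchases of each store."""
    sums = [[0.0] * n_classes for _ in range(n_stores)]
    counts = [0] * n_stores
    for (_, store, _), r in zip(data, resp):
        counts[store] += 1
        for k in range(n_classes):
            sums[store][k] += r[k]
    # Stores without any purchase keep a membership of zero for every class.
    return [
        [v / counts[i] if counts[i] else 0.0 for v in sums[i]]
        for i in range(n_stores)
    ]

# Hypothetical purchases (item, store, temperature difference) and their posteriors.
data = [(0, 0, 1.2), (1, 0, -0.4), (2, 1, 3.0)]
resp = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
membership = store_membership(data, resp, n_stores=2, n_classes=2)

# Rank stores within each class and keep at most the top 15.
top_stores = {
    k: sorted(range(2), key=lambda i: membership[i][k], reverse=True)[:15]
    for k in range(2)
}
```

Here store 0 belongs mostly to class 0 (average posterior about 0.8) and store 1 mostly to class 1, so each class lists its own store first.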

From the results in Table 1, the characteristics of each latent class are confirmed in terms of temperature difference and store location. Regarding temperature difference, because the classes $z_1$ to $z_3$ allow both ascending and descending temperatures, the items and stores that are less affected by temperature difference appear in these classes. These classes correspond to the hypothesis described in the previous section. The classes $z_4$ to $z_6$ can be interpreted as classes in which sales of items rise when the temperature ascends, and the classes $z_7$ and $z_8$ as classes in which sales rise when the temperature descends.
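The exact rule used to assign +, -, or ± from $\mu_k$ and $\sigma_k^2$ is not stated. As one possible rule, assumed here purely for illustration, a class can be labelled by whether the interval of one standard deviation around the mean lies entirely above zero, entirely below zero, or straddles zero. The function name `sign_label`, the `width` parameter, and the example values are hypothetical.

```python
import math

def sign_label(mu, var, width=1.0):
    """Label a class '+', '-' or '±' from its mean and variance (assumed rule)."""
    sd = math.sqrt(var)
    if mu - width * sd > 0:
        return "+"   # the interval lies above zero: temperature ascends
    if mu + width * sd < 0:
        return "-"   # the interval lies below zero: temperature descends
    return "±"       # the interval straddles zero: both could occur

# Hypothetical (mean, variance) pairs for three classes.
labels = [sign_label(m, v) for m, v in [(0.2, 4.0), (2.5, 1.0), (-3.0, 2.25)]]
# labels == ["±", "+", "-"]
```

A wider interval (larger `width`) makes the rule more conservative, so more classes are labelled ±.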

In analyses of purchase history data, the estimated probabilities of the latent classes $p(z_k)$ are often concentrated on a few latent classes; this is a common phenomenon observed in many cases. Because the event that the proposed model represents is the co-occurrence of item, store, and temperature difference, even the latent classes with relatively small probabilities are estimated from many data samples. One record in the target data is a log of one item purchased by a customer, and the total number of purchases in the training data set is N = 37,166,162. Even if the probability of a latent class is 0.005, the number of events belonging to this latent class is more than 185,000. Thus, although the probabilities of several latent classes look small compared with those of the others, these latent classes are not meaningless.
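The figure quoted above follows directly from the size of the data set: the expected number of purchase records assigned to a latent class with probability 0.005 is

```latex
0.005 \times N = 0.005 \times 37{,}166{,}162 \approx 185{,}831
```

so even such a class is supported by well over a hundred thousand records.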

For the analysis results of the items, we first consider the latent classes $z_1$ to $z_3$. Table 2 shows the three items with the highest appearance probability in each of the classes $z_1$ to $z_3$.

Since the items shown in Table 2 have a high appearance probability in the classes $z_1$ to $z_3$, it can be said that these are the items that increase in sales regardless of the temperature difference. “Bean sprouts” and “cucumber,” for example, appear across multiple classes, such as in the classes $z_1$ and $z_2$ or $z_1$ and $z_3$. Because the affiliated region differs for each latent class, these are considered to be items that increase in sales regardless of not only temperature difference, but also store location. On the other hand, “spinach” and “broccoli,” for example, do not appear across multiple classes, which means these are items that increase in sales regardless of only temperature difference. In other words, these are the items that are specific to each latent class.

Subsequently, we extract weather sensitive items from each latent class. Tables 3 and 4 show the top three weather sensitive items when the temperature ascends and descends, respectively. Note that, among the items with high appearance probabilities in the classes $z_4$ to $z_8$, the top 30 items in terms of appearance probability in the classes $z_1$ to $z_3$ are excluded from the weather sensitive items, because these items have increasing sales irrespective of temperature difference.
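The exclusion rule described above amounts to a ranked set difference. The following sketch assumes that the top 30 items are taken separately in each of the insensitive classes and then pooled; the function names, the item list, and the probabilities are hypothetical, and the toy call uses smaller cut-offs than 30 and 3 so that the effect is visible on five items.

```python
def top_items(probs, n):
    """Indices of the n items with the highest appearance probability."""
    return sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)[:n]

def weather_sensitive_items(p_x_z, sensitive, insensitive, n_exclude=30, n_show=3):
    """Top items of each sensitive class after removing the top items of insensitive classes."""
    excluded = set()
    for k in insensitive:
        excluded.update(top_items(p_x_z[k], n_exclude))
    result = {}
    for k in sensitive:
        ranked = top_items(p_x_z[k], len(p_x_z[k]))
        result[k] = [j for j in ranked if j not in excluded][:n_show]
    return result

# Hypothetical appearance probabilities p(x_j | z_k) for two classes and five items.
items = ["bean sprouts", "cucumber", "spinach", "pork for shabu-shabu", "octopus sashimi"]
p_x_z = [
    [0.40, 0.30, 0.20, 0.05, 0.05],  # class 0: not sensitive to temperature difference
    [0.30, 0.10, 0.05, 0.35, 0.20],  # class 1: sensitive to temperature difference
]
found = weather_sensitive_items(p_x_z, sensitive=[1], insensitive=[0], n_exclude=2, n_show=2)
names = {k: [items[j] for j in idx] for k, idx in found.items()}
# names == {1: ["pork for shabu-shabu", "octopus sashimi"]}
```

In the toy data, “bean sprouts” ranks second in the sensitive class but is removed because it is also a top item of the insensitive class, which is exactly the behaviour the exclusion rule is meant to produce.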

From winter to early spring in Japan, roughly from December to March, Japanese people tend to eat a hot pot dish called “nabe” when they feel cold. Nabe contains a lot of vegetables, including the items listed in Table 4. The items in Table 4 are weather sensitive items whose sales tend to increase when the temperature descends. From this point of view, the results of Table 4 are intuitively easy to understand.

From Tables 3 and 4, “pork for shabu-shabu” and “pork for ginger pork” are extracted in the classes $z_4$ and $z_8$, respectively. Also, “seaweed sashimi,” “octopus sashimi,” and “assorted sashimi” are extracted in the classes $z_5$, $z_6$, and $z_8$, respectively. Generally, perishables like pork and seafood are stocked in blocks and processed in each store, so that the method of provision can be changed according to the predicted demand for each store. Thus, even with the same foodstuff, it is possible to respond to demand by changing the processing method according to the characteristics of the class, i.e., its correspondence to weather conditions.

From the results mentioned above, specific strategies can be proposed. For example, for the stores in which the belonging probability to class $z_1$ is high, spinach sales increase regardless of temperature difference, so strategies such as putting spinach on display at the shop front are considered possible. In the same way, even when the same pork is considered, the stores in which the belonging probability to class $z_4$ is high can cope with demand fluctuation by processing pork for shabu-shabu. Conversely, stores in which the belonging probability to class $z_8$ is high can cope with demand fluctuation by processing pork for ginger pork. In addition, for sashimi, stores in which the belonging probability to $z_5$ or $z_6$ is high can cope with demand fluctuation by serving it as a single item; conversely, stores in which the belonging probability to $z_8$ is high can cope with demand fluctuation by serving assorted ingredients.

As a result, we confirmed that grouping stores with similar demand fluctuations under given weather conditions and extracting items with high weather sensitivity are both effective. We also confirmed that concrete strategies can be derived from the results.

## 5. DISCUSSION

In marketing strategy for retail stores, practitioners prefer a broad analysis to a detailed analysis based on concrete numerical values and events, because the former is more practical. For example, information such as how much sales will increase when the temperature rises by one degree is of little practical use to a retailer. Therefore, a broad analysis, such as listing the items with high weather sensitivity, seems preferable to a detailed analysis, such as a regression analysis that predicts the growth in demand for individual items. From this point of view, the clustering of weather-sensitive items produced in this research is considered useful information. In the daily management of the many items in grocery stores, operators cannot focus on a specific item and have to manage various kinds of items to create an attractive assortment. That is why this study does not focus on the demand prediction problem for each item. The purpose of this research is to construct a model that can analyze the relationship between items, stores, and temperature difference from a holistic viewpoint. Once the items that are important for management have been identified, it becomes possible to consider demand prediction for each of them at the store level. Because it is difficult to treat such a micro-level model within the same study, analysis from the micro viewpoint is left as future work.

The proposed method can also be applied to other data, such as restaurant sales data and purchase history data from clothing shops. In restaurants, it is necessary to procure ingredients according to the number of orders expected on a given day. Therefore, food loss can be reduced by identifying dishes with high weather sensitivity. In clothing shops, extracting items with high weather sensitivity can inform marketing decisions such as which items to display in conspicuous positions. In particular, because temperature changes sharply at the turn of each season, extracting items with high weather sensitivity enables an efficient display.

## 6. CONCLUSION AND FUTURE WORKS

This research proposed a latent class model that extends the PLSA model to analyze the co-occurrence of items, weather conditions, and store characteristics. In the modeling, we focused on temperature difference as the weather condition because how consumers perceive the temperature depends on its difference from the previous day, and because it is a useful measure in real marketing strategy.

Also, by applying the proposed method to real purchase history and weather data, we quantitatively analyzed demand fluctuation in light of weather conditions and store characteristics. In particular, by analyzing perishable items, we attempted an analysis that leads to real marketing strategy. As a result, we confirmed the utility of the proposed method by clustering stores with similar demand fluctuations and by extracting highly weather-sensitive items. Finally, based on these results, we showed that the model can be connected to the marketing strategies of real stores at an operational level.

Future work includes deciding the proper number of latent classes using measures such as the Akaike Information Criterion (AIC) (Akaike, 1973), extending the model to consider other weather conditions and to quantify weather sensitivity, and studying how best to utilize the results obtained.
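As an illustration of how AIC could guide the choice of the number of latent classes, the sketch below uses the standard definition $\mathrm{AIC} = -2 \log L + 2m$, where $m$ is the number of free parameters. The count $m = (K-1) + K(J-1) + K(I-1) + 2K$ is an assumption based on the parameterization in Appendix A ($p(z_k)$, $p(x_j \mid z_k)$, $p(s_i \mid z_k)$, $\mu_k$, $\sigma_k^2$). All function names are hypothetical, and no fitted values from this study are used.

```python
def num_free_parameters(K, J, I):
    """Free parameters for K classes, J items, I stores (assumed parameterization)."""
    return (K - 1) + K * (J - 1) + K * (I - 1) + 2 * K

def aic(log_likelihood, K, J, I):
    """Akaike Information Criterion; a smaller value indicates a better trade-off."""
    return -2.0 * log_likelihood + 2.0 * num_free_parameters(K, J, I)

def select_num_classes(log_likelihoods, J, I):
    """Return the K with minimum AIC.

    log_likelihoods: dict mapping K -> maximized log-likelihood of the model with K classes.
    """
    return min(log_likelihoods, key=lambda K: aic(log_likelihoods[K], K, J, I))
```

In practice, the model would be fitted by the EM algorithm for each candidate $K$, and the resulting maximized log-likelihoods passed to `select_num_classes`.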

## APPENDIX A

Here, the derivation of the EM algorithm for the proposed model is described. First, the following notation is introduced.

The probability of the complete data set $( A , B , C , V )$ is given by

$p(A, B, C, V) = \prod_{n=1}^{N} p(a_n, b_n, c_n, v_n) = \prod_{n=1}^{N} p(v_n)\, p(a_n \mid v_n)\, p(b_n \mid v_n)\, p(c_n \mid v_n),$

where $a_n \in X$, $b_n \in S$, $c_n \in \mathbb{R}$, and $v_n \in Z$.

E-step:

Because the latent class data $V$ cannot be observed, the posterior probability $p(V \mid A, B, C)$ is computed in order to take the expectation of the log-likelihood.

$p(V \mid A, B, C) = \frac{p(A, B, C, V)}{\sum_{V \in Z^{N}} p(A, B, C, V)} = \frac{\prod_{n=1}^{N} p(a_n, b_n, c_n, v_n)}{\prod_{n=1}^{N} \sum_{v_n \in Z} p(a_n, b_n, c_n, v_n)} = \prod_{n=1}^{N} \frac{p(a_n, b_n, c_n, v_n)}{\sum_{v_n \in Z} p(a_n, b_n, c_n, v_n)} = \prod_{n=1}^{N} p(v_n \mid a_n, b_n, c_n).$
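The per-observation posterior $p(z_k \mid a_n, b_n, c_n)$ can be computed as in the following minimal sketch. It assumes that items and stores are encoded as integer indices, that $\log c_n$ is supplied directly as `log_c` and modeled by a Gaussian with mean $\mu_k$ and variance $\sigma_k^2$, and that all probabilities are strictly positive. The function name `e_step` and the array layout are hypothetical choices for illustration.

```python
import numpy as np

def e_step(a, b, log_c, p_z, p_x_given_z, p_s_given_z, mu, sigma2):
    """Return p(z_k | a_n, b_n, c_n) as an (N, K) array.

    a, b:   integer arrays of shape (N,), item and store indices
    log_c:  float array of shape (N,), values of log c_n
    p_z: (K,);  p_x_given_z: (K, J);  p_s_given_z: (K, I);  mu, sigma2: (K,)
    """
    # Log of the Gaussian density of log c_n under each class, shape (N, K).
    diff = log_c[:, None] - mu[None, :]
    log_gauss = -0.5 * np.log(2.0 * np.pi * sigma2) - diff**2 / (2.0 * sigma2)
    # Log joint p(a_n, b_n, c_n, z_k), kept in log space for numerical stability.
    log_joint = (np.log(p_z)[None, :]
                 + np.log(p_x_given_z[:, a]).T
                 + np.log(p_s_given_z[:, b]).T
                 + log_gauss)
    # Subtract the row maximum before exponentiating, then normalize over k.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)
```

Each row of the returned array sums to one, matching the normalization over $v_n \in Z$ in the denominator above.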

Using this probability, the Q-function can be formulated as follows:

Here, let $\delta(\alpha, \beta)$ be the indicator function that takes the value 1 if $\alpha = \beta$ and 0 otherwise. Then the following formulas are derived.

Therefore, the Q-function can be formulated as follows:

$Q = \sum_{n=1}^{N} \left\{ \sum_{k=1}^{K} p(z_k \mid a_n, b_n, c_n) \log p(z_k) + \sum_{k=1}^{K} p(z_k \mid a_n, b_n, c_n) \sum_{j=1}^{J} \delta(a_n, x_j) \log p(x_j \mid z_k) + \sum_{k=1}^{K} p(z_k \mid a_n, b_n, c_n) \sum_{i=1}^{I} \delta(b_n, s_i) \log p(s_i \mid z_k) + \sum_{k=1}^{K} p(z_k \mid a_n, b_n, c_n) \log \left[ \frac{1}{\sqrt{2 \pi \sigma_k^2}} \exp \left( -\frac{(\log c_n - \mu_k)^2}{2 \sigma_k^2} \right) \right] \right\}$
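For reference, the Q-function above can be evaluated numerically as in the sketch below. It assumes integer-encoded items and stores, so that the indicator sums $\sum_j \delta(a_n, x_j) \log p(x_j \mid z_k)$ reduce to array indexing, and it takes the posterior probabilities as an $(N, K)$ array `resp`. The function name `q_function` is hypothetical.

```python
import numpy as np

def q_function(resp, a, b, log_c, p_z, p_x_given_z, p_s_given_z, mu, sigma2):
    """Evaluate Q for posteriors resp of shape (N, K) and the given parameters."""
    diff = log_c[:, None] - mu[None, :]
    # Log of the Gaussian term: -0.5 log(2 pi sigma_k^2) - (log c_n - mu_k)^2 / (2 sigma_k^2)
    log_gauss = -0.5 * np.log(2.0 * np.pi * sigma2) - diff**2 / (2.0 * sigma2)
    # Indexing by a and b plays the role of the indicator function delta.
    log_terms = (np.log(p_z)[None, :]
                 + np.log(p_x_given_z[:, a]).T
                 + np.log(p_s_given_z[:, b]).T
                 + log_gauss)
    # Weight each log term by the posterior and sum over n and k.
    return float(np.sum(resp * log_terms))
```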

M-step:

Under the following constraints, the Q-function is maximized with respect to the parameters.

$\sum_{k=1}^{K} p(z_k) = 1, \quad \sum_{j=1}^{J} p(x_j \mid z_k) = 1, \quad \sum_{i=1}^{I} p(s_i \mid z_k) = 1.$

The method of Lagrange multipliers is applied to solve the optimization problem.

$L = Q - \lambda \left( \sum_{k=1}^{K} p(z_k) - 1 \right) - \sum_{k=1}^{K} \phi_k \left( \sum_{j=1}^{J} p(x_j \mid z_k) - 1 \right) - \sum_{k=1}^{K} \theta_k \left( \sum_{i=1}^{I} p(s_i \mid z_k) - 1 \right).$

(1) Optimization for $p(z_k)$: From

$\frac{\partial L}{\partial p(z_k)} = \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) \frac{1}{p(z_k)} - \lambda = 0,$

we have

$p(z_k) = \frac{1}{\lambda} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n).$

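Because $\sum_{k=1}^{K} p(z_k) = 1$, the equation

$\sum_{k=1}^{K} p(z_k) = \sum_{k=1}^{K} \frac{1}{\lambda} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) = 1$

is satisfied. Therefore, we have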

$\lambda = \sum_{k=1}^{K} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) = N,$

so that

$p(z_k) = \frac{1}{N} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n)$

is given.

(2) Optimization for $p(x_j \mid z_k)$ and $p(s_i \mid z_k)$: From

$\frac{\partial L}{\partial p(x_j \mid z_k)} = \sum_{n=1}^{N} \delta(a_n, x_j)\, p(z_k \mid a_n, b_n, c_n) \frac{1}{p(x_j \mid z_k)} - \phi_k = 0,$

we have

$p(x_j \mid z_k) = \frac{1}{\phi_k} \sum_{n=1}^{N} \delta(a_n, x_j)\, p(z_k \mid a_n, b_n, c_n).$

Because $\sum_{j=1}^{J} p(x_j \mid z_k) = 1$, the equation

$\sum_{j=1}^{J} p(x_j \mid z_k) = \sum_{j=1}^{J} \frac{1}{\phi_k} \sum_{n=1}^{N} \delta(a_n, x_j)\, p(z_k \mid a_n, b_n, c_n) = 1$

is satisfied. Therefore, we have

$\phi_k = \sum_{j=1}^{J} \sum_{n=1}^{N} \delta(a_n, x_j)\, p(z_k \mid a_n, b_n, c_n) = N p(z_k),$

so that

$p(x_j \mid z_k) = \frac{1}{N p(z_k)} \sum_{n=1}^{N} \delta(a_n, x_j)\, p(z_k \mid a_n, b_n, c_n)$

is given. Similarly, we have

$p(s_i \mid z_k) = \frac{1}{N p(z_k)} \sum_{n=1}^{N} \delta(b_n, s_i)\, p(z_k \mid a_n, b_n, c_n).$

(3) Optimization for $\mu_k$:

From

$\frac{\partial L}{\partial \mu_k} = \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) \frac{\log c_n - \mu_k}{\sigma_k^2} = 0,$

we have

$\mu_k = \frac{1}{N p(z_k)} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) \log c_n.$

(4) Optimization for $\sigma_k^2$:

From

$\frac{\partial L}{\partial \sigma_k^2} = \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) \left\{ \frac{(\log c_n - \mu_k)^2}{2 (\sigma_k^2)^2} - \frac{1}{2 \sigma_k^2} \right\} = 0,$

we have

$\sigma_k^2 = \frac{1}{N p(z_k)} \sum_{n=1}^{N} p(z_k \mid a_n, b_n, c_n) (\log c_n - \mu_k)^2.$
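The closed-form updates (1) to (4) can be collected into a single M-step routine, sketched below under stated assumptions: items and stores are integer-encoded, the posterior probabilities are given as an $(N, K)$ array `resp`, $\log c_n$ is supplied as `log_c`, and the newly updated $\mu_k$ is used when computing $\sigma_k^2$. The function name `m_step` and the array layout are hypothetical.

```python
import numpy as np

def m_step(resp, a, b, log_c, J, I):
    """Closed-form parameter updates from posteriors resp of shape (N, K).

    a, b:  integer arrays of shape (N,), item and store indices
    log_c: float array of shape (N,), values of log c_n
    J, I:  number of items and stores
    Returns (p_z, p_x_given_z, p_s_given_z, mu, sigma2).
    """
    N, K = resp.shape
    n_k = resp.sum(axis=0)              # equals N * p(z_k)
    p_z = n_k / N                       # update (1)

    # Update (2): sum posteriors per item / store; indexing replaces delta.
    x_counts = np.zeros((J, K))
    s_counts = np.zeros((I, K))
    np.add.at(x_counts, a, resp)        # x_counts[j, k] = sum_n delta(a_n, x_j) resp[n, k]
    np.add.at(s_counts, b, resp)
    p_x_given_z = (x_counts / n_k).T    # shape (K, J)
    p_s_given_z = (s_counts / n_k).T    # shape (K, I)

    # Updates (3) and (4): posterior-weighted mean and variance of log c_n.
    mu = (resp * log_c[:, None]).sum(axis=0) / n_k
    sigma2 = (resp * (log_c[:, None] - mu[None, :])**2).sum(axis=0) / n_k
    return p_z, p_x_given_z, p_s_given_z, mu, sigma2
```

Alternating this routine with the computation of the posteriors until the parameters stabilize gives the EM iteration. A practical implementation would also need to guard against classes whose total posterior mass `n_k` approaches zero.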

Yuto Seko was a graduate student at Waseda University in Japan at the time of writing this paper. He received his master's degree from the Department of Industrial and Management Systems Engineering, Waseda University, in 2020. His research interests include machine learning, statistics, and their applications.

Ryotaro Shimizu is now a doctoral student at Waseda University, Japan. He received his master's degree from the Department of Industrial and Management Systems Engineering, Waseda University, in 2019. His research interests include machine learning, artificial intelligence, and the applications of advanced analytics in the business sector.

Gendo Kumoi is a research associate in the Department of Industrial and Management Systems Engineering, Waseda University, Japan. He studies applied information mathematics, machine learning, and text mining. He is a member of IEEE and the Information Processing Society of Japan, among others.

Tomohiro Yoshikai is now working for the Japan Weather Association. He is in charge of data analysis in the field of disaster prevention using weather radar data and of product demand forecasting projects using weather information. He is a specialist in data analytics related to weather data and its application to various fields.

Masayuki Goto is a professor in the Department of Industrial and Management Systems Engineering, Waseda University, Japan. He received his Dr.E. degree from Waseda University in 2000. He studies data science, business analytics, machine learning, and Bayesian statistics. He is now a director of the Research Institute of Data Science, Waseda University. He has won best paper awards at several conferences, such as the 20th Asia Pacific Industrial Engineering and Management Systems (APIEMS 2019) and the 16th Asian Network for Quality Congress.

## ACKNOWLEDGEMENT

The authors would like to express their gratitude to the Japan Weather Association for providing the data and for their helpful comments.

## Figure

Yearly change of temperature in Japan.

Graphical model for PLSA.

Graphical model for proposed method.

Chubu region, Japan (Colored area).

## Table

Trend for store location and temperature difference in each latent class

Top three appearance probability items in class

Top three weather sensitive items when temperature ascends

Top three weather sensitive items when temperature descends

## REFERENCES

1. Abe, M. and Kondo, F. (2005), Science of Marketing - Analysis of POS Data, Asakura Shoten, in Japanese.
2. Adalbjörnsson, S., Swärd, J., Berg, M. Ö., Andersen, S. V., and Jakobsson, A. (2016), Conjugate priors for Gaussian emission PLSA recommender systems, Proceedings of the 24th European Signal Processing Conference (EUSIPCO), Budapest, Hungary.
3. Akaike, H. (1973), Information theory and an extension of the maximum likelihood principle, Proceedings of the 2nd International Symposium on Information Theory, Budapest, 267-281.
4. Bhatnagar, A. and Ghose, S. (2004), A latent class segmentation analysis of e-shoppers, Journal of Business Research, 57(7), 758-767.
5. Bishop, C. M. (2006), Pattern Recognition and Machine Learning, Springer.
6. Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003), Latent Dirichlet allocation, Journal of Machine Learning Research, 3, 993-1022.
7. Bosch, A., Zisserman, A., and Muñoz, X. (2006), Scene classification via pLSA, Proceedings of the ECCV, 517-530.
8. Fujihara, S., Ito, T., and Tanioka, K. (2012), Quantitative sociological approaches using the latent class analysis: Data Analyses of status inconsistency, attitude to social inequality, and authoritarian-conservatism, Annals of Human Sciences, 33, 43-68.
9. Goto, M. and Kobayashi, M. (2014), Introduction to Pattern Analysis and Machine Learning, Corona Inc., in Japanese.
10. Goto, M., Komiya, Y., Ishida, T., and Masui, T. (2012), A predictive model of number of customers for restaurant chain based on Bayesian model averaging, Innovation and Supply Chain Management, 6(3), 91-98.
11. Goto, M., Mikawa, K., Hirasawa, S., Kobayashi, M., Suko, T., and Horii, S. (2015), A new latent class model for analysis of purchasing and browsing histories on EC sites, Industrial Engineering & Management Systems, 14(4), 335-346.
12. Green, P. E., Carmone, F. J., and Wachspress, D. P. (1976), Consumer segmentation via latent class analysis, Journal of Consumer Research, 3(3), 170-174.
13. Hagenaars, J. A. and McCutcheon, A. L. (2002), Applied Latent Class Analysis, Cambridge University Press.
14. Hofmann, T. and Puzicha, J. (1999), Latent class models for collaborative filtering, IJCAI, 99(1999), 688-693.
15. Hofmann, T. (1999), Probabilistic latent semantic analysis, Proceedings of the UAI '99, 289-296.
16. Hofmann, T. (2004), Latent semantic models for collaborative filtering, ACM Transactions on Information Systems (TOIS), 22(1), 89-115.
17. Ishigaki, T. (2011), Automatic extraction method of category bell-dependent variable relationships from POS data with department store ID, Public Interest Corporation Association of Japan Operations Research, 56(2), 77-83.
18. Iwata, T., Watanabe, S., Yamada, T., and Ueda, N. (2009), Topic tracking model for analyzing consumer purchase behavior, Proceedings of the 21st International Joint Conference on Artificial Intelligence, 11-17.
19. Iwata, T., Watanabe, S., and Sawada, H. (2011), Fashion coordinates recommender system using photographs from fashion magazines, Proceedings of the IJCAI, 2262-2267.
20. Jermsittiparsert, K., Sutduean, J., and Sriyalul, T. (2019), Effect of service innovation and market intelligence on supply chain performance in Indonesian fishing industry, Industrial Engineering & Management Systems, 18(3), 407-416.
21. Jin, R., Si, L., and Zhai, C. (2006), A study of mixture models for collaborative filtering, Journal of Information Retrieval, 9(3), 357-382.
22. Jin, R., Si, L., and Zhai, C. X. (2003), Preference-based graphic models for collaborative filtering, Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence (UAI’03), 329-336.
23. Magidson, J. and Vermunt, J. K. (2002), Latent class models for clustering: A comparison with k-means, Canadian Journal of Marketing Research, 20(1), 37-44.
24. Miyakawa, M. (1987), The EM algorithm and its related problems, Japanese Journal of Applied Statistics, 16(1), 1-21.
25. Mohammadi, M. and Sohrabi, T. (2018), Examining the effect of marketing mix elements on customer satisfaction with mediating role of electronic customer relationship management, Industrial Engineering & Management Systems, 17(4), 653-661.
26. Nagamori, S., Mikawa, K., Goto, M., and Ogihara, T. (2019), An analytic model to represent relation between finish date of job-hunting and time series variation of entry tendencies, Industrial Engineering & Management Systems, 18(3), 292-304.
27. Nakayama, A. (2003), Consideration of sales floor placement in stores using POS data, Operations Research, 48(2), 100-106.
28. Normalini, M. K., Ramayah, T., and Shabbir, M. S. (2019), Investigating the impact of security factors in e-business and internet banking usage intention among Malaysians, Industrial Engineering & Management Systems, 18(3), 501-510.
29. Okayama, S., Yamashita, H., Mikawa, K., Goto, M., and Yoshikai, T. (2019), Relational analysis model of weather conditions and sales patterns based on nonnegative matrix factorization, International Journal of Production Research, 58(8), 2477-2489.
30. Sagawa, M. and Hirooka, Y. (2003), Marketing / Data Analysis, Asakura Shoten, in Japanese.
31. Si, L. and Jin, R. (2003), Flexible mixture model for collaborative filtering, Proceedings of the 20th International Conference on Machine Learning (ICML’03), Washington DC, 704-711.
32. Suzuki, T., Kumoi, G., Mikawa, K., and Goto, M. (2014), A design of recommendation based on flexible mixture model considering purchasing interest and post-purchase satisfaction, Journal of Japan Industrial Management Association, 64(4E), 570-578.
33. Swait, J. and Adamowicz, W. (2001), The influence of task complexity on consumer choice: A latent class model of decision strategy switching, Journal of Consumer Research, 28(1), 135-148.
34. Train, K. (2009), Discrete Choice Methods with Simulation, Cambridge University Press.
35. Tsukasa, I., Takenaka, T., and Motomura, Y. (2011a), Customer behavior prediction system by large scale data fusion in a retail service, Journal of Artificial Intelligence Society, 26(6), 670-681.
36. Tsukasa, I., Takenaka, T., and Motomura, Y. (2011b), Improvement of prediction accuracy of the number of customers by latent class model, Proceedings of the 25th Annual Conference of the Japanese Society for Artificial Intelligence, 1B3-2.
37. Ueda, S. (2004), Probabilistic model of multiple topic text: The forefront of text model research, Information Processing Society, 45(3), 282-289.
38. Xue, G. R., Dai, W., Yang, Q., and Yu, Y. (2008), Topic-bridged PLSA for cross-domain text classification, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 627-634.
39. Yamamoto, Y., Mikawa, K., and Goto, M. (2017), A proposal for classification of document data with unobserved categories considering latent topics, Industrial Engineering & Management Systems, 16(2), 165-174.