1. INTRODUCTION
In retail stores such as supermarkets, predicting fluctuations in consumer demand is an important task for coping with changes in daily life. In recent years, large amounts of purchase history data have accumulated, and there is a growing need to use them to predict daily demand fluctuations and to acquire information that leads to marketing strategies (Sagawa and Hirooka, 2003; Tsukasa et al., 2011a; Abe and Kondo, 2005). Conventionally, questionnaire surveys have often been used for analysis in the marketing field (for example, see Mohammadi and Sohrabi, 2018; Jermsittiparsert et al., 2019; Normalini et al., 2019). Although questionnaire surveys remain useful, accumulated purchase history data have also become an important information source for marketing analysis. For example, by analyzing POS (point of sales) data, which record when, where, which, and how many items are sold, it is possible to predict consumer demand, leading to efficient strategies (Nakayama, 2003; Goto et al., 2012).
However, such demand fluctuation is thought to be affected by changes in consumer preferences and external factors. In particular, store characteristics and weather conditions, which are closely related to consumer behavior, are thought to have strong effects on item demands. For this reason, it is very important to quantitatively grasp demand fluctuations for items that are influenced by changes in weather conditions (what we call “weather sensitive items”) for each store using an integrated analysis of purchase history data from many stores and weather conditions. However, in the case of a retailer chain that has many stores, construction of prediction models for every store is problematic in that it leads to very complicated models, and there is a possibility that there will not be a sufficient amount of data to ensure the creation of each model.
On the other hand, the latent class model (Green et al., 1976; Hofmann and Puzicha, 1999; Swait and Adamowicz, 2001; Bhatnagar and Ghose, 2004; Bishop, 2006; Goto and Kobayashi, 2014; Goto et al., 2015) is well known as an effective model for analyzing marketing data, which are a collection of observations with different characteristics. In the latent class model, latent variables are assumed between observation variables, which makes it possible to model the relation between heterogeneous data. When the latent class model is applied to purchase history data with latent variables assumed between users and items, it is possible to model a heterogeneous group of users and items. In the marketing field, the assumption that the market consists of several user segments with similar preferences and several item groups with similar characteristics is usually reasonable. Therefore, the latent class model is highly compatible with market analysis. It also enables one to understand a huge number of users and items in terms of latent variables (Hofmann and Puzicha, 1999; Ishigaki, 2011). Since the influence of fluctuations in weather conditions on item demands can differ depending on the characteristics of stores, such as regional characteristics and the main types of customers, an analytical model that takes this issue into account is necessary.
In this research, we propose a latent class model to express the relationship between weather conditions, store characteristics, and item demand fluctuation. This makes it possible to analyze the co-occurrence relationships of these variables. In addition, through an analysis experiment using an actual data set, we show the usefulness of the proposed model by extracting weather sensitive items. As a result, it is possible to quantitatively understand fluctuations in demand, which will help marketing strategies, including item management and inventory control, in real stores.
2. PREPARATION
In this section, we discuss weather conditions, store characteristics, item categories, and the analysis period as a preparation for modelling.
2.1 Weather Conditions
In this research, we focus on temperature difference, which is the difference in average temperature from the previous day, as a weather condition affecting item sales. Even if the average temperature on one day is the same as another day, consumer feelings may differ from one day to another. This is because their feelings about the weather depend on its difference from the previous day. Consumers tend to feel “cold” if it was warm on the previous day; conversely, they feel “warm” if it was cold. As mentioned above, an analysis focusing only on average temperature cannot account for the effects of temperature change.
Therefore, temperature difference can serve as a useful index for a marketing strategy. This is because the average temperature on the previous day, which serves as the reference, is an observed variable, so we can focus on the rise or fall relative to that observed temperature. In fact, temperature difference is a weather condition that can be easily analyzed in real stores.
2.2 Store Characteristics
In order to extract store characteristics, we make use of the customer purchase data for each store. Normally, because the characteristics of each store’s main customer differ between stores and depend on location conditions, items demanded are also different. Therefore, we try to take into account a store’s characteristics in constructing a model of item demands.
In addition, when we analyze the relation between weather conditions and item demand in each store, it leads to the discovery of other stores with similar trends and weather conditions. This analysis may contribute to identifying stores with similar potential demand trends. For example, let us assume that sales trends and the effect of weather conditions at Store A are similar to those of Store B. In this example, it is thought that items that are selling well in Store B will potentially grow as items that can also be sold at Store A. In this way, it is considered that potential demand trends can be extracted by finding similar stores.
2.3 Item Categories and Analysis Periods
In this research, we focus on perishables as the item category to be studied. The demand for perishables is considered to be affected by weather conditions, and inventory control for these items is important because long-term preservation is difficult. Several aspects of the processing and packaging methods of perishables provided to consumers can be changed at the operation level, such as in each store. For this reason, these items can flexibly respond to changes in demand, which is useful information that directly leads to merchandising strategy.
As the analysis period in this study, a short period within a year is targeted, such as a month. Figure 1 shows the yearly changes in the average temperature in Japan.
As Figure 1 shows, Japan has four seasons, and there is a considerable difference in temperature depending on the season. Therefore, the sales of seasonal items correlate strongly with temperature over a long-term period, such as the twelve months of a year. For this reason, it is thought to be difficult to extract items that are not affected by season from an analysis of sales amounts over a long-term period. Though the analysis of seasonal products is also important, this research focuses on the demand fluctuation of perishables, which should be managed at the operation level.
2.4 Related Work
We focus on the problem of how to analyze the relationship between weather conditions, store characteristics, and item demand fluctuation. The demand forecasting problem for individual products has been discussed for a long time, and many studies have been conducted. Linear regression with shop information and weather conditions as explanatory variables is the conventional prediction model with the simplest structure. Regression models based on various machine learning methods can be applied to the same problem. In addition, time series models, such as the AR (autoregressive), ARMA (autoregressive moving average), and ARIMA (autoregressive integrated moving average) models, can be applied. All of these approaches construct demand forecasting models for each item and shop independently. However, such modeling is not effective in practice because there are many chain stores and many items to be managed. It is difficult for field-level retail staff to make use of a vast number of predictive models during their hectic daily tasks.
On the other hand, Okayama et al. (2019) proposed an analytical model based on Nonnegative Tensor Factorization, with a tensor represented by a three-dimensional array of ‘dates’, ‘items’, and ‘stores’. This method provides a straightforward model representing the relationship between weather conditions, store location, and items. Their model makes it possible to identify important items for each shop cluster and allows a multifaceted analysis. However, because their model is based on tensor decomposition, it is not a generative model. This paper proposes a latent class model that represents the relationship between items, temperature differences, and stores and that admits a probabilistic interpretation.
A latent class model (Hagenaars and McCutcheon, 2002; Magidson and Vermunt, 2002; Bishop, 2006) is an effective tool for analyzing realistic complex problems, such as those in which heterogeneous data are mixed. It assumes that the data are generated from a mixture of several different probability distributions. The latent class model has been shown in many previous studies to be useful for the analysis of text data (Hofmann, 1999; Blei et al., 2003; Ueda, 2004; Xue et al., 2008; Yamamoto et al., 2017), collaborative filtering (Jin et al., 2003; Si and Jin, 2003; Hofmann, 2004; Jin et al., 2006; Suzuki et al., 2014; Adalbjörnsson et al., 2016), and the analysis of marketing data (Green et al., 1976; Swait and Adamowicz, 2001; Iwata et al., 2009; Train, 2009; Goto et al., 2015; Nagamori et al., 2019).
3. PROPOSED ANALYSIS METHOD
In this research, we propose a latent class model (Bishop, 2006; Goto and Kobayashi, 2014) that enables an analysis of the co-occurrence of items, temperature differences, and store characteristics from accumulated purchase history and weather data. To analyze the overall sales trend for all stores, a method that simultaneously clusters stores and items with high similarity in terms of weather sensitivity is considered very useful. Although there are many methods for hard clustering, the latent class model is particularly effective for soft clustering. To cluster stores and items at the same time, it is effective to construct a statistical model with appropriate probability distributions for both. This is why the latent class model is applied to this problem.
Latent class models have been applied not only in the information science field, such as purchase history analysis (Sagawa and Hirooka, 2003), document search problems (Hofmann, 1999), and image recognition (Bosch et al., 2006), but also in various fields such as predicting the number of customers visiting a store (Tsukasa et al., 2011b), fashion coordination recommendations (Iwata et al., 2011), and metrological sociology (Fujihara et al., 2012). Building on the idea of Probabilistic Latent Semantic Analysis (Hofmann, 1999), which is a well-known latent class model for analyzing purchase history data, we extend the model to the problem in this research.
3.1 Conception of Proposal
Because retail chains operate stores in multiple regions, demand trends and weather conditions differ from store to store. Therefore, the effects of the weather conditions on demand may differ depending on the combination of stores and items. Here, let J be the number of items and $\mathcal{X}=\{{x}_{j}:1\le j\le J\}$ be the set of items, and let I be the number of stores and $\mathcal{S}=\{{s}_{i}:1\le i\le I\}$ be the set of stores. The objective is to analyze the demand changes caused by the weather condition, taking into account the characteristics of these items and stores.
In the analysis of items, store characteristics, and weather conditions, each variable affects the others, because item demand fluctuations are affected by store characteristics and weather conditions. However, because a huge number of items are sold in a store, individual regression analyses focusing on each item cannot maintain estimation accuracy from the viewpoint of sample size. The same is true for stores: for a retailer that operates many stores, it is not practical to deal with these stores separately and build a demand model for each one. Stores normally have similarities, so constructing the same demand model for similar stores is expected to increase the number of data samples available for model estimation and thus to improve prediction accuracy. Therefore, we focus on an analysis that includes a clustering function. This analysis clusters stores and items for which weather conditions and demand fluctuations are similar.
From these points of view, we propose a latent class model to analyze the co-occurrence of items, store characteristics, and weather conditions by supposing a latent class behind them. This makes it possible to analyze stores with similar demand fluctuations for items due to weather conditions, to extract items with high weather sensitivity in similar store groups, and to express potential demand fluctuations for items due to changes in weather conditions.
This might directly lead to merchandising strategies such as methods of displaying items and systems for processing items. Also, if we can grasp beforehand which changes in weather conditions are related to the sales of item A, it is possible to properly manage the inventory of item A based on the weather forecast. In addition, if we clarify ahead of time items with a high weather sensitivity at other stores that are similar to a target store, we can identify an item whose sales can potentially change even at the target store. That is, by adding information on other stores, the model can be useful for the merchandising of a target store.
3.2 Preparation for Modeling
Probabilistic Latent Semantic Analysis (PLSA) is well known as a representative latent class model (Hofmann, 1999). When this model is applied to purchase history data, supposing a latent class behind items and users enables an analysis of the co-occurrence of items and users. Let H be the number of users and $\mathcal{Y}=\{{y}_{h}:1\le h\le H\}$ be the set of users, and let K be the number of latent classes and $\mathcal{Z}=\{{z}_{k}:1\le k\le K\}$ be the set of latent classes. Figure 2 shows the graphical model of PLSA when applying it to purchase history data.
The model is based on equation (1), and we estimate the parameters $p(z_{k})$, $p(x_{j}\mid z_{k})$, and $p(y_{h}\mid z_{k})$ in the learning step of the model. Because the latent variable $z_{k}$ cannot be observed, the parameters are estimated by using the EM algorithm (Miyakawa, 1987).
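Equation (1) is not reproduced in this text. Assuming the standard symmetric formulation of PLSA (Hofmann, 1999) and the notation defined above, it would take the following form, which is consistent with the three parameter sets named here. This is a reconstruction under that assumption, not a quotation.

$$
p(x_{j}, y_{h}) = \sum_{k=1}^{K} p(z_{k})\, p(x_{j}\mid z_{k})\, p(y_{h}\mid z_{k}) \tag{1}
$$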
3.3 Modeling
This research proposes a latent class model to express the co-occurrence relationship between temperature differences, store characteristics, and items by expanding the PLSA model. This model enables the analysis of the kind of area and temperature differences that will affect item demand.
Here, let t be the temperature difference. The proposed probabilistic model is defined by equation (2), shown below.
This model represents the probability of the event in which item $x_{j}$ is purchased at store $s_{i}$ on a day whose temperature difference is $t$. That is, the occurrence of the data $({x}_{j},\hspace{0.17em}{s}_{i},\hspace{0.17em}t,\hspace{0.17em}{z}_{k})$ means that this event happens. A graphical model for the proposed model is shown in Figure 3.
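Equation (2) does not appear in this text. Given the description of the data $(x_{j}, s_{i}, t, z_{k})$ and assuming the same conditional-independence structure as PLSA (items, stores, and temperature difference independent given the latent class), a form consistent with the text would be the following. The factor $p(t\mid z_{k})$ denotes the class-conditional density of the temperature difference.

$$
p(x_{j}, s_{i}, t, z_{k}) = p(z_{k})\, p(x_{j}\mid z_{k})\, p(s_{i}\mid z_{k})\, p(t\mid z_{k}) \tag{2}
$$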
Here, we suppose multinomial distributions for items $p(x_{j}\mid z_{k})$ and store characteristics $p(s_{i}\mid z_{k})$. The multinomial distribution is the most flexible probabilistic model for qualitative random variables, so this assumption is reasonable. Also, we suppose a normal distribution, shown in equation (3), for the temperature difference $t$.
In equation (3), $\mu_{k}$ and $\sigma_{k}^{2}$ denote the mean and variance of the $k$th normal distribution, respectively. The joint probability distribution of item $x_{j}$, store $s_{i}$, and temperature difference $t$ is given by
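Equations (3) and (4) are not reproduced in this text. Under the stated assumptions (a normal density with mean $\mu_{k}$ and variance $\sigma_{k}^{2}$ for $t$, and marginalization of the latent class), they would read as follows. Equation (4) is obtained by summing the four-variable probability over $z_{k}$.

$$
p(t\mid z_{k}) = \frac{1}{\sqrt{2\pi\sigma_{k}^{2}}}\exp\!\left(-\frac{(t-\mu_{k})^{2}}{2\sigma_{k}^{2}}\right) \tag{3}
$$

$$
p(x_{j}, s_{i}, t) = \sum_{k=1}^{K} p(z_{k})\, p(x_{j}\mid z_{k})\, p(s_{i}\mid z_{k})\, p(t\mid z_{k}) \tag{4}
$$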
3.3 Parameter Estimation
Let $N$ be the total number of purchased items, and let ${a}_{n}(\in \mathcal{X})$ be the $n$th purchased item, ${b}_{n}(\in \mathcal{S})$ be the store where that purchase was made, and $c_{n}$ be the temperature difference of that day. The $n$th purchase record is given by the co-occurrence $({a}_{n},\hspace{0.17em}{b}_{n},\hspace{0.17em}{c}_{n})$, and the log-likelihood function for parameter estimation is given by
which is calculated by using equation (4). The parameters of the proposed model that locally maximize the log-likelihood (5) are estimated by using the following EM algorithm.
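The log-likelihood (5) is not reproduced in this text. Assuming that the $N$ purchase records are independent and that each follows the joint distribution of items, stores, and temperature difference described above, it would take the following form.

$$
LL = \sum_{n=1}^{N} \log p(a_{n}, b_{n}, c_{n})
   = \sum_{n=1}^{N} \log \sum_{k=1}^{K} p(z_{k})\, p(a_{n}\mid z_{k})\, p(b_{n}\mid z_{k})\, p(c_{n}\mid z_{k}) \tag{5}
$$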
(E-step)
(M-step)
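The update formulas (6) to (11) are not reproduced in this text. For a mixture with multinomial factors for items and stores and a normal factor for temperature difference, the standard EM updates would be as follows. The assignment of equation numbers is an assumption based on the order in which the parameters are listed in the procedure, and $\gamma_{nk}$ is a shorthand introduced here for the posterior of the E-step.

$$
\gamma_{nk} \equiv p(z_{k}\mid a_{n}, b_{n}, c_{n}) = \frac{p(z_{k})\, p(a_{n}\mid z_{k})\, p(b_{n}\mid z_{k})\, p(c_{n}\mid z_{k})}{\sum_{k'=1}^{K} p(z_{k'})\, p(a_{n}\mid z_{k'})\, p(b_{n}\mid z_{k'})\, p(c_{n}\mid z_{k'})} \tag{6}
$$

$$
p(z_{k}) = \frac{1}{N}\sum_{n=1}^{N}\gamma_{nk} \tag{7}
$$

$$
p(x_{j}\mid z_{k}) = \frac{\sum_{n=1}^{N}\delta(a_{n}, x_{j})\,\gamma_{nk}}{\sum_{n=1}^{N}\gamma_{nk}} \tag{8}
$$

$$
p(s_{i}\mid z_{k}) = \frac{\sum_{n=1}^{N}\delta(b_{n}, s_{i})\,\gamma_{nk}}{\sum_{n=1}^{N}\gamma_{nk}} \tag{9}
$$

$$
\mu_{k} = \frac{\sum_{n=1}^{N} c_{n}\,\gamma_{nk}}{\sum_{n=1}^{N}\gamma_{nk}} \tag{10}
$$

$$
\sigma_{k}^{2} = \frac{\sum_{n=1}^{N} (c_{n}-\mu_{k})^{2}\,\gamma_{nk}}{\sum_{n=1}^{N}\gamma_{nk}} \tag{11}
$$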
In equations (8) and (9), $\delta (x,\hspace{0.17em}y)$ is an indicator function that takes 1 if x = y, and 0 otherwise. For the derivation of each formula used in the EM algorithm, see Appendix A. The procedure of parameter estimation based on the EM algorithm is shown as follows:
[The procedure of parameter estimation based on the EM algorithm]

Step 1: Randomly initialize the M-step parameters $p(z_{k})$, $p(a_{n}\mid z_{k})$, $p(b_{n}\mid z_{k})$, $\mu_{k}$, and $\sigma_{k}^{2}$.

Step 2: Calculate $p(z_{k}\mid a_{n},\hspace{0.17em}b_{n},\hspace{0.17em}c_{n})$ via equation (6) as the E-step.

Step 3: Calculate $p(z_{k})$, $p(a_{n}\mid z_{k})$, $p(b_{n}\mid z_{k})$, $\mu_{k}$, and $\sigma_{k}^{2}$ via equations (7) to (11) as the M-step.

Step 4: Calculate the log-likelihood by equation (5).

Step 5: If the log-likelihood shown in equation (5) converges, that is, the change rate of LL becomes sufficiently small, finish the procedure and output the parameters. Otherwise, return to Step 2.
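The five steps above can be sketched in Python with NumPy as follows. This is a minimal illustration, not the program used in the experiment: the function name `fit_latent_class`, the Dirichlet initialization, the smoothing constant `eps`, the variance floor, and the relative-change stopping rule are all assumptions added for the sketch. Items and stores are assumed to be encoded as integer indices.

```python
import numpy as np

def fit_latent_class(a, b, c, J, I, K, n_iter=200, tol=1e-6, seed=0, eps=1e-12):
    """EM sketch for p(x, s, t) = sum_k p(z_k) p(x|z_k) p(s|z_k) N(t; mu_k, var_k).

    a, b : integer index arrays (item, store) of length N >= K.
    c    : array of temperature differences of length N.
    Returns the estimated parameters and the log-likelihood history.
    """
    a, b, c = np.asarray(a), np.asarray(b), np.asarray(c, dtype=float)
    N = a.shape[0]
    rng = np.random.default_rng(seed)

    # Step 1: random initialization (the scheme itself is an assumption).
    pz = np.full(K, 1.0 / K)
    px = rng.dirichlet(np.ones(J), size=K)      # shape (K, J), rows sum to 1
    ps = rng.dirichlet(np.ones(I), size=K)      # shape (K, I), rows sum to 1
    mu = rng.choice(c, size=K, replace=False)   # K observations used as initial means
    var = np.full(K, max(c.var(), 1e-6))

    history = []
    for _ in range(n_iter):
        # Step 2 (E-step): posterior p(z_k | a_n, b_n, c_n), computed in log space.
        log_gauss = -0.5 * (np.log(2.0 * np.pi * var) + (c[:, None] - mu) ** 2 / var)
        log_joint = np.log(pz) + np.log(px[:, a].T) + np.log(ps[:, b].T) + log_gauss
        top = log_joint.max(axis=1, keepdims=True)
        log_mix = top[:, 0] + np.log(np.exp(log_joint - top).sum(axis=1))
        resp = np.exp(log_joint - log_mix[:, None])          # shape (N, K)

        # Step 4: log-likelihood of the parameters used in this E-step.
        history.append(float(log_mix.sum()))

        # Step 3 (M-step): weighted relative frequencies and Gaussian moments.
        nk = resp.sum(axis=0)
        pz = (nk + eps) / (nk + eps).sum()
        for k in range(K):
            cx = np.bincount(a, weights=resp[:, k], minlength=J) + eps
            cs = np.bincount(b, weights=resp[:, k], minlength=I) + eps
            px[k], ps[k] = cx / cx.sum(), cs / cs.sum()
        nk = np.maximum(nk, eps)
        mu = resp.T @ c / nk
        var = np.maximum((resp * (c[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)

        # Step 5: stop when the relative change of the log-likelihood is small.
        if len(history) > 1 and abs(history[-1] - history[-2]) <= tol * abs(history[-2]):
            break
    return pz, px, ps, mu, var, history
```

Working in log space avoids underflow when the product of four small probabilities is formed for each record. Because EM only finds a local maximum, running the sketch from several seeds and keeping the run with the highest final log-likelihood would be a reasonable precaution.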
3.4 The Analysis Process with the Proposed Model
This section describes, from a general point of view, the design of the analytical process with the proposed model, so that marketing managers can better understand how to make use of it. In recent years, most retail companies operating many chain stores have introduced point of sales (POS) systems and accumulated purchase history data for each customer at the level of individual receipts. In this research, it is assumed that this POS data can be utilized for the analysis. The general process of the analysis is as follows:

Step 1: The sales records of all sold items at each retail store are acquired from the raw POS data.

Step 2: Meteorological data is independently obtained, and the average temperature difference from the previous day is calculated from the daily average temperature data.

Step 3: From the sales date information of each sold item, the information of the temperature difference is attached to the event that an item was sold at a store. Thus, the combinations of an item, a store, and the temperature difference are made for all sold items.

Step 4: The proposed model is trained on the created combinations of items, stores, and temperature differences, and the trained latent classes are acquired.

Step 5: Each latent class is interpreted through the estimated conditional probabilities of items and stores given the latent class.

Step 6: The relationship between items, stores, and temperature differences is analyzed by using the probabilities and the meanings of each latent class.
For specific analysis viewpoints and methods, please refer to the analysis experiment shown in the next chapter.
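Steps 1 to 3 of this process can be sketched as follows using only the Python standard library. The record layouts are hypothetical: sales are assumed to arrive as `(date, store_id, item_id)` tuples and weather as hourly temperatures keyed by store and date. The function names are illustrative, and dropping sales on days with no previous-day temperature is an assumption of the sketch.

```python
from datetime import date, timedelta
from statistics import mean

def daily_temperature_difference(hourly):
    """hourly: {date: [hourly temperatures]} for one store.

    Returns {date: mean(today) - mean(previous day)} for every day
    whose previous day is also present in the data.
    """
    daily = {d: mean(temps) for d, temps in hourly.items()}
    one_day = timedelta(days=1)
    return {d: daily[d] - daily[d - one_day] for d in daily if d - one_day in daily}

def build_triples(sales, hourly_by_store):
    """sales: iterable of (date, store_id, item_id) records.

    Returns a list of (item_id, store_id, temperature_difference) triples,
    one per sold item, skipping sales without a temperature difference.
    """
    diffs = {s: daily_temperature_difference(h) for s, h in hourly_by_store.items()}
    triples = []
    for day, store, item in sales:
        t = diffs.get(store, {}).get(day)
        if t is not None:
            triples.append((item, store, t))
    return triples

# Usage with toy data (values are made up for illustration).
hourly_by_store = {"S1": {date(2013, 3, 1): [10.0, 12.0], date(2013, 3, 2): [14.0, 16.0]}}
sales = [(date(2013, 3, 2), "S1", "spinach"), (date(2013, 3, 1), "S1", "tofu")]
triples = build_triples(sales, hourly_by_store)
```

In the toy data, the March 1 sale is dropped because no temperature for the previous day is available, and the March 2 sale receives a temperature difference of 4.0.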
4. ANALYSIS EXPERIMENT
To confirm the effectiveness of the proposed method, we perform an analysis experiment with the proposed model by using weather data and the real purchase history data of a Japanese retailer that operates many stores. This retailer is a typical supermarket company operating many grocery stores, mainly in the Chubu region of Japan.
Since the proposed model is an analytical model based on the latent class model, which has a soft clustering function, a direct comparison with other methods is difficult. Although there are various hard clustering methods, most of them assume data in vector format, so the raw data must first be converted into vectors. Because we cluster combinations of a store and an item while considering the temperature difference, conventional clustering methods that assume vector-format data cannot be directly applied; doing so would require constructing an appropriate vector space, which is not an easy task. Therefore, it is difficult to present a clear comparison with other methods. In this chapter, we show the result of the proposed model applied to the real purchase history data.
4.1 Experimental Condition
The observation period for the purchase history data is from March 1 to March 31, 2013. The total number of stores is I = 174, the total number of items is J = 3,745, and the total number of purchases is N = 37,166,162. The purchase history data used in the experiment are POS data of grocery stores, which sell the perishables that are generally sold in a supermarket. Unlike typical POS data, customer attributes such as gender or age are not provided; only the date and time of each purchase and the item names are available. Regarding weather data, hourly temperature data are provided based on the location of each shop. We derive the daily average temperature by averaging them, and we obtain the temperature difference by subtracting the previous day's average from it. From these two data sources, we construct a dataset as described above and use it to train the model. For parameter estimation in the learning phase, we implemented a program based on the derived update formulas. The training time depends on machine performance; on our single-threaded CPU machine, it took a few hours. In addition, for the number of latent classes, we thoroughly searched for an appropriate number with a high interpretability of latent class characteristics, and it was set to K = 8. A high interpretability of the demand model is most important for the target retailer to be able to introduce the model into their operations. In fact, most retailers find it challenging to adopt models that are complex and difficult for shopkeepers to interpret. Therefore, in this research, the number of latent classes was decided from the viewpoint of the interpretability of the model. By repeating preliminary experiments with different numbers of latent classes, we interpreted the latent classes for every setting and selected the best setting.
4.2 Evaluation Method
The purpose of this research is to extract items that are especially influenced by weather conditions by analyzing the co-occurrence of items, stores, and temperature difference. On the other hand, a preliminary analysis confirmed the existence of items that are purchased in a certain amount on a daily basis, regardless of the temperature difference. Therefore, assuming the hypothesis that “there are classes that are not affected by temperature difference,” we extract highly weather sensitive items by comparing items with those in the classes that are not sensitive to temperature difference. That is, we define weather sensitive items as items that appear in classes affected by temperature difference and do not appear in classes that are not affected by temperature difference.
4.3 Experiment Results and Considerations
The probability that each data sample belongs to each latent class can be calculated by using the estimated latent class model. Therefore, the characteristics of each latent class are clarified by averages based on the belonging probabilities of all data samples. Table 1 shows the result for temperature difference and store characteristics for each latent class. Here, the temperature difference of each class is summarized from its mean $\mu_{k}$ and variance $\sigma_{k}^{2}$ as + when the temperature rises, − when it falls, and ± when both could occur. It is also visualized as a bar chart in the table. As for store location, the top 15 stores in terms of belonging probability for each class are plotted on the map.
From the result of Table 1, the characteristics of each latent class are confirmed in terms of temperature difference and store location. Regarding temperature difference, because the classes $z_{1}$ to $z_{3}$ cover both rising and falling temperatures, the items and stores that are less affected by temperature difference appear in these classes. It can be said that these classes correspond to the hypothesis described in the previous section. The classes $z_{4}$ to $z_{6}$ and $z_{7}$ to $z_{8}$ can be interpreted, respectively, as classes in which sales of items rise when the temperature difference ascends and descends.
In analyses of purchase history data, the estimated probabilities of latent classes $p(z_{k})$ are often concentrated on several latent classes. This is a common phenomenon observed in many cases. Because the event that the proposed model represents is the co-occurrence of item, store, and temperature difference, even the latent classes with relatively small probabilities are estimated from many samples. One record in the target data is a purchasing log of an item by a customer, and the total number of purchases in the training data set is N = 37,166,162. Even if the probability of a latent class is 0.005, the number of events belonging to this latent class is more than 185,000. Compared with the probabilities of other latent classes, those of several latent classes look small, but these latent classes are not meaningless.
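The figure quoted above follows directly from the expected number of records assigned to a class, $N \cdot p(z_{k})$, evaluated at the illustrative probability $0.005$:

$$
N \cdot p(z_{k}) = 37{,}166{,}162 \times 0.005 \approx 185{,}831 .
$$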
For the analysis result of the items, we first consider the result for the latent classes $z_{1}$ to $z_{3}$. Table 2 shows the three items with the highest appearance probability in each of the classes $z_{1}$ to $z_{3}$.
Since the items shown in Table 2 have a high appearance probability in the classes $z_{1}$ to $z_{3}$, it can be said that these are the items that increase in sales regardless of the temperature difference. “Bean sprouts” and “cucumber,” for example, appear across multiple classes, such as in the classes $z_{1}$ and $z_{2}$ or $z_{1}$ and $z_{3}$. Because each affiliated region differs for each latent class, these are considered to be items that increase in sales regardless of not only temperature difference, but also store location. On the other hand, “spinach” and “broccoli,” for example, do not appear across multiple classes, which means these are items that increase in sales regardless of only temperature difference. In other words, these are the items that are specific to each latent class.
Subsequently, we extract weather sensitive items from each latent class. Tables 3 and 4 show the top three weather sensitive items when the temperature ascends and descends, respectively. Note that, when selecting weather sensitive items from among the items with high appearance probabilities in the classes $z_{4}$ to $z_{8}$, the top 30 items in terms of appearance probability in the classes $z_{1}$ to $z_{3}$ are excluded, because the sales of these items increase irrespective of temperature difference.
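The exclusion rule described above can be sketched as follows. The function name and argument layout are illustrative, and the sketch assumes that the estimated item probabilities $p(x_{j}\mid z_{k})$ are held in a NumPy array `px` of shape (K, J) and that the top-30 exclusion is applied per insensitive class and then pooled; the text does not state whether the exclusion list is pooled or per class.

```python
import numpy as np

def weather_sensitive_items(px, insensitive, sensitive, exclude_top=30, top_n=3):
    """Rank items in temperature-sensitive classes after excluding everyday items.

    px          : array of shape (K, J) holding p(x_j | z_k).
    insensitive : class indices not affected by temperature difference.
    sensitive   : class indices affected by temperature difference.
    Returns {class index: [item indices]} with at most top_n items per class.
    """
    px = np.asarray(px, dtype=float)

    # Pool the top items of every insensitive class into one exclusion set.
    excluded = set()
    for k in insensitive:
        order = np.argsort(-px[k], kind="stable")
        excluded.update(int(j) for j in order[:exclude_top])

    # For each sensitive class, keep the highest-probability items not excluded.
    result = {}
    for k in sensitive:
        order = np.argsort(-px[k], kind="stable")
        ranked = [int(j) for j in order if int(j) not in excluded]
        result[k] = ranked[:top_n]
    return result

# Toy usage: class 0 is insensitive, class 1 is sensitive (made-up numbers).
px_toy = [[0.5, 0.3, 0.1, 0.1, 0.0],
          [0.4, 0.1, 0.1, 0.3, 0.1]]
picked = weather_sensitive_items(px_toy, insensitive=[0], sensitive=[1],
                                 exclude_top=2, top_n=2)
```

In the toy example, items 0 and 1 are excluded as everyday items of class 0, so the sensitive class 1 yields items 3 and 2 in that order.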
In the winter to early spring season in Japan such as from December to March, Japanese tend to eat a hot pot called “nabe” when they feel cold. Nabe contains a lot of vegetables including the items listed in Table 4. Items in Table 4, by the way, are weather sensitive items that have a tendency to increase in terms of sales when the temperature difference descends. From this point of view, the results of Table 4 can be easily understood intuitively.
From Tables 3 and 4, in the classes $z_{4}$ and $z_{8}$, “pork for shabu-shabu” and “pork for ginger pork” are extracted, respectively. Also, in the classes $z_{5}$, $z_{6}$, and $z_{8}$, “seaweed sashimi,” “octopus sashimi,” and “assorted sashimi” are extracted, respectively. Generally, perishables like pork and seafood are stocked in blocks and processed in each store, so that the method of provision can be changed according to the predicted demand for each store. Thus, even with the same foodstuff, it is possible to respond to demand by changing the processing method according to the characteristics of the class, i.e., its correspondence to weather conditions.
From the results mentioned above, specific strategies can be proposed. For example, for the stores in which the belonging probability to class $z_{1}$ is high, spinach sales increase regardless of temperature difference, so strategies such as putting spinach on display at the shop front are considered possible. In the same way, even when the same pork is considered, the stores in which the belonging probability to class $z_{2}$ is high can cope with demand fluctuation by processing pork for shabu-shabu. Conversely, stores in which the belonging probability to class $z_{8}$ is high can cope with demand fluctuation by processing pork for ginger pork. In addition, for sashimi, stores in which the belonging probability to $z_{5}$ or $z_{6}$ is high can cope with demand fluctuation by serving it as a single item; conversely, stores in which the belonging probability to $z_{8}$ is high can cope with demand fluctuation by serving the assorted ingredients.
As a result, we confirmed that the analysis of stores with similar demand fluctuations based on weather conditions and the extraction of items with high weather sensitivity are effective. Additionally, we confirmed that concrete strategies can be derived from the results.
5. DISCUSSION
In marketing strategy for retail stores, stores prefer a broad analysis to a detailed analysis based on concrete numerical values and events, because the former is more practical. For example, information such as how much sales will increase when the temperature rises by one degree is impractical for a retailer. Therefore, a broad analysis, such as listing the items with high weather sensitivity, seems to be preferred over a detailed analysis, such as a regression analysis that predicts the growth of demand for items. From this point of view, the clustering of items that are sensitive to weather conditions, which we produce in this research, is considered useful information. In the daily management of many items in grocery stores, operators cannot focus on a specific item and have to manage various kinds of items to create an attractive assortment. That is why we have not focused on the demand prediction problem for each item in this study. The purpose of this research is to construct a model that can analyze the relationship between the items, the stores, and the temperature difference from a holistic viewpoint. Once the important items for management have been identified, it becomes possible to consider predicting the demand for each of them at the store level. Since it is difficult to treat this type of micro-analytical model at the same time in this study, analysis from the micro viewpoint is left as future work.
As an example of applying the proposed method to other data, it can be considered applicable to the sales data of restaurants and purchase history data in clothes shops. In restaurants, it is necessary to procure ingredients according to the amount of orders expected on that day. Therefore, the loss of foodstuffs can be reduced by identifying dishes with high weather sensitivity. Also, in clothes shops, it will be possible to extract items with high weather sensitivity, which can lead to marketing strategies such as which items are displayed in conspicuous positions within shops. In particular, since changes in temperature are intense at the turn of each season, an efficient display can be carried out by extracting items with high weather sensitivity.
6. CONCLUSION AND FUTURE WORKS
This research proposed a latent class model that analyzes the co-occurrence of items, weather conditions, and store characteristics by extending the PLSA model. In the modeling, we focused on temperature difference as the weather condition because how consumers feel depends on the temperature of the previous day, and because it is a useful measure in real marketing strategy.
Also, by applying the proposed method to real purchase history and weather data, we quantitatively analyzed demand fluctuation in a way that accounts for weather conditions and store characteristics. In particular, by analyzing perishable items, we attempted an analysis that would lead to real marketing strategy. As a result, we confirmed the utility of the proposed method by clustering stores with similar demand fluctuations and by extracting highly weather-sensitive items. Finally, based on the results, we showed the possibility of connecting the model to the marketing strategies of real stores at the operational level.
Future work includes determining the proper number of latent classes using a measure such as the Akaike Information Criterion (AIC) (Akaike, 1973), extending the model to consider other weather conditions and to quantify weather sensitivity, and studying how best to utilize the results obtained.
APPENDIX A
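Here, the derivation of the EM algorithm for the proposed model is described. First, the following notation is introduced. Let $N$ be the number of observations, and let $A=({a}_{1},\ldots ,{a}_{N})$, $B=({b}_{1},\ldots ,{b}_{N})$, $C=({c}_{1},\ldots ,{c}_{N})$, and $V=({v}_{1},\ldots ,{v}_{N})$ denote the sequences of items, stores, temperature differences, and latent classes, respectively. The sets of $J$ items, $I$ stores, and $K$ latent classes are written as $\mathcal{X}=\{{x}_{1},\ldots ,{x}_{J}\}$, $\mathcal{S}=\{{s}_{1},\ldots ,{s}_{I}\}$, and $\mathcal{Z}=\{{z}_{1},\ldots ,{z}_{K}\}$.
The probability of the complete data set $(A,\,B,\,C,\,V)$ is given by
$$p(A,\,B,\,C,\,V)=\prod_{n=1}^{N}p({v}_{n})\,p({a}_{n}\mid {v}_{n})\,p({b}_{n}\mid {v}_{n})\,p({c}_{n}\mid {v}_{n}),$$
where ${a}_{n}\in \mathcal{X},\,{b}_{n}\in \mathcal{S},\,{c}_{n}\in \mathbb{R},\,{v}_{n}\in \mathcal{Z}$, and $p({c}_{n}\mid {v}_{n}={z}_{k})$ is the density of the normal distribution with mean ${\mu}_{k}$ and variance ${\sigma}_{k}^{2}$.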
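E-step:
The latent class data $V$ cannot be observed, so the posterior probability $p(V\mid A,\,B,\,C)$ is computed, using the current parameter estimates, in order to take the expectation of the log-likelihood:
$$p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})=\frac{p({z}_{k})\,p({a}_{n}\mid {z}_{k})\,p({b}_{n}\mid {z}_{k})\,p({c}_{n}\mid {z}_{k})}{\sum_{k'=1}^{K}p({z}_{k'})\,p({a}_{n}\mid {z}_{k'})\,p({b}_{n}\mid {z}_{k'})\,p({c}_{n}\mid {z}_{k'})}.$$
Using this probability, the Q-function can be formulated as follows:
$$Q=\sum_{n=1}^{N}\sum_{k=1}^{K}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\log \left\{p({z}_{k})\,p({a}_{n}\mid {z}_{k})\,p({b}_{n}\mid {z}_{k})\,p({c}_{n}\mid {z}_{k})\right\}.$$
Here, letting $\delta (\alpha ,\,\beta )$ be the indicator function that takes the value 1 if $\alpha =\beta $ and 0 otherwise, the following formulas are derived:
$$\log p({a}_{n}\mid {z}_{k})=\sum_{j=1}^{J}\delta ({a}_{n},{x}_{j})\log p({x}_{j}\mid {z}_{k}),\qquad \log p({b}_{n}\mid {z}_{k})=\sum_{i=1}^{I}\delta ({b}_{n},{s}_{i})\log p({s}_{i}\mid {z}_{k}).$$
Therefore, the Q-function can be formulated as follows:
$$Q=\sum_{n=1}^{N}\sum_{k=1}^{K}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\left\{\log p({z}_{k})+\sum_{j=1}^{J}\delta ({a}_{n},{x}_{j})\log p({x}_{j}\mid {z}_{k})+\sum_{i=1}^{I}\delta ({b}_{n},{s}_{i})\log p({s}_{i}\mid {z}_{k})-\frac{1}{2}\log (2\pi {\sigma}_{k}^{2})-\frac{{({c}_{n}-{\mu}_{k})}^{2}}{2{\sigma}_{k}^{2}}\right\}.$$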
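As an illustration of the E-step, the following is a minimal sketch in plain Python, assuming that each observation factorizes as $p({z}_{k})\,p({a}_{n}\mid {z}_{k})\,p({b}_{n}\mid {z}_{k})$ times a normal density for ${c}_{n}$. The function name `e_step`, the argument names, and the encoding of items and stores as integer indices are hypothetical choices made for this sketch and do not come from the paper. All probabilities are assumed to be strictly positive.

```python
import math


def gaussian_log_pdf(c, mu, var):
    """Log density of a univariate normal N(mu, var) evaluated at c."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (c - mu) ** 2 / var)


def e_step(items, stores, temps, p_z, p_x_given_z, p_s_given_z, mu, var):
    """Compute r[n][k] = p(z_k | a_n, b_n, c_n) and the observed log-likelihood.

    items[n] and stores[n] are integer indices of a_n and b_n (an assumed
    encoding); temps[n] is c_n. p_x_given_z[k][j] = p(x_j | z_k) and
    p_s_given_z[k][i] = p(s_i | z_k). Probabilities must be > 0.
    """
    n_classes = len(p_z)
    resp = []
    log_lik = 0.0
    for a, b, c in zip(items, stores, temps):
        # Log of the joint p(a_n, b_n, c_n, z_k) for every class k.
        log_joint = [
            math.log(p_z[k])
            + math.log(p_x_given_z[k][a])
            + math.log(p_s_given_z[k][b])
            + gaussian_log_pdf(c, mu[k], var[k])
            for k in range(n_classes)
        ]
        # Log-sum-exp keeps the normalization stable for tiny densities.
        m = max(log_joint)
        log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint))
        resp.append([math.exp(v - log_norm) for v in log_joint])
        log_lik += log_norm
    return resp, log_lik
```

Working in log space is a design choice of this sketch rather than something stated in the derivation; it avoids underflow when the normal density of ${c}_{n}$ is very small for some classes.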
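M-step:
Under the following constraints, the Q-function is maximized with respect to the parameters:
$$\sum_{k=1}^{K}p({z}_{k})=1,\qquad \sum_{j=1}^{J}p({x}_{j}\mid {z}_{k})=1,\qquad \sum_{i=1}^{I}p({s}_{i}\mid {z}_{k})=1\quad (k=1,\ldots ,K).$$
The method of Lagrange multipliers is applied to solve this optimization problem.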
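(1) Optimization for $p({z}_{k})$: From
$$\frac{\partial}{\partial p({z}_{k})}\left\{Q+\lambda \left(1-\sum_{k'=1}^{K}p({z}_{k'})\right)\right\}=\frac{1}{p({z}_{k})}\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})-\lambda =0,$$
we have
$$p({z}_{k})=\frac{1}{\lambda}\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n}).$$
Because ${\sum}_{k=1}^{K}p\left({z}_{k}\right)=1$, the equation
$$\frac{1}{\lambda}\sum_{k=1}^{K}\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})=1$$
is satisfied. Therefore, we have
$$\lambda =\sum_{n=1}^{N}\sum_{k=1}^{K}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})=N,$$
so that
$$p({z}_{k})=\frac{1}{N}\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})$$
is given.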
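(2) Optimization for $p({x}_{j}\mid {z}_{k})$ and $p({s}_{i}\mid {z}_{k})$: From
$$\frac{\partial}{\partial p({x}_{j}\mid {z}_{k})}\left\{Q+\sum_{k'=1}^{K}{\eta}_{k'}\left(1-\sum_{j'=1}^{J}p({x}_{j'}\mid {z}_{k'})\right)\right\}=\frac{1}{p({x}_{j}\mid {z}_{k})}\sum_{n=1}^{N}\delta ({a}_{n},{x}_{j})\,p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})-{\eta}_{k}=0,$$
we have
$$p({x}_{j}\mid {z}_{k})=\frac{1}{{\eta}_{k}}\sum_{n=1}^{N}\delta ({a}_{n},{x}_{j})\,p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n}).$$
Because ${\sum}_{j=1}^{J}p\left({x}_{j}\mid {z}_{k}\right)=1$, the equation
$$\frac{1}{{\eta}_{k}}\sum_{j=1}^{J}\sum_{n=1}^{N}\delta ({a}_{n},{x}_{j})\,p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})=1$$
is satisfied. Therefore, since ${\sum}_{j=1}^{J}\delta ({a}_{n},{x}_{j})=1$, we have
$${\eta}_{k}=\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n}),$$
so that
$$p({x}_{j}\mid {z}_{k})=\frac{\sum_{n=1}^{N}\delta ({a}_{n},{x}_{j})\,p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}$$
is given. Similarly, we have
$$p({s}_{i}\mid {z}_{k})=\frac{\sum_{n=1}^{N}\delta ({b}_{n},{s}_{i})\,p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}.$$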
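(3) Optimization for ${\mu}_{k}$:
From
$$\frac{\partial Q}{\partial {\mu}_{k}}=\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\,\frac{{c}_{n}-{\mu}_{k}}{{\sigma}_{k}^{2}}=0,$$
we have
$${\mu}_{k}=\frac{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\,{c}_{n}}{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}.$$
(4) Optimization for ${\sigma}_{k}^{2}$:
From
$$\frac{\partial Q}{\partial {\sigma}_{k}^{2}}=\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\left\{-\frac{1}{2{\sigma}_{k}^{2}}+\frac{{({c}_{n}-{\mu}_{k})}^{2}}{2{({\sigma}_{k}^{2})}^{2}}\right\}=0,$$
we have
$${\sigma}_{k}^{2}=\frac{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})\,{({c}_{n}-{\mu}_{k})}^{2}}{\sum_{n=1}^{N}p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})}.$$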
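The four M-step optimizations can be illustrated with a minimal plain-Python sketch, assuming the posterior probabilities $p({z}_{k}\mid {a}_{n},{b}_{n},{c}_{n})$ from the E-step are already available as a nested list `resp[n][k]`. The function name `m_step`, the argument names, and the integer encoding of items and stores are hypothetical and not taken from the paper. The sketch also assumes that every latent class receives a nonzero total posterior weight, since otherwise the divisions below are undefined.

```python
def m_step(items, stores, temps, resp, n_items, n_stores):
    """Closed-form M-step updates given responsibilities resp[n][k].

    items[n] and stores[n] are integer indices of a_n and b_n (an assumed
    encoding); temps[n] is c_n. Returns (p_z, p_x_given_z, p_s_given_z,
    mu, var), where p_x_given_z[k][j] = p(x_j | z_k) and so on.
    """
    n_obs = len(resp)
    n_classes = len(resp[0])
    p_z, p_x_given_z, p_s_given_z, mu, var = [], [], [], [], []
    for k in range(n_classes):
        # Total posterior weight of class k (denominator of every update).
        weight = sum(resp[n][k] for n in range(n_obs))
        p_z.append(weight / n_obs)

        # Posterior-weighted counts of each item and store under class k.
        x_counts = [0.0] * n_items
        s_counts = [0.0] * n_stores
        for n in range(n_obs):
            x_counts[items[n]] += resp[n][k]
            s_counts[stores[n]] += resp[n][k]
        p_x_given_z.append([v / weight for v in x_counts])
        p_s_given_z.append([v / weight for v in s_counts])

        # Posterior-weighted mean and variance of the temperature difference.
        mu_k = sum(resp[n][k] * temps[n] for n in range(n_obs)) / weight
        var_k = sum(
            resp[n][k] * (temps[n] - mu_k) ** 2 for n in range(n_obs)
        ) / weight
        mu.append(mu_k)
        var.append(var_k)
    return p_z, p_x_given_z, p_s_given_z, mu, var
```

For example, with four observations that are assigned with certainty to two classes, `m_step([0, 0, 1, 1], [0, 1, 0, 1], [-2.0, -4.0, 3.0, 5.0], [[1, 0], [1, 0], [0, 1], [0, 1]], 2, 2)` yields class priors of 0.5 each, class means of -3 and 4, and class variances of 1.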
Yuto Seko was a graduate student at Waseda University in Japan at the time of writing this paper. He received his master's degree from the Department of Industrial and Management Systems Engineering, Waseda University, in 2020. His research interests include machine learning, statistics, and their applications.
Ryotaro Shimizu is now a doctoral student at Waseda University, Japan. He received his master's degree from the Department of Industrial and Management Systems Engineering, Waseda University, in 2019. His research interests include machine learning, artificial intelligence, and the applications of advanced analytics in business.
Gendo Kumoi is a research associate in the Department of Industrial and Management Systems Engineering, Waseda University, Japan. His research fields are applied information mathematics, machine learning, and text mining. He is a member of IEEE and the Information Processing Society of Japan, among others.
Tomohiro Yoshikai now works for the Japan Weather Association. He is in charge of data analysis in the field of disaster prevention using weather radar data and of product demand forecasting projects using weather information. He is a specialist in data analytics related to weather data and its application to various fields.
Masayuki Goto is a professor in the Department of Industrial and Management Systems Engineering, Waseda University, Japan. He received his Dr.E. degree from Waseda University in 2000. His research fields are data science, business analytics, machine learning, and Bayesian statistics. He is now a director of the Research Institute of Data Science, Waseda University. He has won best paper awards at several conferences, such as the 20th Asia Pacific Industrial Engineering and Management Systems (APIEMS 2019) and the 16th Asian Network for Quality Congress.