## 1. INTRODUCTION

Recently, websites and applications that provide restaurant information on the Internet have been widely deployed and are used by many people. Such sites are called restaurant guides. On a restaurant guide, users can search for restaurants, obtain information about them, and read recommendation articles posted by other users. Registered users post recommendation articles on restaurants, and these articles are a valuable information source for other users. When a general user searches for restaurants on an online restaurant guide, recommendation articles from other users help him/her choose a restaurant. Recommendation articles are therefore important assets for restaurant guides: if users post many articles and the number of good articles increases, user activity on the site increases. In addition, users who post a recommendation article regard positive reactions from other users as motivation for their next post. In other words, determining what makes an article good is helpful not only for users who post articles, but also for the company that operates the service.

Here, we conduct a case study of Retty, a well-known online restaurant guide in Japan. On Retty, general users can view many recommendation articles posted by users under their real names and react (e.g., “like” or “I want to go”) to each posted article. Because users post recommendation articles under their real names, the reliability of the information is relatively high, and recommendation articles are highly valued by general users. In addition, as on other social networking services (SNS), users can “follow” their favorite users. For users who post recommendation articles, the number of reactions to their posts is an important motivation because it represents the degree of empathy from other users. Therefore, posting users will benefit from guidelines on how to write effective recommendation sentences that increase the number of reactions. For this purpose, the characteristics of good articles can be revealed by building a model that represents the relationship between the contents of a recommendation article and the number of reactions given by other general users. A recommendation article contains a lot of information, such as restaurant information (e.g., category and budget), recommendation sentences, images, and recommendation degree. In particular, recommendation sentences allow users to describe their opinions freely from various viewpoints such as taste, customer service, and the atmosphere inside the restaurant. Moreover, the analysis results described in Section 4 can serve as a reference that makes it easier for users to write posts. In this study, we focus on the characteristics of recommendation sentences to determine how to write an effective article that increases the number of reactions.

It is known that the number of followers of a posting user has a larger influence on the number of reactions than other factors, such as the contents of a recommendation article. Therefore, we propose a method of hierarchical modeling of the relationship between the contents of a recommendation article and the number of reactions given by other general users. In the first step, we build a regression model that predicts the number of reactions by using information such as the number of followers as explanatory variables. Then, we obtain the residual (the difference between the measured value and the predicted value) for each article. In the proposed model, we assume that the residuals deviate from the baseline (i.e., the expected number of reactions that the article will get) owing to the effect of text information. In the second step, we construct a latent class model that represents the relationship between the residuals, recommendation sentences, and restaurants. The latent class model (Hagenaars and McCutcheon, 2002) enables the analysis of objects containing heterogeneous data. It has been shown in many previous studies to be useful for the analysis of free-format text and purchasing history data, and it suits the problem discussed in this paper. Thus, the purpose of this study is to develop a model for relational analysis of recommendation articles and reactions by assuming a hierarchical structure and using the latent class model. Finally, we demonstrate an analysis based on the proposed model by using practical data stored on the target restaurant guide Retty. Through this demonstration, the effectiveness of the proposed model is confirmed.

## 2. PREPARATION

### 2.1 Restaurant Guides

Restaurant guides are integrated information services available on the Internet where users can search for restaurant information and exchange information about various restaurants. In recent years, the number of consumers who use restaurant guides when choosing restaurants has rapidly increased. In Japan, restaurant guides emerged at the end of the 20th century. Originally, guide-type sites, in which the site itself introduced various restaurants, were the mainstream; however, owing to the spread of the Internet and the development of information technology, the mainstream shifted to posting systems where users themselves post recommendation articles about restaurants. As users can freely describe their impressions of restaurants from various viewpoints, the amount of information is larger than that on guide-type restaurant information sites. This large amount of posted information helps many other users select restaurants.

Next, we explain the restaurant guide Retty, which is the target site in this study. Retty is one of the most famous online restaurant guides in Japan and provides users with the function of posting recommendation articles in addition to restaurant search. A recommendation article contains a recommending degree (three levels: excellent/good/average), recommending sentences, images, time zone (morning/lunch/dinner), etc. In addition, Retty has SNS functions. First, a user registers a system account using his/her real name linked with Facebook. Then, to find restaurants through helpful users, users can “follow” their favorite users and react (e.g., “like” or “I want to go”) to recommendation articles by other users.

### 2.2 Related Works

Several studies have been conducted on the analysis of data stored on restaurant guides. For instance, Pantelidis (2010) analyzed factors increasing the degree of restaurant recommendations in restaurant guides and indicated the restaurant elements to be improved. Kang *et al*. (2012) focused on sentimental words of articles by labeling them negative or positive manually and by classifying the labels using a machine learning method. Mochizuki *et al*. (2013) proposed a method for translating a recommendation article using paraphrase based on the experience of the user who posted the article into an expression that is easy to be transmitted to a receiving user. As another perspective, Zhang *et al*. (2013) focused on the reservation behavior on a restaurant guide and analyzed factors leading to reservation. As mentioned above, several studies have been conducted from various viewpoints on data stored on restaurant guides. However, studies focusing on recommendation articles themselves and their relationship with reactions from other users have not been conducted.

### 2.3 Latent Class Model

Latent class models (Bishop, 2006; Hagenaars and McCutcheon, 2002; Magidson and Vermunt, 2002) assume the existence of an unobservable discrete latent variable behind observed variables. The assumption of latent variables makes it possible to analyze realistic complex problems such as mixed heterogeneous data. In other words, latent class models assume that the whole data are an aggregate in which groups with different characteristics are mixed. Latent class models can be extended by incorporating a hypothesized probability distribution and features of the considered event into the models according to the target event and data structure. These models have been shown in many previous studies to be useful for the analysis of text data (Blei *et al*., 2003; Hofmann, 1999; Yamamoto *et al*., 2017), collaborative filtering (Hofmann, 2004; Jin *et al*., 2003, 2006; Suzuki *et al*., 2014; Si and Jin, 2003), and the analysis of marketing data (Goto *et al*., 2015; Green *et al*., 1976; Iwata *et al*., 2009; Swait and Adamowicz, 2001; Train, 2009). In these models, data are generated by following a mixture model of several different probability distributions. This model structure fits real data consisting of heterogeneous subgroups well.

Recently, many types of latent class models have been proposed. In this section, we introduce some well-known latent class models and their applications. The unigram mixture model (Nigam *et al*., 2000) is a well-known basic document model that assumes that all terms in a document belong to a single latent class; that is, all terms in the same document are generated from the same topic. Probabilistic latent semantic analysis (PLSA) is also a well-known latent class model that assumes that data occur probabilistically from latent classes (Hofmann, 1999). In particular, when documents and words are handled, these models are also called topic models, and extensions such as Latent Dirichlet Allocation (LDA), proposed by Blei *et al*. (2003), have been studied. In addition, latent class models are widely applied to various fields such as collaborative filtering (Hofmann, 2004) and purchasing history analysis (Iwata *et al*., 2009; Goto *et al*., 2015).

## 3. PROPOSED METHOD

### 3.1 Overview

In this study, we propose a latent class model for the relational analysis between recommendation articles and the number of reactions. In particular, we focus on the influence of recommendation sentences on reactions of other users.

Here, we explain the characteristics of the proposed model. When the number of reactions is set as the response variable, the influence of basic posting information, such as the number of followers of each user and the number of images in each article, is found to be larger than the influence of the sentences. Therefore, if the basic information and the text information are handled in parallel and included as explanatory variables simultaneously, there is a risk that regression analysis will not be able to determine the influence of the text information.

Hence, we propose a hierarchical model in this study. First, we develop a regression model that predicts the number of reactions from variables of basic information such as the number of followers. Then, we obtain the residuals (the difference between the measured value and the predicted value) of the regression model. Here, the residual is regarded as a value from which the influence of basic information has been removed, and it is assumed that the residuals deviate from the baseline owing to the effect of text information. In the second step, we form clusters determined by the values of the residuals, recommendation sentences, and restaurants by using the latent class model. Note that the many types of sentences (words) and restaurants are difficult to express using a single relationship. Therefore, to learn while grouping them automatically, we build a model assuming a latent class. Figure 1 shows a conceptual image of this approach.

When we use the whole data for learning a regression function, the estimated model can overfit the data, and the residuals can be underestimated. Such residuals are not suitable for analysis based on the latent class model; therefore, we divide the data into two sets. We use one set to learn the regression function; we then apply the learned function to the other set, calculate the residuals there, and use them to learn the latent class model.

The procedure is as follows. First, we divide the whole data into the data for learning the regression function and the data for learning the latent class model. Next, we estimate the regression function *F*, apply it to the other data, and calculate each residual. Then, we build a latent class model using the information on the residual, text, and restaurant. The procedure is presented in Figure 2.

### 3.2 Regression using Basic Information (Step 1)

In the first step (Step 1), we build a predictive model that expresses the relationship between the number of reactions and the basic information about the article (e.g., the number of followers). We develop a regression model by using the number of reactions as the response variable and the basic information as explanatory variables. For each recommendation article posted by a user, the explanatory variable vector with *D* pieces of basic information on the article is denoted by $x=(x_1, x_2, \dots, x_D)^{\text{T}}$, and the number of reactions to the article is denoted by *y*. By using an arbitrary prediction function *F*, Equations (1) and (2) give the predicted value $\widehat{y}$ and the residual *r* of the article.

$$\widehat{y} = F(x) \qquad (1)$$

$$r = y - \widehat{y} \qquad (2)$$

Here, we interpret $\widehat{y}$ as a baseline for the number of reactions. The residual $r$, which cannot be explained by the basic information, is assumed to contain the effect of recommendation sentences and recommended restaurants.
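The split-and-residual procedure of Step 1 can be sketched as follows. This is only an illustration: the least-squares line is a stand-in for the arbitrary function *F*, and the function name `fit_line` and the toy (followers, reactions) pairs are invented for this example rather than taken from the study.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a simple stand-in for the arbitrary F."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

# Toy (followers, reactions) pairs, invented for illustration only.
fit_set = [(10, 3), (50, 11), (100, 21), (200, 41)]   # used to learn F
held_out = [(20, 9), (150, 25)]                        # used only for residuals

F = fit_line([x for x, _ in fit_set], [y for _, y in fit_set])

# Equation (1): y_hat = F(x); Equation (2): r = y - y_hat, on held-out data.
residuals = [y - F(x) for x, y in held_out]
print(residuals)
```

Because the residuals are computed on articles that were not used to fit *F*, they are not shrunk by overfitting, which is the motivation for the data division in Section 3.1.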

### 3.3 Modeling of the Residual Value and Article (Step 2)

In the next step (Step 2), we model the relationship between restaurants, sentences, and residuals obtained using the regression in Step 1. Owing to the diversity of sentences and restaurants, we introduce a latent class model.

#### 3.3.1 Formulation of the Model

First, we introduce notations to formulate our model. The vocabulary of words used for the analysis is denoted by $\mathcal{V}=\{w_i \mid 1\le i\le I\}$ and the restaurant set is denoted by $\mathcal{S}=\{s_j \mid 1\le j\le J\}$. The text information vector of a document is defined as $d=(d^{w_1}, d^{w_2}, \dots, d^{w_I})$, where $d^{w_i}$ is a binary variable. If the word $w_i$ appears in an article, $d^{w_i}=1$; otherwise, $d^{w_i}=0$. In addition, an unobserved latent class is denoted by $z_k\in\mathcal{Z}$, where $\mathcal{Z}=\{z_1, z_2, \dots, z_K\}$ is the set of latent classes.

Now, we focus on a recommendation article. Here, the joint probability of the residual, the restaurant of the article, and the text information is denoted by $P(r, s_j, d)$. Then, the probability model $P(r, s_j, d)$ is formulated as

$$P(r, s_j, d) = \sum_{k=1}^{K} P(z_k)\,P(r \mid z_k)\,P(s_j \mid z_k)\,P(d \mid z_k),$$

where

$$P(r \mid z_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}}\exp\left(-\frac{(r-\mu_k)^2}{2\sigma_k^2}\right), \qquad P(d \mid z_k) = \prod_{i=1}^{I} P(w_i \mid z_k)^{d^{w_i}}\,P(\overline{w}_i \mid z_k)^{1-d^{w_i}}.$$

Regarding the residual value, a normal distribution is assumed, and $\mu_k$ and $\sigma_k^2$ represent the mean and variance of the normal distribution in the latent class $z_k$, respectively. $P(s_j \mid z_k)$ is a multinomial distribution that represents the probability that a user posts a recommendation article on restaurant $s_j$ under class $z_k$. The probability of the text vector $d$ is calculated as the product of the conditional binomial probabilities of all words $w_i$, where $P(w_i \mid z_k)$ is the probability that word $w_i$ appears in class $z_k$ and $P(\overline{w}_i \mid z_k)$ is the probability that word $w_i$ does not appear. That is, $P(w_i \mid z_k)+P(\overline{w}_i \mid z_k)=1$ is satisfied.

Figure 3 shows the graphical model used in Step 2 of the proposed method.
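As a concrete reading of the model described above, the following sketch evaluates $P(r, s_j, d)$ as a sum over latent classes of a class prior, a normal density for the residual, a restaurant probability, and a product of per-word appearance probabilities. The function name and argument layout are hypothetical, chosen only for this illustration.

```python
import math

def joint_probability(r, s, d, pz, mu, var, ps, pw):
    """P(r, s, d) = sum_k P(z_k) N(r; mu_k, var_k) P(s|z_k) prod_i P(word_i state|z_k).

    pz[k]    : class prior P(z_k)
    mu, var  : per-class mean and variance of the residual
    ps[k][s] : P(s | z_k) for restaurant index s
    pw[k][i] : P(w_i | z_k), probability that word i appears
    d        : tuple of 0/1 word indicators
    """
    total = 0.0
    for k in range(len(pz)):
        normal = math.exp(-(r - mu[k]) ** 2 / (2 * var[k])) / math.sqrt(2 * math.pi * var[k])
        words = 1.0
        for i, di in enumerate(d):
            words *= pw[k][i] if di else 1.0 - pw[k][i]
        total += pz[k] * normal * ps[k][s] * words
    return total

# Single-class check: standard normal at r = 0, one restaurant, one word with P = 0.5.
value = joint_probability(0.0, 0, (1,), [1.0], [0.0], [1.0], [[1.0]], [[0.5]])
print(value)
```

For realistic vocabulary sizes the product over words underflows quickly, so a practical implementation would work with logarithms instead.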

#### 3.3.2 Learning Parameters using the Expectation-Maximization Algorithm

The parameters of the proposed model, $P(z_k)$, $P(s_j \mid z_k)$, $P(w_i \mid z_k)$, $\mu_k$, and $\sigma_k^2$, are estimated using the Expectation-Maximization algorithm (EM algorithm). The EM algorithm (Dempster *et al*., 1977; McLachlan and Krishnan, 2007) is a method of estimating parameters by an iterative procedure that locally maximizes the likelihood when the probability model depends on unobservable variables. The EM algorithm consists of two steps, the expectation step (E-step) and the maximization step (M-step), and iterates these steps until the log-likelihood function *LL* converges.

Here, for recommendation article *n*, the basic information of the article is denoted by $x_n$, the residual $r_n$ of the regression model *F* for the number of reactions $y_n$ is given by $r_n = y_n-\widehat{y}_n$, the recommended restaurant is denoted by $a_n\in\mathcal{S}$, and the text vector is denoted by $d_n$. Then, the log-likelihood of the given data is described as follows:

$$LL = \sum_{n} \log \sum_{k=1}^{K} P(z_k)\,P(r_n \mid z_k)\,P(a_n \mid z_k)\,P(d_n \mid z_k)$$

First, the E-step of the EM algorithm is formulated below:

【E-step】
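For a mixture of the form defined in Section 3.3.1, the standard E-step computes the posterior probability of each latent class given an article. A sketch in the notation of this section (the textbook form, which may differ in detail from the authors' own expression):

```latex
P(z_k \mid r_n, a_n, d_n)
  = \frac{P(z_k)\,P(r_n \mid z_k)\,P(a_n \mid z_k)\,P(d_n \mid z_k)}
         {\sum_{k'=1}^{K} P(z_{k'})\,P(r_n \mid z_{k'})\,P(a_n \mid z_{k'})\,P(d_n \mid z_{k'})}
```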

Then, based on Jensen’s inequality, we introduce a function *LL*′ that is a lower bound of *LL*.

where $\delta(C)$ is an indicator function that returns 1 if the condition $C$ is true and 0 otherwise. The parameters $P(z_k)$, $P(s_j \mid z_k)$, $P(w_i \mid z_k)$, $\mu_k$, and $\sigma_k^2$ are estimated such that the value *LL*′ is maximized and each constraint is satisfied.

Let $\eta$, $\iota_k$, and $\kappa_{k,i}$ be the Lagrange multipliers. We define the Lagrangian function *g* as follows:

Then, we set the derivatives of *g* with respect to $P(z_k)$, $P(s_j \mid z_k)$, $P(w_i \mid z_k)$, $\mu_k$, and $\sigma_k^2$ to 0. We update the estimates in the M-step as follows.

【M-step】
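Setting those derivatives to zero for a mixture of this form yields the usual weighted-average updates. The sketch below writes them in the notation of this section, under two labeled assumptions: $q_{nk}$ is shorthand introduced here for the E-step posterior $P(z_k \mid r_n, a_n, d_n)$, and $N$ denotes the number of articles used in Step 2. It is the textbook form and may differ in detail from the authors' expressions.

```latex
\begin{aligned}
P(z_k) &= \frac{1}{N}\sum_{n=1}^{N} q_{nk}, \\
P(s_j \mid z_k) &= \frac{\sum_{n} \delta(a_n = s_j)\, q_{nk}}{\sum_{n} q_{nk}}, \\
P(w_i \mid z_k) &= \frac{\sum_{n} d_n^{w_i}\, q_{nk}}{\sum_{n} q_{nk}}, \\
\mu_k &= \frac{\sum_{n} r_n\, q_{nk}}{\sum_{n} q_{nk}}, \\
\sigma_k^2 &= \frac{\sum_{n} (r_n - \mu_k)^2\, q_{nk}}{\sum_{n} q_{nk}}.
\end{aligned}
```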

The E-step and M-step are repeated until the logarithmic-likelihood function *LL* converges, and then we obtain the estimations of $P({z}_{k}),\hspace{0.17em}P({s}_{j}\text{|}{z}_{k}),\hspace{0.17em}P({w}_{i}\text{|}{z}_{k}),\hspace{0.17em}{\mu}_{k},$ and ${\sigma}_{k}^{2}$.
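The iteration described in this section can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names, the small smoothing constant `eps`, the variance floor, the random initialization, and the toy data are all additions made here so that the example runs safely on a tiny data set.

```python
import math
import random

def log_joint(k, r, s, d, p):
    """log of P(z_k) P(r|z_k) P(s|z_k) P(d|z_k) for one article."""
    var = p["var"][k]
    lp = math.log(p["pz"][k]) + math.log(p["ps"][k][s])
    lp -= 0.5 * math.log(2 * math.pi * var) + (r - p["mu"][k]) ** 2 / (2 * var)
    for i, di in enumerate(d):
        pw = p["pw"][k][i]
        lp += math.log(pw if di else 1.0 - pw)
    return lp

def e_step(data, p, K):
    """Posterior class probabilities per article, plus the log-likelihood LL."""
    resp, ll = [], 0.0
    for r, s, d in data:
        lj = [log_joint(k, r, s, d, p) for k in range(K)]
        m = max(lj)
        norm = m + math.log(sum(math.exp(v - m) for v in lj))  # log-sum-exp
        resp.append([math.exp(v - norm) for v in lj])
        ll += norm
    return resp, ll

def m_step(data, resp, K, J, I, eps=1e-3):
    """Weighted re-estimation; eps is a smoothing constant (an assumption)."""
    N = len(data)
    p = {"pz": [], "ps": [], "pw": [], "mu": [], "var": []}
    for k in range(K):
        w = [q[k] for q in resp]
        nk = sum(w)
        p["pz"].append((nk + eps) / (N + K * eps))
        p["ps"].append([(sum(wn for wn, (_, s, _) in zip(w, data) if s == j) + eps)
                        / (nk + J * eps) for j in range(J)])
        p["pw"].append([(sum(wn * d[i] for wn, (_, _, d) in zip(w, data)) + eps)
                        / (nk + 2 * eps) for i in range(I)])
        mu = sum(wn * r for wn, (r, _, _) in zip(w, data)) / max(nk, 1e-12)
        var = sum(wn * (r - mu) ** 2 for wn, (r, _, _) in zip(w, data)) / max(nk, 1e-12)
        p["mu"].append(mu)
        p["var"].append(max(var, 1e-6))  # variance floor (an assumption)
    return p

def fit(data, K, J, I, n_iter=30, seed=0):
    """Run EM from random initial responsibilities; returns parameters and LL trace."""
    rng = random.Random(seed)
    resp = []
    for _ in data:
        q = [rng.random() + 0.1 for _ in range(K)]
        resp.append([v / sum(q) for v in q])
    p, lls = m_step(data, resp, K, J, I), []
    for _ in range(n_iter):
        resp, ll = e_step(data, p, K)
        lls.append(ll)
        p = m_step(data, resp, K, J, I)
    return p, lls

# Toy data (invented): each article is (residual, restaurant index, word indicators).
rng = random.Random(1)
data = [(rng.gauss(2.0, 0.5), 0, (1, 0, 1)) for _ in range(20)] + \
       [(rng.gauss(-2.0, 0.5), 1, (0, 1, 1)) for _ in range(20)]
params, lls = fit(data, K=2, J=2, I=3)
print(params["mu"], lls[0], lls[-1])
```

In practice the loop would stop when the change in *LL* falls below a tolerance rather than after a fixed number of iterations; the fixed count is used here only to keep the sketch short.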

## 4. DATA ANALYSIS FOR REAL DATA

To verify the effectiveness of our proposed model, we demonstrate the analysis of practical data of the restaurant guide Retty.

### 4.1 Data Set and Analysis Conditions

In this analysis, we analyze recommendation article data, restaurant data, and user data stored on Retty in March and April 2016. We restricted the target data to articles that were public as of July 2017 and contained more than 50 characters. Approximately 60,000 articles were covered in each month.

First, in Step 1, we construct a function *F* that predicts the number of reactions from the basic information of the recommendation article by using the March data. As the basic information, we used three variables: “Recommending Degree,” “Number of Images,” and “Number of Followers.” As the prediction model *F*, we use the random forest regressor (Breiman, 2001) because of its good predictive performance and usability.

Next, we apply the model *F* to the April data and calculate the predicted number of reactions. Then, we obtain the residuals. In Step 2, we learn the proposed latent class model using these residual values, restaurants, and recommendation sentences. The target words consist of nouns, verbs, and adjectives that appeared in more than 30 recommendation articles in April, and the vocabulary size *I* was 6,513. We used a morphological analysis tool with a dictionary that handles compound words and new words well. In addition, instead of individual restaurants, we use the category of each restaurant defined by the service operating company, and the number of restaurant categories is *J* = 213. Furthermore, based on prior experiments, the number of latent classes was set to 14 by using the Akaike Information Criterion (AIC).
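To illustrate how AIC could be used to compare candidate numbers of latent classes, the sketch below counts free parameters of the Step 2 model and evaluates AIC from a log-likelihood. The counting convention ($K-1$ class priors, $K(J-1)$ restaurant probabilities, $K \cdot I$ word probabilities, and $2K$ normal parameters) is an assumption made here; the source does not state how the parameters were counted, and the function names are hypothetical.

```python
def n_free_params(K, J, I):
    """Free parameters under one plausible counting convention (an assumption)."""
    return (K - 1) + K * (J - 1) + K * I + 2 * K

def aic(log_likelihood, K, J, I):
    """AIC = 2 * (number of free parameters) - 2 * log-likelihood."""
    return 2 * n_free_params(K, J, I) - 2 * log_likelihood

# Parameter count for the setting reported above: K = 14, J = 213, I = 6,513.
count = n_free_params(14, 213, 6513)
print(count)
```

Under this convention, one would fit the model for several values of *K*, compute `aic` from each converged log-likelihood, and keep the *K* with the smallest value.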

### 4.2 Result of Step 1

Table 1 shows the prediction result of the root mean squared error in Step 1. In addition, Table 2 shows the normalized feature importance calculated by using the random forest algorithm.

From Table 2, we can see that the number of followers strongly affects the number of reactions, whereas the recommendation degree has only a small impact. The reason is considered as follows: on the target restaurant guide, the score of each article tends to be high. Moreover, because the policy of the site is to introduce recommended restaurants, even a low recommendation degree does not significantly affect the number of reactions.

### 4.3 Result of Step 2

Table 3 lists the learned parameters. For restaurants and words, we introduce types by examining the class membership probabilities $P(z_k \mid s_j)$ and $P(z_k \mid w_i)$. For example, if the words with the highest scores are pizza, spaghetti, etc., then the word type is “food.” To confirm the estimated parameters, we conducted a statistical test of the difference of the estimated means $\mu_z$ between the latent classes. We treated the latent classes as the 14 levels of a one-way ANOVA. The test showed a statistically significant difference in $\mu_z$ between the levels.

From Table 3, the probability of occurrence of the latent class $z_k$ is almost unbiased, and no class contains a disproportionately large share of the data. The average values of the residuals $\mu_k$ differ between the classes. Moreover, by studying the relationship between the change in the values of the residuals, restaurants, and words, we can find some tendencies.

First, in the classes $z_1$ and $z_2$, where the average residuals are relatively high, the characteristic words are not of the “food” type; therefore, these words are considered to represent other factors (appearance inside the restaurant, situation, etc.). On the other hand, in the classes from $z_{11}$ to $z_{13}$, where the average residuals are relatively low, the characteristic words are of the “food” type. Thus, it is suggested that writing not only about food but also about other elements leads to a better recommendation article.
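The class membership probabilities $P(z_k \mid s_j)$ and $P(z_k \mid w_i)$ used above to assign types are not model parameters themselves, but they follow from the learned parameters by Bayes' rule. A minimal sketch, with a hypothetical function name and invented numbers:

```python
def class_membership(p_item_given_z, p_z):
    """P(z_k | item) from P(item | z_k) and P(z_k) via Bayes' rule.

    p_item_given_z[k] is P(w_i | z_k) or P(s_j | z_k) for one fixed word or
    restaurant; p_z[k] is the class prior P(z_k).
    """
    joint = [pi * pz for pi, pz in zip(p_item_given_z, p_z)]
    total = sum(joint)
    return [v / total for v in joint]

# Invented example: a word that is three times as likely under the second class.
membership = class_membership([0.2, 0.6], [0.5, 0.5])
print(membership)
```

Sorting words or restaurant categories by this probability within each class gives the "characteristic" items that are then labeled with a type by inspection.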

Next, we focus on the latent classes $z_7$, $z_{10}$, and $z_{14}$. At a glance, the characteristic words of these classes are not associated with restaurants. By investigating the articles in which these words appear, we confirmed that the title of a personal blog is included in the post. In other words, some users copy the contents written in their personal blog and paste them on the target restaurant guide site as a recommendation article. Considering that the mean parameters $\mu_k$ of the classes $z_7$, $z_{10}$, and $z_{14}$ are negative, it can be pointed out that articles containing parts from personal blogs are not preferred in terms of reactions (i.e., empathy).

Note that there are some classes for which the interpretations of the restaurant type and the word type are the same, but the value $\mu_z$ of one class is positive and that of another is negative (e.g., $z_3$ and $z_{10}$). This is because restaurants can have different characteristics even if they belong to the same category. In this study, we applied a common regression model to all data and generated the latent classes by applying the PLSA model to the residual, restaurant, and text information of each article. Articles with many reactions and articles with few reactions have different statistical characteristics even when they belong to the same category or have the same text content. In this case, multiple latent classes can be constructed for the same restaurant type and word type, but with different residual values.

Here, we focus on the category of each restaurant. Overall, it appears easier for ordinary restaurants to obtain reactions than for restaurants that are somewhat expensive or unfamiliar.

As described above, the proposed model enables analysis of the influence of recommendation articles on the reactions of other users and the acquisition of new knowledge from the learning results.

## 5. DISCUSSIONS

### 5.1 Approach for the Model Construction

In our approach, the learning procedure for analyzing the relationship between the posted articles and the number of reactions is divided into two steps: the least-squares estimation for the regression model and the EM algorithm for the latent class model. Moreover, we adopted data division for learning the regression model and the latent class model. In terms of the regression model, the model assumed implicitly in this study can be described as follows:

$$y = F(x_1, x_2, \dots, x_p) + \xi_z + \epsilon$$

Here, $x_j$ are the explanatory variables, $F(x_1, x_2, \dots, x_p)$ is an appropriate regression model, $\epsilon$ is the error term with $\epsilon \sim N(0, \sigma_z^2)$, and $\xi_z$ is the term depending on the text information of the recommendation articles. In this assumption, the mean of the error is 0 for every class; however, the variance of the error $\sigma_z^2$ differs depending on the latent class $z$, and $\xi_z$ is the effect of the latent class $z$ that is determined by the corresponding recommendation article of a restaurant. The residual of the first step of the regression analysis is an estimator of $\xi_z+\epsilon$, which includes the error term $\epsilon$. We assume that if the term $\xi_z$ is positive, the article is a good recommendation article, and if the term $\xi_z$ is negative, the article is considered to have no more reactions than the baseline. Although we cannot know the true value of the term $\xi_z$, the residual can be assumed to be a good estimator when the error term $\epsilon$ has a zero mean and a relatively small variance.

Then, for statistical model validation, including a residual diagnostic of the error $\epsilon$, we examined the distribution of the estimated errors $\widehat{\epsilon}_i$. We discussed the adequacy of the estimated model using residual plots. In the proposed model, each data point is considered to belong to each class probabilistically; however, because this makes the residual plot complicated, we assigned each sample to its most likely latent class. That is, we applied a hard-clustering approach to the data and checked the distribution of the estimated errors for each latent class. We present histograms of the estimated errors for each class, calculating the residuals using Equation (18). The results for the classes mentioned in Section 4.3 ($z_1, z_2, z_7, z_{10}, z_{11}, z_{12}, z_{13}$, and $z_{14}$) are shown in Figures 4-11.
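The hard-clustering check can be sketched as follows. The sketch assumes that the estimated error of a sample is its residual minus the mean of its assigned class, that is, that $\mu_k$ is used as the estimate of $\xi_z$; this is an interpretation of the text rather than a formula given in it, and the function name and numbers are invented.

```python
from collections import defaultdict

def errors_by_class(residuals, resp, mu):
    """Assign each sample to its most likely class and collect r - mu_k per class.

    residuals[n] : Step 1 residual of sample n
    resp[n][k]   : posterior probability of class k for sample n (E-step output)
    mu[k]        : estimated mean residual of class k
    """
    groups = defaultdict(list)
    for r, q in zip(residuals, resp):
        k = max(range(len(q)), key=q.__getitem__)  # index of the largest posterior
        groups[k].append(r - mu[k])
    return dict(groups)

# Invented example with two classes.
grouped = errors_by_class([2.5, -1.0, 1.5],
                          [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],
                          [2.0, -1.5])
print(grouped)
```

A histogram of each list in `grouped` then corresponds to one per-class residual plot.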

As shown in the results, none of the residual distributions is extremely asymmetric, and we cannot observe abnormalities. Therefore, the results appear adequate. From these figures, the distributions are unimodal, so it seems reasonable to assume normal distributions for the errors of each latent class.

Here, we discuss the model learning approach. When modeling the number of reactions, a straightforward approach is to use the basic information and the information on texts and restaurants together. In other words, the fundamental approach is to analyze the data with a single regression model using all variables as explanatory variables. However, the number of word types in the texts is large relative to the amount of training data. In fact, the number of texts is about 60,000, and the number of word types is about 6,500. Therefore, if we analyze the data with a single regression model with 6,500 explanatory variables, the number of parameters becomes large and the model estimation may be unstable. Therefore, the approach of analyzing the data using the regression model and the latent class model is considered appropriate.

Next, we discuss the division of the data into two parts, as shown in Figure 2. The usual approach is to learn the regression model using the whole data; however, overfitting can then distort the residuals. That is, the residuals may be underestimated. Adding a regularization term such as Lasso or Ridge (Zou and Hastie, 2005) is one solution to the overfitting problem. When we adopt the regularization approach, we need to search for an appropriate value of the regularization parameter. In general, approaches such as cross-validation (Zou and Hastie, 2005) can be adopted; however, the computational cost is enormous. In addition, the estimation accuracy of the residuals is more important than the parameter estimation accuracy in this study.

Therefore, dividing the data into two parts (i.e., the data for learning the regression model and the data for learning the latent class model) can be considered the simplest and most efficient approach. Residuals estimated on data that were not used for parameter estimation are not underestimated and are useful for the next step of learning the latent class model.

### 5.2 Application of the Latent Class Model

As shown in Table 3, there are several latent classes where the estimated variance of the residual $\sigma_k^2$ is large. As the model handles the occurrence probabilities and the residuals equally, as in Figure 3, it does not fit the residuals intensively. As the dimension of the document vector is relatively large, the fit to the document vector tends to be emphasized. As a consequence of maximum likelihood estimation, latent classes with large differences in residuals are not obtained.

The approach proposed by Sakamoto *et al*. (2017) can be effective for solving the problem above. By applying this approach, the E-step of the EM algorithm is formulated as follows:

Here, $\alpha$, $\beta$, and $\gamma$ are decided in advance. However, choosing these parameters is not easy; therefore, an analysis method for efficient parameter setting is required for real data analysis.

### 5.3 Analysis of the Text Data with other Variables

In this study, we analyzed text data together with restaurant data. Many studies have analyzed not only text data, but also other variables. For example, Akita *et al*. (2016) predicted the stock price by considering not only information about the stocks, but also text data. In addition, Park *et al*. (2016) analyzed the motivation of cruising based on Twitter data and cruising information. Our study can be regarded as one instance of such studies analyzing text information along with other variables.

## 6. CONCLUSION AND FUTURE WORK

In this study, we used the number of reactions from other users as an indicator of “goodness” in order to determine what makes a recommendation article good on a restaurant guide, and proposed a model for relational analysis of articles and reactions. We modeled basic information, sentences, and restaurants hierarchically, and analyzed the relationship by using the latent class model.

In the application to actual data, the results of Step 1 showed that the influence of the number of followers on the number of reactions was large. Furthermore, in Step 2, we found that the number of reactions differed from the baseline for different types of words and restaurants, and we captured the trends. Several ways to increase the number of reactions could be pointed out from the results of the analysis. For example, it is better to write not only about food, but also about other factors. From the above, we demonstrated the effectiveness of the proposed method.

For future work, it is necessary to verify and improve the estimation accuracy of the number of reactions. In this study, the estimated parameters are convincing; however, the prediction accuracy is not high. That is, the variance of the residuals is still large in the parameter estimation. If this variance can be reduced, the reliability of the analysis result will increase.