1. INTRODUCTION
One of the most interesting and challenging topics in finance at present is high-frequency trading (HFT). HFT is relatively new to most financial markets, and its activity is becoming more common all over the world; by 2010 it accounted for more than half of equity trading in the United States (Menkveld and Yueshen, 2013).
HFT is not a strategy but a technology that allows the automation of a wide spectrum of trading strategies, propelled by ongoing advances in computer technology. HFT algorithms form probabilistic predictions of security price movements and seek to profit from small price movements by holding a given trading volume for a very short time (Bencheci and Botvin, 2014).
HF traders earn very small profits per share, but given the large daily trading volume, the aggregate profits are considerable. HF traders benefit from their physical proximity to exchange platforms, which minimizes latency, and usually end the trading day with a “flat” position (Chordia et al., 2013).
High-frequency traders (HFTs) have become a potent force in many markets, representing between 40 and 70% of the trading volume in US futures and equity markets, and slightly less in European, Canadian and Australian markets. The potential profit per share from any single execution may be very small and may be achieved with a probability only slightly above 50%, but HFTs rely on this process being repeated thousands of times a day, if not more. As the law of large numbers and the central limit theorem relentlessly take hold, profits ensue and presumably justify the HFTs’ large investment in trading technology (Ait-Sahalia and Saglam, 2017).
HFTs condition their strategies on order book depth imbalances, which are a strong predictor of future price changes. An examination of the order book imbalance immediately before each order submission, cancellation and trade shows that HFTs supply liquidity on the thick side of the order book and demand liquidity from the thin side. This strategic behavior is more pronounced during volatile periods and when trading speeds increase. However, by competing with non-HFT limit orders, HFTs crowd out non-HFT limit orders.
This study investigates the performance of an order-imbalance-based trading strategy in high-frequency trading. It adds to the empirical literature on order imbalance effects in stock returns in several ways.
First, there is no literature on imbalance-return relations for Iranian stocks. One difference between the Iranian data and most other countries' data is that every trade is flagged as either buyer- or seller-initiated, which avoids the errors introduced by trade classification algorithms.
Second, most of the literature uses time-series regressions over short intervals such as seconds, minutes and days; in this study, we use a regression over a longer interval.
Third, there appears to be no literature on a recent sample of order imbalances. Because stock markets worldwide have become more efficient, it is of interest whether the effects reported for past decades still hold at daily frequencies; we therefore provide results for recent day-to-day effects. Fourth, we report liquidity effects in the imbalance-return relation. Fifth, in line with the literature, we provide strong evidence that order imbalances predict short-term future price movements. Finally, in contrast to the previous literature, we find imbalance effects to be weaker for very high levels of order imbalance.
2. LITERATURE REVIEW
High-frequency trading has been studied extensively; however, a long list of questions remains open and requires further research.
Ait-Sahalia and Saglam (2017) analyze the consequences for liquidity provision of competing market makers operating at high frequency. Their results show that the policies they examine are largely unable to induce high-frequency market makers to provide liquidity that is robust to volatility (Ait-Sahalia and Saglam, 2017).
Goldstein et al. (2017) examine the order book imbalance immediately before each order submission, cancellation and trade, and show that high-frequency traders (HFT) supply liquidity on the thick side of the order book and demand liquidity from the thin side. This strategic behavior is more pronounced during volatile periods and when trading speeds increase. However, by competing with non-HFT limit orders, HFT impose a welfare externality by crowding out slower non-HFT limit orders. Overall, they document an important information channel driving HFT behavior (Goldstein et al., 2017).
Stav (2015), in a study entitled ‘High-Frequency Trade Direction Prediction’, notes that high-frequency trading involves large volumes and rapid price changes. The results of that study are compared with the previous literature in the high-frequency context. Some previous literature shows that idiosyncratic risk has an important effect on low-frequency trading, but its effects on high-frequency trading have not yet been investigated (Stav, 2015).
Lehalle and Mounjid (2016) emphasize exposure to adverse selection, which is of paramount importance for limit orders. For a participant buying with a limit order, if the price is likely to go down, the probability of being filled is high, but it is better to wait a little longer before trading to obtain a better price. To the authors’ knowledge, their paper is the first to connect empirical evidence, a stochastic framework for limit orders that includes adverse selection, and the cost of latency. Their work is a first step toward shedding light on the roles of latency and adverse selection in limit order placement within an accurate stochastic control framework.
Subrahmanyam and Zheng (2016), using a unique dataset of limit order placements, executions and cancellations on Nasdaq, find that HFT firms do not cancel orders more frequently than non-HFT firms. HFT firms use order cancellation more effectively than non-HFT firms to manage their limit orders strategically in anticipation of short-term price movements. HFT firms increase their liquidity provision during periods of high volatility, and their liquidity provision is less affected by order imbalance shocks than that of non-HFT firms. Overall, their results indicate that HFT limit orders exert a stabilizing influence on markets (Subrahmanyam and Zheng, 2016).
Bonart and Gould (2016) use a recent, high-quality data set from Nasdaq to perform an empirical analysis of order flow in a limit order book (LOB) before and after the arrival of a market order. For each of the stocks they study, they identify a sequence of distinct phases across which the net flow of orders differs considerably. Based on their findings, they argue that strategic liquidity providers consider both adverse selection and expected waiting costs when deciding how to act (Bonart and Gould, 2016).
Dinh (2016) investigates the relationship between returns, risk and liquidity in high-frequency trading, using panel analysis for single stocks. The empirical results imply that in high-frequency trading idiosyncratic risk plays a more pronounced role in asset pricing than systematic risk. These results add to the previous literature in the high-frequency context: some earlier work suggests that idiosyncratic risk matters in low-frequency trading, but its effects on high-frequency trading had not yet been investigated. High-frequency traders are considered to be market agents that base their trading activity on information about prices and order flow, and they usually trade against price pressure (Dinh, 2016).
Brogaard et al. (2016) examine the stability of liquidity supply by high-frequency traders, who are under no obligation to supply liquidity during stressful periods. They find that HFTs supply liquidity to non-HFTs during extreme price moves in a single security but demand liquidity when several stocks experience extreme price moves simultaneously. Thus, HFT may be supplying liquidity on Nasdaq while demanding liquidity from other trading venues. By analyzing a mostly consolidated market, they provide further insights into HFT activity over the whole market (Brogaard et al., 2016).
Multivariate GARCH (MGARCH)-type models used in previous studies include the BEKK MGARCH (Willcocks, 2010; Miao et al., 2011) and the Dynamic Conditional Correlation (DCC) model (Antonakakis et al., 2015).
Furthermore, Benghazi et al. (2016) used DCC and BEKK GARCH models to test volatility spillover among global Real Estate Investment Trusts (REITs) and found that the REIT market is becoming increasingly globalized.
Gardebroek and Hernandez (2012) used a multivariate GARCH model to evaluate the level of interdependence and the dynamics of volatility across oil, ethanol and corn markets. Their results indicate a higher interaction between ethanol and corn markets in recent years, particularly after 2006. The authors did not find major cross-volatility effects from oil to corn markets, and the results did not provide evidence that volatility in energy markets stimulates price volatility in grain markets (Gardebroek and Hernandez, 2012).
Ntakaris et al. (2017) describe the prediction of metrics in high-frequency financial markets as a challenging task. One efficient approach is to monitor the dynamics of the limit order book in order to identify an information edge. They define an experimental protocol that can be used to evaluate the performance of related research methods. Baseline results based on linear and nonlinear regression models are also provided and show the potential of these methods for mid-price prediction (Ntakaris et al., 2017).
Shen (2015), examining order imbalance, a measure of the difference in size between buy and sell orders in the market, developed a simple trading strategy by fitting a linear model using ordinary least squares against a 20-time-step (10-second) average price change. Confidence intervals were then determined for the optimal regression and trading parameters, namely the forecast window for the average price change and the trading threshold, which were found to be close to 5 and 0.15, respectively (Shen, 2015).
In contrast to the above work by Shen (2015) and Ntakaris et al. (2017), which describes the relation between exchange rate changes and order imbalance with a regression model, we propose a GARCH model that can capture the time-varying nature of the relation.
Cartea et al. (2018) use high-frequency data from the Nasdaq exchange to build a measure of volume imbalance in the limit order (LO) book. They show that their measure is a good predictor of the sign of the next market order (MO), i.e., buy or sell, and also helps to predict price changes immediately after the arrival of an MO. They further show that introducing the volume imbalance measure into the optimization problem considerably boosts the profits of the strategy. Profits increase because employing the imbalance measure reduces adverse selection costs and positions LOs in the book to take advantage of favorable price movements.
Chen et al. (2019) examine short-run exchange rate dynamics in a small open economy, Taiwan, based on the microstructure framework of foreign exchange markets. The study develops a contrarian imbalance-based trading strategy that exploits the negative relation between lagged order imbalances and current returns. They find that the imbalance-based strategy with large order imbalances consistently outperforms the benchmark, and that trading performance is asymmetric between periods of currency appreciation and depreciation (Chen et al., 2019).
3. DATA
Our dataset includes stocks traded on the Tehran Stock Exchange from April 1, 2014, until March 30, 2016 (1095 trading days). For all stocks, the last available quotes before the closing auction, together with order imbalances, are available on a daily basis.
Any private information is assumed to be reflected efficiently in the stock price, which eliminates the gap between share price and intrinsic value. This characteristic helps improve the reliability of the research results. The transaction data are sourced from the Tehran Stock Exchange, and the sample period runs from April 1, 2014, until March 30, 2016. Stocks are included or excluded according to the following criteria:

2. A trade or quote is excluded if it is recorded before the opening time or after the closing time (i.e., intraday data are collected from 9:00 AM to 2:00 PM).

4. Any quote less than 5 seconds prior to the trade is ignored, and the first one at least 5 seconds prior to the trade is retained.
3.1 Sample Selection
The statistical population includes all companies listed on the Tehran Stock Exchange from 2014 to 2016. The final sample is determined by screening after applying the following constraints:

2) The company has been listed on the exchange since at least 2014 and remains active until the end of the research period.
4. METHODOLOGY
A trade is classified as buyer-initiated (seller-initiated) if it is closer to the ask (bid) of the prevailing quote. Any quote less than 5 seconds before the trade is excluded, and the first one at least 5 seconds before the trade is used as the prevailing quote. If the trade is exactly at the midpoint of the quote, a “tick test” classifies the trade as buyer-initiated (seller-initiated) if the last price change before the trade is positive (negative).
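The classification rule described above can be sketched in a few lines. This is a minimal illustration only: the function names, the tuple layout of the quotes and the use of seconds as the time unit are assumptions made for the example, not part of the original study.

```python
from bisect import bisect_right


def prevailing_quote(quotes, trade_time, min_age=5.0):
    """Return the latest quote that is at least `min_age` seconds older than the trade.

    `quotes` is a list of (time, bid, ask) tuples sorted by time (assumed layout).
    Returns None when no quote is old enough.
    """
    times = [q[0] for q in quotes]
    idx = bisect_right(times, trade_time - min_age) - 1
    return quotes[idx] if idx >= 0 else None


def classify_trade(price, bid, ask, last_price_change):
    """Return +1 for buyer-initiated, -1 for seller-initiated, 0 if undetermined."""
    mid = (bid + ask) / 2.0
    if price > mid:  # closer to the ask
        return 1
    if price < mid:  # closer to the bid
        return -1
    # Trade exactly at the midpoint: fall back to the tick test.
    if last_price_change > 0:
        return 1
    if last_price_change < 0:
        return -1
    return 0
```

A trade at the midpoint with no preceding price change stays unclassified (0) in this sketch; how such trades were treated in the study is not stated.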
4.1 Variables
4.1.1 Order Imbalance
There are three main methods in the literature for calculating order imbalance: 1) the number of buy and sell orders; 2) the size of orders (i.e., the number of shares in each order); and 3) the value of orders, obtained by multiplying the order size by the current share price. Many researchers use the first method, sometimes combined with the second. Ravi and Sha (2014) observed a significant relationship between returns and order imbalance when the latter is calculated using the number-based measure.
We therefore calculate the order imbalance for stock i in a given period t as
where ${M}_{i,t}^{buyer}$ is the number of buyer-initiated trades in the given period, ${M}_{i,t}^{seller}$ is the number of seller-initiated trades in the given period and ${M}_{i,t}^{total}$ is the total number of trades for stock i in the given period t at the frequency {5 sec, 10 sec, 30 sec, 60 sec, 5 min, 10 min, 30 min, 1 h, 2 h, 1 day, 2 days, 1 week} (Rubisov, 2015).
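The displayed formula is not shown in the text. Given the three quantities defined above, a natural reading is the normalized difference $({M}_{i,t}^{buyer} - {M}_{i,t}^{seller}) / {M}_{i,t}^{total}$; this is an assumption based on those definitions. A minimal sketch, with a hypothetical function name and a +1/-1 encoding of trade direction chosen for the example:

```python
def order_imbalance(trade_signs):
    """Number-based order imbalance for one stock and one period.

    `trade_signs` holds +1 for each buyer-initiated trade and -1 for each
    seller-initiated trade (assumed encoding). Returns None for an empty
    period, where the ratio is undefined.
    """
    total = len(trade_signs)
    if total == 0:
        return None
    buys = sum(1 for s in trade_signs if s > 0)
    sells = sum(1 for s in trade_signs if s < 0)
    return (buys - sells) / total
```

Under this reading the measure lies between -1 (all trades seller-initiated) and +1 (all trades buyer-initiated).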
4.1.2 Returns
We calculate daily log returns based on the last mid-quotes before the closing auction:
Here, ask_{i,t} is the last ask quote for stock i before the closing auction of period t and bid_{i,t} is the corresponding bid quote. Using mid-quotes instead of traded prices avoids bid-ask effects, which would induce negative first-order autocorrelation in returns (Rubisov, 2015). As the literature review in Section 2 shows, a large number of papers investigate the relation between order imbalances and returns. To examine intraday time-varying relations between return and order imbalance, we employ a GARCH model. The GARCH model extends the ARCH model: in addition to lagged squared error terms, it includes lags of the conditional variance, which reduces the number of parameters required to model persistence in volatility.
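The return formula itself is not shown in the text. A minimal sketch of the mid-quote log return described at the start of this subsection, assuming the mid-quote is the simple average of ask and bid and the return is the log of the ratio of consecutive mid-quotes (the function name is hypothetical):

```python
import math


def mid_quote_log_returns(bids, asks):
    """Log returns of consecutive mid-quotes; the result has one fewer element than the inputs."""
    if len(bids) != len(asks):
        raise ValueError("bids and asks must have the same length")
    mids = [(bid + ask) / 2.0 for bid, ask in zip(bids, asks)]
    return [math.log(cur / prev) for prev, cur in zip(mids, mids[1:])]
```

For example, closing quotes of 9/11 on one day and 10/12 on the next give mid-quotes of 10 and 11 and a log return of ln(1.1).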
The GARCH specification used here allows for a directional effect of price changes on the conditional variance. An important benefit of this model is that it distinguishes between positive and negative returns and thereby captures potential asymmetry in volatility arising from the direction of returns. It has a better fit than the symmetric GARCH model for almost all financial assets (Alexander, 2009).
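The text does not state which asymmetric GARCH variant is meant. Purely as a reference point, the sketch below implements the variance recursion of one common choice, GJR-GARCH(1,1), in which negative shocks receive an extra weight. The choice of GJR, the function name and all parameter names are assumptions for illustration and should not be read as the specification estimated in the paper.

```python
def gjr_garch_variance(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances from a GJR-GARCH(1,1) recursion (illustrative assumption).

    variance[t] = omega + (alpha + gamma * 1[eps < 0]) * eps**2 + beta * variance[t-1],
    where eps is the residual of period t-1. The output has the same length as
    `residuals`; its first element is `initial_variance`.
    """
    variances = [initial_variance]
    for eps in residuals[:-1]:
        leverage = gamma if eps < 0 else 0.0  # extra weight on negative shocks
        variances.append(omega + (alpha + leverage) * eps * eps + beta * variances[-1])
    return variances
```

With gamma set to zero the recursion reduces to the symmetric GARCH(1,1) model, which makes the asymmetry easy to test by comparing the two fits.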
In order to calculate volatility, return and order imbalance relations, we employ a GARCH model:
where
The multiple regression model used to estimate lagged return-order imbalance relations is presented below:
where R_{t} is the return of the sample stock in period t,
$OI_{t-j},\; j=1,2,3,4,5$, are the lagged order imbalances of the sample stock (in periods t-1, t-2, t-3, t-4 and t-5),
$a$ and $b_{j},\; j=1,2,3,4,5$, are the intercept and the coefficients of the lagged order imbalance variables, and
ε_{t} is the residual of the stock return in period t.
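The displayed regression is missing from the text. Based only on the variable definitions above, and assuming the five lags enter linearly with a single intercept, it would take the following form:

$$R_{t} = a + \sum_{j=1}^{5} b_{j}\, OI_{t-j} + \varepsilon_{t}$$

Under this form, a positive and significant $b_{j}$ would indicate that a higher order imbalance j periods earlier is associated with a higher current return.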
Following Chordia et al. (2002), a multiple regression model is employed to examine the relationship between stock return and lagged order imbalance.
The model can reveal potential predictability in stock returns: if the relationship between stock returns and lagged order imbalance can be determined, the lagged imbalance can be used to build an imbalance-based trading strategy.
To determine the relation between market capitalization and the impact of order imbalance, we use a size-effect regression model:
where Coefficient_{k} is the coefficient describing the effect of order imbalance on the return of stock k,
Cap_{k} is the market capitalization of stock k,
ε_{k} is the residual of stock k, and
α and β are coefficients.
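The size-effect equation is not shown in the text. From the definitions above, and assuming a simple linear cross-sectional regression, it would read:

$$\mathrm{Coefficient}_{k} = \alpha + \beta\, \mathrm{Cap}_{k} + \varepsilon_{k}$$

The results section also refers to logged market capitalization; in that variant $\mathrm{Cap}_{k}$ would presumably be replaced by $\ln \mathrm{Cap}_{k}$. A negative $\beta$ would mean that the effect of order imbalance on returns is weaker for larger firms.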
Financial reality shows that price and return movements in one market can spread very quickly to another market, i.e., financial markets are interrelated. Consequently, a set of multivariate GARCH-type models has been specified to model the covariances between asset returns over time: the VECH model (Soenen and Hennigar, 1988), the BEKK model (Lawal and Ijirshar, 2015), and the constant and dynamic conditional correlation (DCC) models (Griffin and Stulz, 2001).
In our case, we use the diagonal BEKK model to test for the volatility transmission between oil and food markets. The BEKK model has the form (Bartov and Bodnar, 1994):
where A_{kj}, B_{kj} and C are N × N parameter matrices and C is triangular, H_{t} = [h_{ijt}] is the conditional covariance matrix of r_{t}, r_{t} is an N × 1 stochastic vector process, and q and p are the ARCH and GARCH orders.
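The displayed equation is missing from the text. The symbols defined above match the standard BEKK parameterization, which, under the assumption that this is the form intended, can be written as:

$$H_{t} = C'C + \sum_{j=1}^{q}\sum_{k=1}^{K} A_{kj}'\, r_{t-j}\, r_{t-j}'\, A_{kj} + \sum_{j=1}^{p}\sum_{k=1}^{K} B_{kj}'\, H_{t-j}\, B_{kj}$$

Here $K$, which is not defined in the text, is an added symbol for the number of terms that determines the generality of the process. Because every term is a quadratic form, $H_{t}$ is positive semi-definite by construction.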
The many parameterizations admitted by the above form make estimation of the model more difficult. Thus, Lawal and Ijirshar (2015) give conditions for eliminating redundant, observationally equivalent representations.
Since we have only two variables, and in order to restrict the number of parameters and simplify their interpretation, we use the diagonal form of the BEKK model as shown in the above matrix form. The estimated parameters on the own lagged innovations quantify the effects of “news” on the variances (ARCH effects), while the parameters on the lagged variances measure the extent of volatility clustering (GARCH effects) and thus reveal the persistence of volatility. This paper estimates the following three variance and covariance equations:
The conditional covariance matrix H_{t} in the MGARCH model is estimated using quasi-maximum likelihood (QML) by maximizing the Gaussian log-likelihood function. The time series treated in MGARCH-BEKK should be stationary, and the distribution of its residuals is predefined as a conditional Gaussian (normal) distribution.
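The three variance and covariance equations are not shown in the text. As a hedged illustration, the sketch below performs one update of a bivariate diagonal BEKK(1,1) model, assuming one ARCH and one GARCH lag and diagonal A and B matrices. The function name, the tuple layout and the use of `omega` for the elements of the constant matrix are assumptions for the example.

```python
def diagonal_bekk_step(h_prev, shocks, omega, a, b):
    """One update of a bivariate diagonal BEKK(1,1) covariance matrix (illustrative).

    h_prev, omega: (h11, h12, h22) style tuples for the previous covariance
                   matrix and the constant matrix (assumed layout)
    shocks:        (e1, e2), the previous-period innovations
    a, b:          (a11, a22) and (b11, b22), the diagonal ARCH and GARCH parameters
    """
    h11, h12, h22 = h_prev
    e1, e2 = shocks
    w11, w12, w22 = omega
    a1, a2 = a
    b1, b2 = b
    new_h11 = w11 + a1 * a1 * e1 * e1 + b1 * b1 * h11  # variance of series 1
    new_h12 = w12 + a1 * a2 * e1 * e2 + b1 * b2 * h12  # covariance
    new_h22 = w22 + a2 * a2 * e2 * e2 + b2 * b2 * h22  # variance of series 2
    return (new_h11, new_h12, new_h22)
```

In the diagonal form each variance depends only on its own lagged shock and lagged variance, while the covariance is driven by the product of the two shocks, which is why the parameter count stays small.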
The first step in estimating a DCC model is to obtain conditional correlations from the covariance matrix Q_{t}, which is typically estimated with a GARCH-type equation governed by two scalar parameters a and b
where Q_{0} is the unconditional covariance matrix. The matrix Q_{t} does not replace H_{t}; its sole purpose is to provide conditional correlations
The H_{t} matrix is constructed by fitting univariate GARCH models to obtain the variances, and combining these variances with
to obtain the covariances. The process is obtained as
When we estimate the DCC model, we apply a VARMA specification for the variances in H_{t}
where the last term is the asymmetry coefficient. This specification permits spillovers among the variances of the three series and makes the form comparable to that of the BEKK model, permitting direct comparisons of model performance.
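The DCC equations referred to above are not shown in the text. Under the assumption that the standard scalar DCC form is intended, the first three steps can be written as follows, where $z_{t}$ (an added symbol) denotes the vector of standardized residuals:

$$Q_{t} = (1 - a - b)\, Q_{0} + a\, z_{t-1} z_{t-1}' + b\, Q_{t-1}$$

$$R_{t} = \mathrm{diag}(Q_{t})^{-1/2}\; Q_{t}\; \mathrm{diag}(Q_{t})^{-1/2}$$

$$H_{t} = D_{t} R_{t} D_{t}, \qquad D_{t} = \mathrm{diag}\big(\sqrt{h_{11,t}}, \dots, \sqrt{h_{NN,t}}\big)$$

so that each covariance is $h_{ij,t} = \rho_{ij,t}\sqrt{h_{ii,t}\, h_{jj,t}}$, with $\rho_{ij,t}$ the corresponding element of $R_{t}$. The VARMA variance specification with its asymmetry term is not reconstructed here, since the text does not give enough detail to do so.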
fGarch is an alternative GARCH package used for comparison and control. dynlm was used to estimate linear models with lag terms easily. FinTS provides an ARCH LM test function that does not require constructing a VAR model.
5. RESULTS
5.1 Descriptive Analysis of Research Data
Of the companies selected for review, 66 companies listed on the Tehran Stock Exchange, from the automotive, pharmaceutical, food, cement, petrochemical and ceramic industries, ultimately cooperated with the researcher; all of them have at least one activity. Given that the research covers three years, there are a total of 132 observations for each company, distributed as shown in the following diagram:
According to Figure 1, the 66 companies surveyed comprise 21% ceramic and tile, 24% automotive and tires, 27% cement and petrochemical, and 28% pharmaceutical and food companies.
The statistical results of the present study are as follows:
According to Table 1, the average of CFC is 0.747288, with a minimum of 0.2440 and a maximum of 1.2030. The average of CP is 0.2171, with a minimum of 0.126 and a maximum of 2.202. The average of TCQ is 17.2812, with a minimum of 0.98 and a maximum of 274.96. The average of CBO is 0.1498, with a minimum of 0 and a maximum of 0.845.
We analyze the resulting measures for the selected stocks from the Tehran Stock Exchange from the beginning of 2014 to the end of 2016. By including all trading signals, the return pattern and results can be observed under an order-imbalance trading strategy. The GARCH model used in this study distinguishes between positive and negative returns and thereby captures potential asymmetry in volatility arising from the direction of returns; it has a better fit than the symmetric GARCH model for almost all financial assets (Alexander, 2009). We use it to estimate the relations among volatility, return and order imbalance. The results for the return-order imbalance relation at the 95% confidence level are shown in Table 2.
More than 90% of the samples were significant at the 95% confidence level, which shows that order imbalance has a significant effect on return volatility for most of the selected samples. However, the direction of the effect on return volatility is not consistent. In 2014, 48.6% of the samples show a significant positive effect and 42.5% a significant negative effect at the 95% confidence level. In 2015, 45.9% of the samples show a significant positive effect and 43.6% a significant negative effect. In 2016, 42.2% of the samples show a significant positive effect and 43.6% a significant negative effect.
According to Table 3, the relationship between order imbalance and return is significant. Stocks with positive order imbalance coefficients account for more than 97% of the samples, which indicates that order imbalance has a positive effect on stock return. In 2014, 88.1% of the samples show a significant positive relationship between order imbalance and return and 9.7% show a significant negative relationship at the 95% confidence level. In 2015, the corresponding shares are 83.2% and 8.2%. In 2016, they are 79.9% and 9.1%.
According to Table 4, the results for lagged return-order imbalance relations show that, at the 95% confidence level, 88.25% of the lagged order imbalance coefficients are positive and significant and 74.68% are negative and significant in 2014. In 2015, 4.00% are positive and significant and 22.08% are negative and significant. In 2016, 1.00% are positive and significant and 8.21% are negative and significant. These results do not match those of Chordia et al. (2002), who report a positive and predictive relation between returns and lagged imbalances in the regression model. The use of lagged order imbalance as a predictor of returns therefore needs further study to determine the direction of the effect before an order-imbalance-based trading strategy is developed.
Table 5 summarizes the results of the size-effect test. The results show a negative relationship between market capitalization and order imbalance. The order imbalance coefficients regressed on market capitalization and on logged market capitalization show t-statistics that are not consistent with the results of Chordia et al. (2002), which supports a negative relationship between market capitalization and the order imbalance effect. The estimation results are provided in Table 3. Most of the parameters are positive and significant, indicating the existence of ARCH and GARCH effects and volatility persistence. The significant and positive parameter of h_{11} means that the current conditional variance of the return is affected by its own variance in the previous period, i.e., volatility in the market is persistent. The covariance equation h_{21} indicates a strong, positive and significant interrelation between the volatilities: the significant parameters mean that the variance in the market is affected by market shocks (significant a_{21}) and by the previous volatility (conditional variance) of the market (significant b_{21}), and vice versa. To confirm the estimation results, the figures below plot the conditional variances of the series individually and the conditional covariance of the model.
Thus, the results show a strong impact of market capitalization on order imbalance, as well as strong impacts of the global financial crisis of 2014-2016 on the two markets, which confirms the interrelationship between the financial markets, including the global market.
6. CONCLUSION
In this paper, we investigate the performance of an order-imbalance-based trading strategy in high-frequency trading. Choosing models that are direct multivariate extensions of the GARCH model allows us to examine the forecasting performance of multivariate models; we used a multivariate GARCH-type model, the diagonal BEKK model. The results showed strong evidence of volatility clustering in the markets. We conclude that HFT is a cover term for different order-imbalance-based trading strategies with different impacts on market return. The empirical study is based on the dataset currently available, which identifies HFT’s fraction of the total trading return on the Tehran Stock Exchange for the period from 2014 to 2016 at both daily and monthly frequencies. We found that the impact of lagged order imbalance on returns can be negative for the given period. This result can be attributed to the behavior of market makers, who hold enough inventory to mitigate the effects of discretionary investors in tender offers; it is also confirmed by a low average return from tender offers. The regressions of the order imbalance coefficients on market capitalization and logged market capitalization show a negative relationship between market capitalization and the order imbalance effect. However, it would be better to analyze the nonlinear effects of volatility with GARCH models to get a clearer idea of the impact of persistence. The BEKK GARCH model shows more significant results than the other models. Finally, the study finds a relationship between return and order imbalance and concludes that order imbalance is a suitable measure for predicting future returns in HFT.