Abstract
The stock market index can be forecast in two ways: either by modeling the external factors that influence movements in the index, or by basing predictions on the previous values of the index. The current study uses the latter approach, applying the Box-Jenkins methodology (the method most researchers use for ARIMA modeling) to past values of the KSE 100 Index. Quarterly figures of the Index were taken for 22 years, from August 1995 to October 2017, which yielded 90 observations. Results revealed that the forecasting model used in the study did well in anticipating returns in the short run. The findings can be used by investors, particularly short-term investors, in deciding when, and when not, to risk their hard-earned funds at the Pakistan Stock Exchange.
Key Words
ARIMA, Box-Jenkins Methodology, PSX 100 Index, Prediction, Stationarity
Introduction
The ability to forecast the future can never be underrated when it comes to investments. Since the future is never certain, investors are constantly trying to identify the right time to invest their funds. Predicting the stock market index is no different from forecasting other kinds of investment, as many factors play a role in the returns. The movement of a stock market index indicates the direction in which a given economy is heading. For these reasons, investors keep watching the index to observe what is going on in the stock market.
A time-series variable can be forecast in two ways. One is to anticipate the direction the variable is expected to take, given all the factors that have a bearing on it. The other is to predict its future values based on its lagged observations. Researchers today increasingly rely on the latter method, and the current study also relies on the lagged values of the variable of interest. The methodology employed is known as the Autoregressive Integrated Moving Average (hereinafter ARIMA) technique of forecasting time series.
The ARIMA technique has long been in practice, but it was popularized by Box and Jenkins (1970), who set out a systematic procedure for applying it. The model is now usually employed through what is commonly known as the Box-Jenkins methodology.
The study serves two objectives: first, to investigate whether stock returns can be adequately forecast every quarter using ARIMA modeling, and second, to explore how many previous quarters of data are needed to forecast the current (or future) value of the index or the returns efficiently. It is hoped that the results of the current study will give short-term investors some insight into how the stock market behaves in the short run.
Literature Review
Forecasting of time series variables has always remained challenging. Researchers over time have put their efforts to efficiently forecast variables of their interest. ARIMA models also have a history of being employed by many academicians and professional investors. A discussion of some of the previous work follows:
Meyler, Kenny, and Quinn (1998) employed the ARIMA technique for anticipating inflation in Ireland and found that the model had a decent prediction capability. Jarrett (1990) used ARIMA for corporate earnings estimation but concluded that the model gave no better results than the conventional models. Another attempt was made by Raymond (1997), who predicted real estate prices using ARIMA and found the model helpful in anticipating the direction in which those prices were moving. The model was also used by Contreras et al. (2003) for anticipating electricity prices in the markets of Spain and California; they too found that the model had good short-run predictive power. Gilbert (2005), on the other hand, applied the ARIMA model to supply chain processes. He found that supply chain variables, including inventories, demand, orders placed, and lead times, could be readily forecast using the model. Guha and Bandyopadhyay (2016) used ARIMA to estimate gold prices in India and found that the prices could be readily anticipated in the short run.
The production of different crops has also been forecast by researchers using the ARIMA model. For instance, Padhan (2012) examined the productivity of some 34 crops in India and found that the tea crop was the most predictable, whereas the papaya crop was not predictable at all. Similarly, Manoj and Madhu (2014) predicted sugarcane production in some parts of India with the ARIMA model and found that it gave a good forecast for around five years. Hamjah (2014) predicted rice production in Bangladesh using ARIMA and found the model adequately successful. Jadhav, Reddy, and Gaddi (2017) estimated the production of the major crops of the Indian state of Karnataka and were able to adequately predict the production of these crops for the next three years.
A few studies have used the ARIMA model to anticipate stock prices or the stock market index. For instance, Mondal, Shit, and Goswami (2014) made a large-scale attempt, using the ARIMA model to anticipate the future prices of shares of as many as 56 Indian companies. They found that for around 85% of the shares included in their study, the prediction carried out through the Box-Jenkins method was very accurate. Similar studies were conducted by Adebiyi, Adewumi, and Ayo (2014) and Banerjee (2014), who estimated stock returns using ARIMA and found the model to be decently capable. We now give a brief description of how the Box-Jenkins methodology works in practice.
The Box-Jenkins Method
The Box-Jenkins method offers a way of using ARIMA modeling for time series variables. Developed by Box and Jenkins (1970), the method helps us in identifying the number of the previous values of our variable of interest as well as the number of lagged values of the error term that our variable depends upon. The method works better when there is a larger number of observations for a given time series. However, 50 observations are considered to be the minimum acceptable number for a given variable below which the model is not likely to give meaningful results (Chatfield, 1996).
The Box-Jenkins methodology consists of three steps. In the first step, known as model identification, the researcher inspects the autocorrelation and partial autocorrelation functions to determine how many lagged values of the variable and of the error term significantly affect the variable. The second step, known as model estimation, involves estimating the model identified in the first step. A few other models that could give better results are also estimated for comparison. The third and final step is diagnostic testing, in which the estimated models are compared based on their information criterion values, adjusted R2 value, and number of insignificant parameters. The Box-Jenkins method calls for selecting the model that has the lowest information criterion values, the highest adjusted R2 value, and the fewest insignificant parameters.
Research Methodology
The study at hand uses the time series data of only one variable, the stock market index. Therefore, the univariate ARIMA technique has been employed to predict the variable's future values. A stationary time series variable (one that is not integrated) can be modeled directly with an ARMA process, since it does not need to be differenced to become stationary. In its standard form, an ARMA process, as taken from Asteriou & Hall (2007), is as follows:
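$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$
In the equation above, $Y_t$ is the explained variable to be predicted, $Y_{t-1}$ to $Y_{t-p}$ are the lagged or autoregressive terms of $Y_t$, $\varepsilon_t$ is the error term, $\varepsilon_{t-1}$ to $\varepsilon_{t-q}$ are the lagged or moving average terms, $\phi_1$ to $\phi_p$ are the autoregressive coefficients, and $\theta_1$ to $\theta_q$ are the moving average coefficients.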
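As an illustration of the ARMA equation, the following minimal sketch computes the one-step-ahead conditional mean of the variable from its lagged values and lagged errors, with the current error term set to its expected value of zero. The function name `arma_one_step` and the numbers in the example are hypothetical and are not taken from the study.

```python
from typing import Sequence


def arma_one_step(phi: Sequence[float], theta: Sequence[float],
                  y_hist: Sequence[float], eps_hist: Sequence[float]) -> float:
    """Conditional mean of Y_t under an ARMA(p, q) process without a constant.

    phi      -- autoregressive coefficients, phi[0] multiplies Y_{t-1}
    theta    -- moving average coefficients, theta[0] multiplies the error at t-1
    y_hist   -- past values ordered oldest to newest (y_hist[-1] is Y_{t-1})
    eps_hist -- past errors ordered oldest to newest
    """
    p, q = len(phi), len(theta)
    if len(y_hist) < p or len(eps_hist) < q:
        raise ValueError("need at least p past values and q past errors")
    # Autoregressive part: weighted sum of the p most recent values
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(p))
    # Moving average part: weighted sum of the q most recent errors
    ma_part = sum(theta[j] * eps_hist[-1 - j] for j in range(q))
    return ar_part + ma_part


# Hypothetical ARMA(1, 1) with one lagged value and one lagged error
forecast = arma_one_step([0.5], [-0.4], y_hist=[0.02, 0.10], eps_hist=[0.01, 0.05])
print(round(forecast, 4))
```

With p = q = 1 the sum reduces to one autoregressive term and one moving average term.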
In most cases, time series variables are non-stationary, in which case they need to be differenced enough times to become stationary. A variable is stationary only when its mean is constant over time and its covariance is time-invariant (Gujarati & Porter, 2004). Since in our case the dependent variable, i.e., the stock index, was also non-stationary, we computed quarterly returns by taking the first difference of the index and dividing it by the lagged value of the index. Therefore, an ARIMA model, i.e., one that allows the variable to be integrated, was employed in the study instead of an ARMA model.
For the analysis, quarterly figures of the Pakistan Stock Exchange were used for 22 years, from August 1995 to October 2017, which produced 90 observations, making the sample large enough to be considered for ARIMA analysis (Chatfield, 1996).
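The return transformation described above (first difference divided by the lagged value) can be sketched as follows. The function name `quarterly_returns` and the index levels in the example are hypothetical and are not KSE 100 data.

```python
from typing import List, Sequence


def quarterly_returns(index_levels: Sequence[float]) -> List[float]:
    """Simple returns: (P_t - P_{t-1}) / P_{t-1} for consecutive index levels."""
    if any(level <= 0 for level in index_levels[:-1]):
        raise ValueError("index levels used as denominators must be positive")
    # Pair each level with its successor and scale the difference by the earlier level
    return [(cur - prev) / prev for prev, cur in zip(index_levels, index_levels[1:])]


levels = [1000.0, 1100.0, 1045.0]  # hypothetical index levels
print(quarterly_returns(levels))
```

A series of n index levels yields n - 1 returns, because the first observation has no lagged value.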
Results and Findings
The graphical analysis, as well as the unit root test of the variable stock market index, showed obvious trends in the data. Therefore, quarterly returns were computed, and then the data was checked for stationarity again. As can be seen in figure 1, the returns of the KSE 100 index were stationary.
Figure 1
The Stationary KSE 100 Index Quarterly Returns
The unit root test of the KSE 100 index returns also gave a t-statistic of -8.367, which was highly significant (p-value = .000) and left no doubt that the returns were stationary (see table 1).
Table 1. ADF Test for KSE 100 Index Quarterly Returns
Null Hypothesis: KSE 100 Index Quarterly Returns Has a Unit Root

| | t-Statistic | p-value |
|---|---|---|
| Augmented Dickey-Fuller test statistic | -8.367 | .000 |
| Test critical value: 1% level | -3.506 | |
| Test critical value: 5% level | -2.895 | |
| Test critical value: 10% level | -2.585 | |
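The decision rule behind Table 1 can be sketched in a few lines: the unit root null is rejected at a given level when the ADF statistic is more negative than the critical value for that level. The statistic and critical values below are the ones reported in Table 1; the function name `rejects_unit_root` is hypothetical.

```python
from typing import Dict

ADF_STATISTIC = -8.367  # reported in Table 1
CRITICAL_VALUES = {"1%": -3.506, "5%": -2.895, "10%": -2.585}  # reported in Table 1


def rejects_unit_root(statistic: float, critical_values: Dict[str, float]) -> Dict[str, bool]:
    """Return, per significance level, whether the unit root null is rejected."""
    # The ADF test is left-tailed: reject when the statistic lies below the critical value
    return {level: statistic < cv for level, cv in critical_values.items()}


decisions = rejects_unit_root(ADF_STATISTIC, CRITICAL_VALUES)
print(decisions)
```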
Model Identification
After stationarity has been induced in the variable, we proceed by employing the Box-Jenkins methodology. In the first stage, we try to identify the most appropriate model by looking at the correlogram to check for the number of autoregressive and moving average terms. The correlogram of quarterly returns of the KSE 100 Index is given in table 2.
Table 2. Correlogram of KSE 100 Index Quarterly Returns

| Lag | AC | PAC | Q-Stat | Prob |
|---|---|---|---|---|
| 1 | .105 | .105 | 1.014 | .314 |
| 2 | .129 | .119 | 2.567 | .277 |
| 3 | .030 | .005 | 2.650 | .449 |
| 4 | .037 | .019 | 2.782 | .595 |
| 5 | -.126 | -.139 | 4.321 | .504 |
| 6 | -.172 | -.162 | 7.222 | .301 |
| 7 | .073 | .141 | 7.742 | .356 |
| 8 | .049 | .085 | 7.978 | .436 |
| 9 | -.002 | -.024 | 7.978 | .536 |
| 10 | -.010 | -.039 | 7.988 | .630 |
| 11 | .073 | .023 | 8.544 | .664 |
| 12 | .032 | .026 | 8.654 | .732 |
| 13 | .058 | .106 | 9.012 | .772 |
Looking at the correlogram in table 2, the decay appears to begin at lag 2 for both the autocorrelation and the partial autocorrelation functions. This hints at an ARIMA (2, d, 2) model. We will, nonetheless, also check a few other models to see whether ARIMA (2, d, 2) is the best solution in our case or whether there is a better alternative.
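To show how the Q-Stat column of the correlogram relates to the AC column, the sketch below recomputes the cumulative Ljung-Box statistic from the autocorrelations reported in Table 2, together with the approximate 95% band of 1.96 divided by the square root of n that is commonly used to judge individual autocorrelations. It assumes n = 89 returns (90 quarterly index values less one) and assumes the Q-statistics were computed with the Ljung-Box formula; neither is stated in the text.

```python
import math
from typing import List, Sequence

# AC values for lags 1 to 13 as reported in Table 2
REPORTED_AC = [0.105, 0.129, 0.030, 0.037, -0.126, -0.172, 0.073,
               0.049, -0.002, -0.010, 0.073, 0.032, 0.058]
N_RETURNS = 89  # assumption: 90 quarterly index values give 89 returns


def ljung_box_q(autocorrs: Sequence[float], n: int) -> List[float]:
    """Cumulative Ljung-Box Q for lags 1..m: n(n+2) * sum over k of r_k^2 / (n - k)."""
    q_values, running = [], 0.0
    for k, r in enumerate(autocorrs, start=1):
        running += r * r / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


band = 1.96 / math.sqrt(N_RETURNS)  # approximate 95% band under white noise
q_stats = ljung_box_q(REPORTED_AC, N_RETURNS)
outside = [k for k, r in enumerate(REPORTED_AC, start=1) if abs(r) > band]
print(round(band, 3), [round(q, 3) for q in q_stats], outside)
```

Because the inputs are the rounded values printed in Table 2, the recomputed Q-statistics can only be expected to match the reported ones approximately.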
Model Estimation
The model identification stage of the Box-Jenkins methodology has prescribed ARIMA (2, d, 2) to be employed. So we start by estimating ARIMA (2, d, 2). We will later compare its results with other possible configurations of ARIMA.
Table 3. OLS Estimation using ARIMA (2, d, 2) Model
Explained Variable: KSE 100 Index Quarterly Returns
Estimation Method: Ordinary Least Squares
Observations included: 88 after adjustments

| Variable | Coefficient | Std. Error | t-Statistic | p-Value |
|---|---|---|---|---|
| C | .052 | .019 | 2.70 | .008 |
| AR(1) | .023 | .426 | .053 | .958 |
| AR(2) | .480 | .387 | 1.239 | .219 |
| MA(1) | .017 | .445 | .039 | .969 |
| MA(2) | -.426 | .398 | -1.069 | .288 |

| Statistic | Value | Information criterion | Value |
|---|---|---|---|
| R-squared | .057 | Akaike info. criterion | -.984 |
| Adj. R-squared | .011 | Schwarz Bay. criterion | -.842 |
| | | Hannan-Quinn criterion | -.927 |
The results of ARIMA (2, d, 2) are presented in table 3. Surprisingly, all the autoregressive and moving average coefficients of this model are highly insignificant; only the constant is significant. The model also has a worryingly low adjusted R2 value. Although it was prescribed by the identification stage of the Box-Jenkins methodology, this model is therefore unlikely to be the best possible solution for our problem. The only way to find a better model is to try other possibilities one by one and compare them based on their adjusted R2, information criterion values, and number of insignificant parameters. We now try the more common ARIMA (1, d, 1) model to see whether it outperforms the one suggested by the correlogram.
Table 4. OLS Estimation using ARIMA (1, d, 1) Model
Explained Variable: KSE 100 Index Quarterly Returns
Estimation Method: Ordinary Least Squares
Observations included: 88 after adjustments

| Variable | Coefficient | Std. Error | t-Statistic | p-Value |
|---|---|---|---|---|
| C | .052 | .019 | 2.784 | .006 |
| AR(1) | .735 | .227 | 3.236 | .002 |
| MA(1) | -.693 | .252 | -2.748 | .007 |

| Statistic | Value | Information criterion | Value |
|---|---|---|---|
| R-squared | .055 | Akaike info. criterion | -1.02 |
| Adj. R-squared | .033 | Schwarz Bay. criterion | -.936 |
| | | Hannan-Quinn criterion | -.986 |
Table 4 presents the output of ARIMA (1, d, 1). Astonishingly, this model is better in every respect than the ARIMA (2, d, 2) model suggested at the identification stage. For one thing, all of the coefficients of ARIMA (1, d, 1) are significant, in contrast with ARIMA (2, d, 2), in which none of the autoregressive or moving average coefficients was statistically significant. Also, all of the information criteria for ARIMA (1, d, 1) have lower values than those for ARIMA (2, d, 2). Finally, the adjusted R2 value is also larger for ARIMA (1, d, 1) than for ARIMA (2, d, 2). This makes ARIMA (1, d, 1) clearly superior to ARIMA (2, d, 2). However, before we give our final word about the best possible ARIMA configuration for our dependent variable, i.e., the quarterly returns of the KSE 100 index, we need to check other possible models as well.
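Two of the quantities compared above can be recomputed from the reported tables as a rough check. The sketch below assumes the usual definitions, which the text does not spell out: adjusted R2 equals 1 - (1 - R2)(n - 1)/(n - k - 1), with k counting the AR and MA terms but not the constant, and each t-statistic equals the coefficient divided by its standard error. The function names are hypothetical; the inputs are the rounded figures from tables 3 and 4, so small rounding differences from the reported values are to be expected.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R-squared with n_regressors slope terms plus a constant."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


def t_statistic(coefficient: float, std_error: float) -> float:
    """Ratio of an estimated coefficient to its standard error."""
    return coefficient / std_error


# ARIMA (2, d, 2): R-squared .057, 88 observations, 4 AR/MA terms (Table 3)
adj_22 = adjusted_r_squared(0.057, 88, 4)
# ARIMA (1, d, 1): R-squared .055, 88 observations, 2 AR/MA terms (Table 4)
adj_11 = adjusted_r_squared(0.055, 88, 2)
# AR(1) and MA(1) of ARIMA (1, d, 1), from Table 4
t_ar1 = t_statistic(0.735, 0.227)
t_ma1 = t_statistic(-0.693, 0.252)

print(round(adj_22, 3), round(adj_11, 3), round(t_ar1, 2), round(t_ma1, 2))
```

The comparison illustrates why the smaller model can have the higher adjusted R2 despite a slightly lower R2: the penalty for two extra terms outweighs the small gain in fit.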
The next and final stage of the Box-Jenkins methodology involves a comparison of the various alternatives to seek the most practical and parsimonious solution to our problem.
Diagnostic Checking
The analysis in the previous section initially indicated ARIMA (2, d, 2) to be the appropriate model for our variable. It was, however, shown that ARIMA (1, d, 1) is better than ARIMA (2, d, 2) in all respects. We now want to know whether an even better solution exists, and therefore estimate other models as well. The following table presents the comparative results of a few other models.
Table 5. Contrasting ARIMA Models

| ARIMA model | Adjusted R2 | AIC | SBC | HQC | Insignificant lags |
|---|---|---|---|---|---|
| ARIMA (1, d, 1) | .033 | -1.020 | -.936 | -.986 | None |
| ARIMA (2, d, 1) | .012 | -.996 | -.882 | -.950 | One |
| ARIMA (1, d, 2) | .021 | -.998 | -.885 | -.952 | One |
| ARIMA (1, d, 3) | .016 | -.982 | -.841 | -.924 | Two |
| ARIMA (2, d, 2) | .011 | -.984 | -.842 | -.927 | All |
| ARIMA (3, d, 3) | .077 | -1.022 | -.822 | -.942 | Two |
| ARIMA (3, d, 4) | .104 | -1.042 | -.813 | -.949 | Three |
| ARIMA (4, d, 4) | .083 | -1.119 | -.861 | -1.015 | Four |
A comparison of the different models is given in table 5. In terms of the number of insignificant parameters, ARIMA (2, d, 2) is the worst choice, as all of its lags are insignificant. In terms of adjusted R2, however, ARIMA (3, d, 4) takes the lead with a value of 10.4%. That model also has the second-lowest Akaike Information Criterion (AIC) value, -1.042. Both the AIC (-1.119) and the Hannan-Quinn Criterion (HQC, -1.015) are lowest for ARIMA (4, d, 4), whereas the Schwarz-Bayesian Criterion (SBC), which plays the most important role in selecting the correct model, is lowest for ARIMA (1, d, 1).
If adjusted R2 is considered the decisive factor for model selection, ARIMA (3, d, 4) takes the lead, followed by ARIMA (4, d, 4). However, the former has three insignificant parameters and the latter has four. If, instead, the three information criteria are given priority in choosing the right model, ARIMA (1, d, 1) gets the edge, as it has the lowest SBC, the second-lowest HQC value after the over-parameterized ARIMA (4, d, 4), and the fourth-lowest AIC value. The important point, nonetheless, is that all the models with lower HQC and AIC values than ARIMA (1, d, 1) have insignificant parameters and also involve many more lags, violating the very principle of parsimony advocated by the Box-Jenkins methodology. It is therefore held that ARIMA (1, d, 1) is the most appropriate model for our variable for the following reasons: a) the model has the lowest SBC of all the models, b) all of its parameters are highly significant, c) the model has the second-lowest HQC value, and d) the model is substantially more parsimonious than the ones offering larger adjusted R2 and lower AIC or HQC values.
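The selection logic discussed above can be sketched by encoding Table 5 and ranking the candidates on each criterion. The `Candidate` record and the variable names are hypothetical, and coding "All" for ARIMA (2, d, 2) as four insignificant lags (two AR and two MA terms) is an assumption based on Table 3.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    p: int
    q: int
    adj_r2: float
    aic: float
    sbc: float
    hqc: float
    insignificant: int  # number of insignificant lags


# Values as reported in Table 5
TABLE_5 = [
    Candidate(1, 1, 0.033, -1.020, -0.936, -0.986, 0),
    Candidate(2, 1, 0.012, -0.996, -0.882, -0.950, 1),
    Candidate(1, 2, 0.021, -0.998, -0.885, -0.952, 1),
    Candidate(1, 3, 0.016, -0.982, -0.841, -0.924, 2),
    Candidate(2, 2, 0.011, -0.984, -0.842, -0.927, 4),  # "All" coded as 4 (assumption)
    Candidate(3, 3, 0.077, -1.022, -0.822, -0.942, 2),
    Candidate(3, 4, 0.104, -1.042, -0.813, -0.949, 3),
    Candidate(4, 4, 0.083, -1.119, -0.861, -1.015, 4),
]

best_adj_r2 = max(TABLE_5, key=lambda m: m.adj_r2)  # highest adjusted R2
best_aic = min(TABLE_5, key=lambda m: m.aic)        # lowest AIC
best_sbc = min(TABLE_5, key=lambda m: m.sbc)        # lowest SBC
best_hqc = min(TABLE_5, key=lambda m: m.hqc)        # lowest HQC
fully_significant = [m for m in TABLE_5 if m.insignificant == 0]
# Rank of ARIMA (1, d, 1) when models are sorted from lowest to highest AIC
aic_rank_11 = sorted(TABLE_5, key=lambda m: m.aic).index(TABLE_5[0]) + 1

for label, m in [("adj. R2", best_adj_r2), ("AIC", best_aic),
                 ("SBC", best_sbc), ("HQC", best_hqc)]:
    print(f"best by {label}: ARIMA ({m.p}, d, {m.q})")
print("all lags significant:", [(m.p, m.q) for m in fully_significant])
print("AIC rank of ARIMA (1, d, 1):", aic_rank_11)
```

Ranking on a single criterion gives different winners, which is why the choice in the text rests on a combination of SBC, parameter significance, and parsimony rather than on any one column.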
Discussion
We found in the analysis section that ARIMA (1, d, 1) is the most fitting model for our variable. This ARIMA configuration has remained very popular in the literature, with many past studies finding the same model appropriate for their variables. Other studies have found bigger models (those with more lags) to be more helpful in forecasting their variables. In general, however, ARIMA models have been considered reasonably successful in predicting a given variable's future values. For instance, the studies that predicted stock prices using ARIMA modeling include those conducted by Mondal, Shit, and Goswami (2014) and Adebiyi, Adewumi, and Ayo (2014).
A few researchers have also used the ARIMA technique for anticipating crop production. These studies include the ones conducted by Padhan (2012), Manoj and Madhu (2014), Hamjah (2014), and Jadhav, Reddy, and Gaddi (2017). In a nutshell, researchers have made many attempts to forecast their variables using ARIMA, and our study, like the previous ones, finds this method of prediction very helpful.
Conclusion
Around the world, security markets, particularly stock markets, are deemed to be emblems of an economy's financial prosperity. They indicate the investment prospects available in an area. A stock market that shows constant or continual bullish behavior gains the confidence of investors, who in turn commit their funds without much hesitation. In a struggling or uncertain market, however, investors are very careful about where they put their money. They need to be able to predict what is likely to happen in the next few weeks or months. ARIMA modeling offers investors a way to make their short-run predictions more efficient. We used the model to forecast the KSE 100 index on a quarterly basis and found that the index returns could be anticipated effectively from the value of the previous quarter and the previous quarter's error term. Investors can use the findings of this study to track future changes in the stock market so that they can identify the most appropriate time to invest.
References
- Adebiyi, A., Adewumi, A., & Ayo, C. (2014, March). Stock Price Prediction Using the ARIMA Model. Paper presented at the 2014 UKSim-AMSS 16th International Conference on Computer Modeling and Simulation, Cambridge University, United Kingdom. Retrieved from
- Asteriou, D., & Hall, S. (2007). Applied Econometrics, Revised Edition. Palgrave Macmillan, New York, USA.
- Banerjee, D. (2014). Forecasting of Indian stock market using the time-series ARIMA model. Paper presented at the 2nd International Conference on Business & Information Management (pp. 131-135). Durgapur, India. IEEE.
- Box, G., & Jenkins, G. (1970). Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day, California, USA
- Chatfield, C. (1996). The Analysis of Time Series, 5th ed., Chapman & Hall, New York.
- Contreras, J., Espinola, R., Nogales, F., & Conejo, A. (2003). ARIMA Models to Predict Next-Day Electricity Prices. IEEE Transactions on Power Systems, 18(3), 1014-1020.
- Gilbert, K. (2005). An ARIMA Supply Chain Model, Management Science, 51(2), 305-310.
- Guha, B., & Bandyopadhyay, G. (2016). Gold Price Forecasting using ARIMA Model. Journal of Advanced Management Science, 4(2), 117-121.
Cite this article
- APA: Afeef, M., Ali, N., & Khan, A. (2018). Tracing Stock Returns on Quarterly Basis: The Case of KSE-100 Index. Global Social Sciences Review, III(III), 466-476. https://doi.org/10.31703/gssr.2018(III-III).26