Testing Crude Oil Market Efficiency Using Artificial Neural Networks

Manel HAMDI
International Finance Group Tunisia, Faculty of Management and Economic Sciences of Tunis, Tunisia, El Manar University, Tunis cedex, C.P. 2092, El Manar, Tunisia
Phone: +21697551459
Email:
[email protected]
Abstract

This paper evaluates the weak-form efficiency of the crude oil market using an artificial neural network (ANN) model. The model was trained with the backpropagation algorithm on daily historical West Texas Intermediate (WTI) crude oil spot prices over the period 02 January 1986 to 31 December 2013. The output of the neural network is the predicted price, which is used as a trading signal (buy or sell) for investors. An empirical investigation of profitability was then conducted. The results show that the ANN model outperformed a naïve trading strategy based on the Random Walk (RW). We therefore conclude that the crude oil market is inefficient in the sense of the Efficient Market Hypothesis (EMH). From these findings, we argue that it is possible to earn excess profits with a trading strategy based on the information embedded in historical crude oil prices. Finally, the proposed neural network-based approach offers practitioners an interesting trading rule to make or to support their investment decisions.
JEL Classification: G14, Q47, C45.
Keywords: Efficient market hypothesis, crude oil price prediction, artificial neural network.
1. Introduction
The Efficient Market Hypothesis (EMH) is of special interest to financial institutions and organizations. According to Fama (1970, 1991), a market is efficient if asset prices immediately reflect all available and relevant information; in the weak form, that information set is the history of past prices. Although this is an interesting research topic, few studies have addressed it, especially for the oil commodity. First, Tabak and Cajueiro (2007) investigated the efficiency of crude oil markets, including WTI and Brent. The authors estimated the fractal structure of these time series using the time-varying Hurst exponent for a sample of daily closing prices covering the period May 1983 to July 2004. Their results pointed to the presence of a fractal structure in oil price time series. Moreover, they showed that the efficiency of crude oil markets changes over time and that these markets have become more efficient. According to these authors, WTI crude oil prices seem to be more weak-form efficient than Brent prices. Similar conclusions were reached by Alvarez-Ramirez et al. (2008), who concluded that over long horizons the crude oil market is consistent with the weak-form efficient market hypothesis (WFEMH hereafter). The authors examined the autocorrelation of international crude oil markets by estimating the Hurst exponent dynamics of returns for several typical oil mixtures (Europe Brent, WTI Cushing and Dubai) over the sample period 1987-2007. In another study, Maslyuk and Smyth (2008) also analyzed the WFEMH in the crude oil markets, based on unit root tests. Using weekly spot and futures prices for WTI and Europe Brent (January 1991 to December 2004), the authors showed that future spot and futures prices cannot be predicted from historical price data and concluded that crude oil markets seem to be weak-form efficient.
In more recent research, Charles and Darné (2009) investigated the WFEMH by testing the random walk hypothesis with variance ratio tests. More precisely, the authors employed the non-parametric variance ratio tests suggested by Wright (2000) and Belaire-Franch and Contreras (2004), as well as the wild-bootstrap variance ratio tests developed by Kim (2006). Using daily closing spot prices for two crude oil markets (US WTI and UK Brent) over the period June 1982 to July 2008, the authors found that the Europe Brent crude oil market is weak-form efficient while the WTI crude oil market seems to be inefficient over the period 1994-2008. In another work, Alvarez-Ramirez et al. (2010) used a lagged version of detrended fluctuation analysis to study the efficiency of the crude oil market and to detect delay effects in spot WTI price autocorrelations over the period 1986 to 2009. Based on their empirical findings, they concluded that negative or positive autocorrelations can be concealed by delay effects. Using weekly spot FOB crude oil prices for four OPEC members that are also members of the Gulf Cooperation Council (GCC), namely Kuwait, Qatar, Saudi Arabia and the United Arab Emirates (UAE), Arouri et al. (2010) applied a state space model and found strong evidence of short-term predictability in crude oil price movements over time. Nevertheless, the hypothesis of convergence towards weak-form informational efficiency could not be verified for all markets. More recently, Ortiz-Cruz et al. (2012) employed multiscale entropy analysis to investigate the informational efficiency of the crude oil markets. Results based on daily closing spot prices of WTI from January 1, 1986 to March 15, 2011 showed that the crude oil market was informationally efficient over the whole period except during the US economic recessions of the early 1990s and the late 2000s.

In this context, we apply a neural network approach to market forecasting and trading. In Section 2, we describe the proposed ANN model used to test the EMH and the data sample used for this purpose; the empirical investigation and results are presented in the same section. Finally, we conclude in Section 3.
2. Test of WTI crude oil market weak-form efficiency: Empirical investigation

2.1. Data sample description

A sample of WTI crude oil spot prices (see Fig. 1) for the period from January 2, 1986 to December 31, 2013 is used to predict the future value of the WTI crude oil price. The daily data were obtained from the US Energy Information Administration website. 80% of the data set (5651 observations) forms the training sample, which is used to estimate the parameters of the network (synaptic weights and biases), and the remainder (1413 observations) is the checking sample, which is used to test the predictive ability of the network.
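The 80/20 split described above can be sketched as follows. This is a minimal illustration that assumes the split is chronological (the text does not say so explicitly, although the 2008-2013 test period reported in Table 4 suggests it). The function name and the placeholder series are hypothetical and not taken from the paper.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into training and checking samples."""
    # Truncate to an integer count: 0.8 * 7064 = 5651.2 gives 5651 training points.
    n_train = int(len(series) * train_fraction)
    return series[:n_train], series[n_train:]


# Placeholder values standing in for the 7064 daily WTI prices (hypothetical data).
placeholder_prices = list(range(7064))
training_sample, checking_sample = chronological_split(placeholder_prices)
print(len(training_sample), len(checking_sample))  # 5651 1413
```

Keeping the observations in time order means the checking sample contains only dates that follow the training sample, which is what an out-of-sample forecasting test requires.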
Figure 1. The crude oil spot price of WTI (Time: 7064 working days)
As illustrated in Figure 1, the crude oil market is characterized by high volatility and marked by pronounced peaks and falls due to unpredictable events (wars, embargoes, crises, revolutions…) that have occurred in the history of the oil market. Following Ghaffari and Zare (2009), we introduce a smoothing algorithm in order to reduce the effect of unforeseen short-term disturbances while maintaining the main long-term characteristics of the dynamics of the crude oil market. In this study, we use the default smoothing method provided by the Matlab software package, a moving average filter of order 5 (see Fig. 2).
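A moving average filter of order 5 can be sketched as below. This is an illustration only, not the Matlab routine used in the paper. It assumes a centered window of five points whose width shrinks symmetrically near the two ends of the series, and the prices are hypothetical.

```python
def moving_average_smooth(prices, span=5):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    half = span // 2
    n = len(prices)
    smoothed = []
    for i in range(n):
        k = min(half, i, n - 1 - i)  # largest symmetric half-width that fits at i
        window = prices[i - k : i + k + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical prices, for illustration only.
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0, 16.0]
smoothed = moving_average_smooth(prices)

# Error as defined in the text: actual price minus smoothed price.
error = [p - s for p, s in zip(prices, smoothed)]
```

With this edge handling, the first and last points are left unchanged and interior points are replaced by the mean of their five-point neighbourhood.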
Figure 2. The smoothed crude oil spot price of WTI (Time: 7064 working days)

The observed actual and smoothed time series of WTI from January 02, 1986 to December 31, 2013 are depicted in Fig. 3. The error (see Fig. 4) is the difference between the actual and smoothed price values.

Figure 3. Actual vs. smoothed crude oil spot price of WTI (Time: 7064 working days). [Plot: WTI crude oil price (US$ per barrel) against working day; series: actual crude oil price and smoothed crude oil price.]
Figure 4. Error between actual and smoothed crude oil spot price of WTI (Time: 7064 working days). [Plot: error against working day.]
In order to demonstrate the utility of the proposed smoothing function, we plot the actual and smoothed prices (see Fig. 5) for July 2008 only, the month in which the price of crude oil reached its highest value (145.31 US$/barrel). After the smoothing procedure, this peak price is reduced by 3.81 US$/barrel. We conclude that the smoothing procedure helps to reduce the effect of short-term noise.
Figure 5. Actual vs. smoothed crude oil spot price of WTI (July 2008). [Plot: WTI crude oil price (US$ per barrel) against day of July 2008; series: actual oil price and smoothed oil price.]
2.2. Artificial neural network model

2.2.1. ANN Structure

An artificial neural network is a nonlinear model inspired by the functioning of the human brain, which acquires knowledge through a learning process. The standard design of an ANN generally consists of an input layer (containing, in our case, the historical smoothed prices (Pt) of WTI crude oil), one or more hidden layers, and an output layer (giving the predicted future prices (Pt+1) of WTI crude oil), all interconnected as depicted in Figure 6.
Figure 6. Fully interconnected neural network with one hidden layer
The state of the output neuron is determined by the following formula:

$$P_{t+1} = g_2\left( \sum_{j=0}^{m} w_{kj}^{(2)} \, g_1\left( \sum_{i=0}^{N} w_{ij}^{(1)} P_t + b_1 \right) + b_2 \right) \qquad (7)$$

where $P_t$ are the inputs of the network; $N$ is the total number of observations of input prices; $m$ is the number of nodes in the hidden layer; $k$ is the number of units in the output layer; $g$ is the
transfer/activation function; $w^{(1)}$ is the weight matrix of the hidden layer; $w^{(2)}$ is the weight matrix of the output layer; $b_1$ and $b_2$ are the bias vectors of the hidden layer and output layer, respectively.

2.2.2. ANN Topology

Considerable expert knowledge and many experiments are needed to obtain an optimal ANN topology, because there are no scientific rules for finding the best ANN configuration for a particular problem (Lackes et al., 2009). Several factors must therefore be controlled to select the optimal network architecture:

- The number of hidden layers

In our experiment, we tried one and two hidden layers, an architecture that is generally sufficient to provide good forecasting results (Zhang et al., 1998).

- The choice of activation function

According to Haykin (1999), the sigmoid and the hyperbolic tangent functions are the most widely used in financial applications. In this study, we use the hyperbolic tangent as the transfer function (g1) of the network, as in recent financial applications (Yonaba et al., 2010; Jammazi and Aloui, 2012).

- Learning rate and training algorithms

The network was trained with the backpropagation algorithm, specifically with the Levenberg-Marquardt algorithm, as it is the fastest training function (Kulkarni and Haidar, 2009). After several experiments, we chose a learning rate of 0.01, as this value gave the best solution. We note that Haidar et al. (2008) also used the same learning rate.

- The number of hidden nodes

There are no universal standards for defining the number of hidden neurons. The ideal is to use the smallest number of units that achieves the best prediction results, as too many nodes could induce an overfitting problem and too few could cause an underfitting problem (Kaastra and Boyd, 1996).
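The output formula of Section 2.2.1, with the hyperbolic tangent as g1 and a linear g2, can be sketched as a plain forward pass. This is only an illustration: the weights below are hypothetical, the biases are written as explicit terms, and feeding several lagged prices as inputs is an assumption, since the text does not state the size of the input window.

```python
import math


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer network: tanh hidden units (g1), linear output (g2)."""
    hidden = [
        math.tanh(sum(w * p for w, p in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Linear output layer: weighted sum of hidden activations plus output bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical example: two lagged (smoothed, scaled) prices and two hidden neurons.
lagged_prices = [0.5, 0.5]
w_hidden = [[1.0, 1.0], [0.0, 0.0]]  # one row of input weights per hidden neuron
b_hidden = [0.0, 0.0]
w_out = [2.0, 3.0]
b_out = 1.0
prediction = forward(lagged_prices, w_hidden, b_hidden, w_out, b_out)
```

Training with backpropagation would adjust `w_hidden`, `b_hidden`, `w_out` and `b_out`; that step is not shown here.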
In this study, we follow an approach similar to that of Rosiek and Batlles (2010) and Haidar et al. (2008) to determine the number of hidden neurons. This approach consists of training and testing the network for a fixed number of iterations (1000 in our experiment), beginning with a small number of units and increasing it gradually until the optimal number of nodes is reached. In this work, we start with one hidden node and try up to a maximum of 20 hidden neurons. According to Table 1 and Figure 7, the best out-of-sample result (minimum $MSE = \frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{Target}_i - \mathrm{Output}_i\right)^2$) is obtained with 10 hidden neurons.

Nbre of hidden nodes   MSE value    Nbre of hidden nodes   MSE value
1                      1.8426294    11                     2.0841049
2                      1.8986168    12                     2.2036816
3                      1.9962819    13                     2.204476
4                      2.1632104    14                     2.9094379
5                      2.1104935    15                     1.8470987
6                      2.1625132    16                     2.5390865
7                      2.2531985    17                     1.8554263
8                      2.3961847    18                     2.7154749
9                      2.5702552    19                     2.3407171
10                     1.3171001    20                     1.7725174

Table 1. MSE statistics vs. the increase of hidden neurons
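The selection rule applied to Table 1 amounts to picking the hidden-layer size with the smallest out-of-sample MSE. The short sketch below reproduces that choice from the tabulated values; the variable names are illustrative, and the training runs that produced the MSE values are not reproduced.

```python
import math

# Out-of-sample MSE for each number of hidden nodes, copied from Table 1.
mse_by_hidden_nodes = {
    1: 1.8426294, 2: 1.8986168, 3: 1.9962819, 4: 2.1632104, 5: 2.1104935,
    6: 2.1625132, 7: 2.2531985, 8: 2.3961847, 9: 2.5702552, 10: 1.3171001,
    11: 2.0841049, 12: 2.2036816, 13: 2.204476, 14: 2.9094379, 15: 1.8470987,
    16: 2.5390865, 17: 1.8554263, 18: 2.7154749, 19: 2.3407171, 20: 1.7725174,
}

# Pick the topology with the minimum MSE.
best_nodes = min(mse_by_hidden_nodes, key=mse_by_hidden_nodes.get)
best_mse = mse_by_hidden_nodes[best_nodes]
best_rmse = math.sqrt(best_mse)  # square root of the selected MSE

print(best_nodes, best_mse)  # 10 1.3171001
```

The square root of the selected MSE is about 1.1476, which agrees with the RMSE reported in Table 2.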
Figure 7. MSE evolution vs. the increase of hidden neurons. [Plot: MSE against number of nodes.]
In conclusion, the proposed model is a backpropagation neural network with a single hidden layer of ten neurons, the hyperbolic tangent as activation function in the hidden layer and a linear transfer function (g2) in the output layer.

2.3. Empirical results and interpretation

Once the training process is completed, we assess the prediction quality of the ANN model by comparing the estimated (predicted) values with the real (actual/target) values of the crude oil price. Before this step, we must verify the quality of the ANN training. The performance of a trained network can be measured by the correlation coefficient (R-value) derived from the regression analysis presented in Fig. 8.
Figure 8. Comparison between the estimated and actual values of crude oil price: training part

In Fig. 8, the dashed line indicates the perfect fit (ANN outputs equal to targets). The circles represent the data points, and the solid blue line is the best linear fit between network responses and targets. In this empirical study, it is difficult to distinguish the best linear fit line from the perfect fit line because the fit is so good. The closer the R-value is to 1, the stronger the correlation between targets and outputs. In our study, the R-value (0.99972) is very close to 1, which indicates a good fit. After checking the quality of the network training, we analyse the prediction results using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) performance measures, together with the correlation coefficient (R-value) as a decisive factor in this specific problem (see Table 2).
MSE      RMSE     MAE      R
1.3171   1.1476   0.8740   0.9988

Table 2. Performance criteria

As accuracy is the most important criterion for judging forecasting models, we select two main metrics, RMSE and MAE, which can be expressed as follows:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{Target}_i - \mathrm{Output}_i\right)^2} \qquad (1)$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|\mathrm{Target}_i - \mathrm{Output}_i\right| \qquad (2)$$

where N = 1413 is the total number of observations in the checking sample (i = 1, …, 1413).
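Equations (1) and (2) can be sketched in a few lines. The targets and outputs below are hypothetical values chosen so that the arithmetic is easy to follow; they are not data from the paper.

```python
import math


def rmse(targets, outputs):
    """Root mean square error, as in equation (1)."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)


def mae(targets, outputs):
    """Mean absolute error, as in equation (2)."""
    n = len(targets)
    return sum(abs(t - o) for t, o in zip(targets, outputs)) / n


# Hypothetical sample: the errors are -1, 0 and +2.
targets = [100.0, 102.0, 104.0]
outputs = [101.0, 102.0, 102.0]
sample_rmse = rmse(targets, outputs)  # sqrt((1 + 0 + 4) / 3)
sample_mae = mae(targets, outputs)    # (1 + 0 + 2) / 3
```

Because squaring gives more weight to large errors, the RMSE of this sample (about 1.29) exceeds its MAE (1.0); the two are equal only when all absolute errors are identical.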
According to equations (1) and (2), the RMSE and MAE of the proposed neural network are 1.1476 and 0.8740, respectively. The MAE is the sum of the absolute differences between target and output values divided by the total number of observations in the test part; its value (0.8740) therefore reflects the noticeable accuracy of the neural network model. The RMSE is always larger than or equal to the MAE (Caner et al., 2011); in our case the RMSE (1.1476) is slightly larger than the MAE. This finding confirms the performance of the network in the forecasting task. Another indicator, the R-value, was chosen to verify this point. As in the regression analysis carried out in the learning phase, the same analysis was conducted in the testing phase (see Fig. 9).
Figure 9. Comparison between the estimated and actual values of crude oil price: checking part
The network responses are plotted against the targets in Fig. 9. The regression analysis returns three values. The first, x, is the slope of the best linear regression relating targets to network outputs; the slope would be 1 if there were a perfect fit (outputs exactly equal to targets). The second, y, is the intercept of that regression, which would be 0 for a perfect fit. The third is the R-value, which would be equal to 1 for a perfect fit. Our results (x = 1.0012, y = -0.5061 and R = 0.9988) show that the ANN is a good prediction model.
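The three quantities discussed above can be computed with an ordinary least-squares fit. The sketch below assumes that the network outputs are regressed on the targets (output = slope × target + intercept); the function name and the sample values are hypothetical.

```python
import math


def regression_fit(targets, outputs):
    """Least-squares slope and intercept of outputs on targets, plus Pearson R."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r_value = s_to / math.sqrt(s_tt * s_oo)
    return slope, intercept, r_value


# Hypothetical targets and network outputs, for illustration only.
targets = [100.0, 102.0, 104.0, 106.0]
outputs = [99.6, 101.7, 103.5, 105.8]
slope, intercept, r_value = regression_fit(targets, outputs)
```

A slope near 1, an intercept near 0 and an R-value near 1 together indicate that the outputs track the targets closely; a high R-value alone would not rule out a systematic offset.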
The following figure illustrates the comparison between actual and forecasted crude oil prices.

Figure 10. A plot of actual and forecasted crude oil price: checking part. [Plot: WTI crude oil price (US$ per barrel) against the checking part (1413 observations); series: predictive value and real value.]
In order to illustrate our empirical results, we arbitrarily selected 30 consecutive trading days from our test sample (see Table 3).

Date            Predictive value   Real value   Error
Feb 13, 2012    100.2859769        100.318       0.03202314
Feb 14, 2012    100.6994656        100.808       0.1085344
Feb 15, 2012    101.1467217        101.726       0.5792783
Feb 16, 2012    102.3083843        102.824       0.51561568
Feb 17, 2012    103.9067445        103.858      -0.04874452
Feb 21, 2012    105.0155665        104.982      -0.03356653
Feb 22, 2012    105.7721108        106.394       0.62188922
Feb 23, 2012    106.8475541        107.438       0.59044593
Feb 24, 2012    108.1399367        107.58       -0.55993669
Feb 27, 2012    108.339128         107.798      -0.54112803
Feb 28, 2012    108.6459273        108.062      -0.58392733
Feb 29, 2012    109.0106511        107.52       -1.49065113
Mar 01, 2012    108.2547535        107.162      -1.09275348
Mar 02, 2012    107.7618855        106.786      -0.97588546
Mar 05, 2012    107.2833911        106.602      -0.68139107
Mar 06, 2012    107.0700915        106.18       -0.8900915
Mar 07, 2012    106.6394228        106.324      -0.3154228
Mar 08, 2012    106.777172         106.252      -0.52517198
Mar 09, 2012    106.707126         106.65       -0.05712604
Mar 12, 2012    107.1242766        106.516      -0.60827661
Mar 13, 2012    106.9756504        106.224      -0.75165041
Mar 14, 2012    106.6805212        106.15       -0.53052119
Mar 15, 2012    106.6118926        106.5        -0.11189259
Mar 16, 2012    106.9584571        106.296      -0.66245707
Mar 19, 2012    106.7496508        106.572      -0.17765077
Mar 20, 2012    107.03676          106.53       -0.50676002
Mar 21, 2012    106.9907919        106.41       -0.58079191
Mar 22, 2012    106.8639597        106.206      -0.65795974
Mar 23, 2012    106.6636039        106.534      -0.12960394
Mar 26, 2012    106.9951347        106.24       -0.75513473

Table 3. The difference between actual and forecasted WTI crude oil price for 30 consecutive trading days
According to the above table, the neural network forecasts track the actual prices closely: the largest absolute difference between actual and forecast values over these 30 days is about 1.5 US$ (1.49 US$ on February 29, 2012).
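The figure quoted above can be checked directly from the Error column of Table 3 (real value minus predictive value). The snippet below is a simple arithmetic check on the tabulated values; the variable names are illustrative.

```python
# Error column of Table 3 (real value minus predictive value), in table order.
errors = [
    0.03202314, 0.1085344, 0.5792783, 0.51561568, -0.04874452, -0.03356653,
    0.62188922, 0.59044593, -0.55993669, -0.54112803, -0.58392733, -1.49065113,
    -1.09275348, -0.97588546, -0.68139107, -0.8900915, -0.3154228, -0.52517198,
    -0.05712604, -0.60827661, -0.75165041, -0.53052119, -0.11189259, -0.66245707,
    -0.17765077, -0.50676002, -0.58079191, -0.65795974, -0.12960394, -0.75513473,
]

# Largest deviation in absolute value over the 30 days.
max_abs_error = max(abs(e) for e in errors)

# A negative error means the forecast was above the real value on that day.
over_predictions = sum(1 for e in errors if e < 0)

print(round(max_abs_error, 2), over_predictions)  # 1.49 24
```

In this window the forecast lies above the real value on 24 of the 30 days, so the errors are small but not symmetric around zero.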
These findings indicate a strong ability of the neural network model to predict the crude oil market, whereas an efficient market would exhibit no predictability. Therefore, the crude oil market is inefficient in the Fama sense.
Following Shambora and Rossiter (2007), the preferred way of testing predictability is to see whether the model would have been more profitable than a naïve model such as the random walk (RW). To verify the conclusions drawn from the neural network analysis, we carry out an analysis of profitability. To compute profitability, we took each day's prediction and made a "trade" based on it. For example, if the model predicts a down day, the investor sells one unit of oil short. We add each day's profit (or loss) in percentage terms to obtain the total percentage gain or loss. We did this for each of the trading strategies (see Table 4).
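The trading rule described above can be sketched as follows. Several details are assumptions, because the text does not spell them out: a forecast above today's price is treated as a buy signal for one unit, the daily return is measured relative to today's price, no position is taken when the forecast equals today's price, and transaction costs are ignored. The rule used for the RW benchmark is not specified in the text and is therefore not implemented. All names and numbers are hypothetical.

```python
def strategy_return(prices, next_day_forecasts):
    """Sum of daily percentage returns from trading on next-day forecasts.

    prices[t] is the price on day t; next_day_forecasts[t] is the forecast
    of prices[t + 1] made on day t.
    """
    total = 0.0
    for t in range(len(prices) - 1):
        forecast = next_day_forecasts[t]
        if forecast > prices[t]:
            position = 1    # predicted up day: buy one unit
        elif forecast < prices[t]:
            position = -1   # predicted down day: sell one unit short
        else:
            position = 0    # no signal: stay out of the market
        daily_return = (prices[t + 1] - prices[t]) / prices[t]
        total += position * daily_return
    return total


# Hypothetical example: an up day correctly predicted, then a down day correctly predicted.
prices = [100.0, 110.0, 99.0]
forecasts = [105.0, 100.0]
total_return = strategy_return(prices, forecasts)  # 0.10 + 0.10 = 0.20, i.e. 20%
```

A short position earns the negative of the daily return, so a correctly predicted fall contributes a gain, while a wrong signal in either direction contributes a loss.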
Trading strategy   2008      2009     2010     2011     2012     2013      Total period (2008-2013)
ANN                 0.9852   1.2123   1.5698   0.9635   1.1114    0.8457   6.6879
RW                 -0.1669   0.5234   0.8126   0.7391   0.3335   -0.2487   1.9930

Table 4. Analysis of profitability (%)
According to Table 4, the active ANN-based strategy far outperformed the naïve trading strategy: the profitability over the whole test period equals 668.79% for ANN against 199.30% for RW (the Table 4 totals of 6.6879 and 1.9930). Moreover, examining profitability on a year-by-year basis, we find no losing years for ANN but two losing years (2008 and 2013) for RW.
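The totals and the count of losing years can be verified from the yearly figures in Table 4. The snippet below is an arithmetic check on the reported values only; the variable names are illustrative.

```python
# Yearly profitability for 2008-2013, copied from Table 4.
yearly = {
    "ANN": [0.9852, 1.2123, 1.5698, 0.9635, 1.1114, 0.8457],
    "RW": [-0.1669, 0.5234, 0.8126, 0.7391, 0.3335, -0.2487],
}

# Total over the whole test period for each strategy.
totals = {name: sum(values) for name, values in yearly.items()}

# Number of years with a negative result for each strategy.
losing_years = {
    name: sum(1 for value in values if value < 0) for name, values in yearly.items()
}

print({name: round(total, 4) for name, total in totals.items()})
print(losing_years)
```

The yearly values sum to 6.6879 for ANN and 1.9930 for RW, matching the totals in the last column of Table 4.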
Overall, we conclude that the ANN performs best in terms of both predictability and profitability; we can therefore reject the EMH, given the presence of arbitrage opportunities in crude oil markets.
3. Conclusion
In this paper, we applied an ANN to test the EMH. As inputs, we used a smoothed crude oil price time series in order to reduce noise effects. Moreover, we determined the optimal ANN design to obtain the best prediction results. In terms of predictability, the model shows high accuracy; therefore, we can reject the EMH. On the other hand, the network responses (the forecasted prices) are treated as trading signals (buy or sell), which are used in the profitability analysis. Our empirical results show strong evidence of short-term predictability in crude oil price variations, and the weak-form EMH cannot be verified. These findings are consistent with the results of Elder and Serletis (2008), Alvarez-Ramirez et al. (2008) and Shambora and Rossiter (2007), who find evidence of oil price predictability over short time horizons.
References
Alvarez-Ramirez, J., Alvarez, J., Rodriguez, E. (2008). Short-term predictability of crude oil markets: a detrended fluctuation analysis approach. Energy Economics, 30, 2645-2656.
Alvarez-Ramirez, J., Alvarez, J., Solis, R. (2010). Crude oil market efficiency and modeling: Insights from the multiscaling autocorrelation pattern. Energy Economics, 32, 993-1000.
Arouri, M.H., Dinh, T.H., Nguyen, D.K. (2010). Time-varying predictability in crude-oil markets: The case of GCC countries. Energy Policy, 38, 4371-4380.
Belaire-Franch, J., Contreras, D. (2004). Ranks and signs-based multiple variance ratio tests. Working paper, Department of Economic Analysis, University of Valencia.
Caner, M., Gedik, E., Keçebaş, A. (2011). Investigation on thermal performance calculation of two type solar air collectors using artificial neural network. Expert Systems with Applications, 38(3), 1668-1674.
Charles, A., Darné, O. (2009). The efficiency of the crude oil markets: Evidence from variance ratio tests. Energy Policy, 37, 4267-4272.
Elder, J., Serletis, A. (2008). Long memory in energy futures prices. Review of Financial Economics, 17, 146-155.
Fama, E.F. (1970). Efficient capital markets: a review of theory and empirical work. Journal of Finance, 25, 383-417.
Fama, E.F. (1991). Efficient capital markets: II. Journal of Finance, 46, 1575-1617.
Ghaffari, A., Zare, S. (2009). A novel algorithm for prediction of crude oil price variation based on soft computing. Energy Economics, 31, 531-536.
Haidar, I., Kulkarni, S., Pan, H. (2008). Forecasting model for crude oil prices based on artificial neural networks. Proceedings of the International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2008), 103-108.
Jammazi, R., Aloui, C. (2012). Crude oil price forecasting: Experimental evidence from wavelet decomposition and neural network modeling. Energy Economics, 34(3), 828-841.
Kaastra, I., Boyd, M. (1996). Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10, 215-236.
Kim, J.H. (2006). Wild bootstrapping variance ratio tests. Economics Letters, 92, 38-43.
Kulkarni, S., Haidar, I. (2009). Forecasting model for crude oil price using artificial neural networks and commodity futures prices. International Journal of Computer Science & Information Security, 2(1), 81-88.
Lackes, R., Börgermann, C., Dirkmorfeld, M. (2009). Forecasting the Price Development of Crude Oil with Artificial Neural Networks. Lecture Notes in Computer Science, 5518, 248-255.
Maslyuk, S., Smyth, R. (2008). Unit root properties of crude oil spot and futures prices. Energy Policy, 36, 2591-2600.
Ortiz-Cruz, A., Rodriguez, E., Ibarra-Valdez, C., Alvarez-Ramirez, J. (2012). Efficiency of crude oil markets: Evidences from informational entropy analysis. Energy Policy, 41, 365-373.
Rosiek, S., Batlles, F.J. (2010). Modelling a solar-assisted air-conditioning system installed in CIESOL building using an artificial neural network. Renewable Energy, 35(12), 2894-2901.
Shambora, W.E., Rossiter, R. (2007). Are there exploitable inefficiencies in the futures market for oil? Energy Economics, 29, 18-27.
Tabak, B.M., Cajueiro, D.O. (2007). Are the crude oil markets becoming weakly efficient over time? A test for time-varying long-range dependence in prices and volatility. Energy Economics, 29, 28-36.
Wright, J.H. (2000). Alternative variance-ratio tests using ranks and signs. Journal of Business and Economic Statistics, 18, 1-9.
Yonaba, H., Anctil, F., Fortin, V. (2010). Comparing sigmoid transfer functions for neural network multistep ahead streamflow forecasting. Journal of Hydrologic Engineering, 15(4), 275-283.
Zhang, B. (2013). Are the crude oil markets becoming more efficient over time? New evidence from a generalized spectral test. Energy Economics, 40, 875-881.
Zhang, G., Patuwo, E.B., Hu, M.Y. (1998). Forecasting with artificial neural network: The state of the art. International Journal of Forecasting, 14, 35-62.