Predicting ForeX values using Reservoirs

Bas van Stein, LIACS s0800279, University of Leiden, [email protected]

Tom Groentjes, LIACS s0231347, University of Leiden, [email protected]

December 7, 2011

Abstract

In this paper we show the results of the reservoir computing algorithm [1] that we implemented in Python [2] using the Oger [3] toolbox. We tested our algorithm with different parameters and datasets, and we present and explain the results.

1 Introduction

For the assignment we had to implement a reservoir with the Oger toolbox in Python. We used an example python program from the toolbox, the Mackey-Glass example, as a starting point. Reservoirs are a kind of artificial neural network that can learn time series; they are very good at learning signals with a rather fixed period.

2 Implementation

The first big challenge was to understand what the python example was doing and where and how it got its data. After we understood the example program, we began creating our FX data set. We took the EUR/USD set from the Forex website with the timestep set at one minute and 2000 points. Because we wanted sets bigger than 2000 points, we merged a few sets with adjacent time slices. We also tried to use a day and an hour as timestep, but then we were not able to get much more than 2000 points, with the disadvantage that some gaps occurred in the timeseries.

We pruned the data by deleting the time information and preserving only the closing values.
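A minimal sketch of this pruning step is shown below. The file names and the position of the closing-value column are assumptions for illustration, not taken from our actual script:

import csv

# Hypothetical pruning step: assumes the raw export has one row per
# timestep with columns (date, time, open, high, low, close), so the
# closing value sits at index 5.
def prune_closing_values(raw_file='EURUSD-raw.csv',
                         out_file='2011-testset.csv', close_column=5):
    fin = open(raw_file, 'r')
    fout = open(out_file, 'w')
    try:
        for row in csv.reader(fin):
            # Drop the time information, keep only the closing value.
            fout.write(row[close_column] + '\n')
    finally:
        fin.close()
        fout.close()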

We finally chose to feed the algorithm a time series of 4000 points. The data ranges from 6-15-2011 to 6-18-2011, and we used a test set of 400 points that comes after the time slice of the training set. The results then looked reasonable, however far from optimal. We also tried to tune many of the algorithm's variables, such as the grid-search parameters and the size of the reservoir; however, the grid-search optimization takes some time, so we could not try very large values.

We also tried to feed the reservoir with data from timeseries in July 2008, 2009 and 2010 and then run the test on 2011, but that did not seem to work, since the values differ too much from each other.
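In outline, the model we tuned is built as follows. This is a condensed sketch of the full listing in the appendix, using the parameter values we ended up with:

import Oger

# A leaky-integrator reservoir of 500 neurons feeds a ridge-regression readout.
reservoir = Oger.nodes.LeakyReservoirNode(output_dim=500, leak_rate=0.4,
                                          input_scaling=.1, bias_scaling=.2,
                                          reset_states=False)
readout = Oger.nodes.RidgeRegressionNode()
# Discard the first 400 timesteps when training the readout.
Oger.utils.enable_washout(Oger.nodes.RidgeRegressionNode, 400)
# For the last 700 timesteps the flow feeds its own predictions
# back as input instead of the real signal.
flow = Oger.nodes.FreerunFlow([reservoir, readout], freerun_steps=700)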

Our program can be run with the following command in the terminal, provided that the datasets are in the same folder:

> python FXreservoirBT.py

3 Results

[Figure 1: Results. (a) Mackey Glass dataset; (b) 4000-minute data with 50 nodes; (c) 4000-minute data with 500 nodes.]


[Figure 2: Results continued. (a) 4000-day data with 500 nodes.]

We also ran some tests altering the leak parameter and the grid-search parameters, but the results were all so bad that we did not include them in the report.
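For reference, such a sweep is set up by extending the grid-search dictionary passed to Oger's Optimizer. The leak_rate range below is illustrative (our submitted code only searched over the readout's ridge_param, see the appendix), and reservoir, readout, flow, train_signals, training_sample_length and freerun_steps are assumed to be defined as in the appendix listing:

import scipy
import Oger

# Hypothetical sweep over both the leak rate and the ridge parameter.
gridsearch_parameters = {reservoir: {'leak_rate': scipy.arange(0.1, 1.0, 0.1)},
                         readout: {'ridge_param': 10 ** scipy.arange(-5, 3, .3)}}
# Score each setting by the NRMSE over the freerun part of the training signal.
loss_function = Oger.utils.timeslice(range(training_sample_length - freerun_steps,
                                           training_sample_length),
                                     Oger.utils.nrmse)
opt = Oger.evaluation.Optimizer(gridsearch_parameters, loss_function)
opt.grid_search([[], train_signals], flow,
                cross_validate_function=Oger.evaluation.leave_one_out)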

4 Conclusion

We can conclude that the reservoir algorithm we used is not very suitable for stock predictions. The number of nodes in the network did not seem to matter much for the results, but the amount of data and the time steps are very important for the outcome. We can see that when using minutes as timestep, the algorithm cannot really detect any periodic behavior. The program performed best on our daily test set from 2011. The prediction we make roughly follows the actual target signal, but only as an approximation of the global direction of the target.

5 Appendix

In this section you will find the code of our reservoir.

import Oger
import pylab
import scipy
import csv
import sys
import mdp
import random


def forex_euro(sample_len=2000, test=0):
    '''
    forex_euro(sample_len=2000, test=0) -> samples

    Get the Forex Euro/Dollar time-series.
    Parameters are:
      - sample_len: length of the time-series in timesteps. Default is 2000.
      - test: when 0, return the training samples (the raw series plus two
        noisy copies); when 1, return only the test sample.
    '''
    samples = []
    # Training and test data are read from the same file; the test call
    # simply reads more points (4700 instead of 4000).
    if test > 0:
        f = open('2011-testset.csv', 'r')
    if test < 1:
        f = open('2011-testset.csv', 'r')
    teller = 1
    teller2 = 0
    inp0 = mdp.numx.zeros((sample_len, 1))
    inp1 = mdp.numx.zeros((sample_len, 1))
    inp2 = mdp.numx.zeros((sample_len, 1))
    inp3 = mdp.numx.zeros((sample_len, 1))
    try:
        reader = csv.reader(f)
        for row in reader:
            if teller > 0 and teller < sample_len + 1:
                inp0[teller2] = float(row[0])
                if test < 1:
                    # inp0[teller2] = (float(row[0]) - 0.72) * 10
                    # Two extra copies of the signal with a little noise added.
                    inp1[teller2] = inp0[teller2] + (random.random() - 0.5) * 0.001
                    inp2[teller2] = inp0[teller2] + (random.random() - 0.5) * 0.0001
                    # inp3[teller2] = float(row[0])
                teller2 = teller2 + 1
            teller = teller + 1
        samples.append([inp0])
        if test < 1:
            samples.append([inp1])
            samples.append([inp2])
            # samples.append([inp3])
    finally:
        f.close()
    return samples


if __name__ == "__main__":
    freerun_steps = 700
    begin_training = 0
    begin_test = 0
    training_sample_length = 4000
    test_sample_length = 4700
    n_training_samples = 3

    print 'Creating training set...'
    train_signals = forex_euro(sample_len=training_sample_length, test=0)
    print 'done.'
    print 'Creating testing set...'
    # train_signals = Oger.datasets.mackey_glass(sample_len=training_sample_length,
    #                                            n_samples=n_training_samples)
    # test_signals = Oger.datasets.mackey_glass(sample_len=test_sample_length,
    #                                           n_samples=1)
    test_signals = forex_euro(sample_len=test_sample_length, test=1)
    print 'done.'

    print 'Creating reservoir...'
    reservoir = Oger.nodes.LeakyReservoirNode(output_dim=500, leak_rate=0.4,
                                              input_scaling=.1, bias_scaling=.2,
                                              reset_states=False)
    readout = Oger.nodes.RidgeRegressionNode()
    Oger.utils.enable_washout(Oger.nodes.RidgeRegressionNode, 400)
    print 'done.'
    # readout.ridge_param = 0.0125892541179  # 316.227766017
    flow = Oger.nodes.FreerunFlow([reservoir, readout], freerun_steps=freerun_steps)
    gridsearch_parameters = {readout: {'ridge_param': 10 ** scipy.arange(-5, 3, .3)}}

    # Instantiate an optimizer; the loss is the NRMSE over the freerun part.
    loss_function = Oger.utils.timeslice(range(training_sample_length - freerun_steps,
                                               training_sample_length),
                                         Oger.utils.nrmse)
    opt = Oger.evaluation.Optimizer(gridsearch_parameters, loss_function)
    print 'optimizing...'
    # Do the grid search
    opt.grid_search([[], train_signals], flow,
                    cross_validate_function=Oger.evaluation.leave_one_out)
    print 'grid search optimizing'

    # Get the optimal flow and run cross-validation with it
    opt_flow = flow  # opt.get_optimal_flow(verbose=True)

    print 'Freerun on test_signals signal with the optimal flow...'
    opt_flow.train([[], train_signals])
    freerun_output = opt_flow.execute(test_signals[0][0])

    pylab.plot(scipy.concatenate((test_signals[0][0][-2 * freerun_steps:])))
    pylab.plot(scipy.concatenate((freerun_output[-2 * freerun_steps:])))
    pylab.xlabel('Timestep')
    pylab.legend(['Target signal', 'Predicted signal'])
    # Mark where the freerun (prediction) part of the plot starts.
    pylab.axvline(pylab.xlim()[1] - freerun_steps + 1, pylab.ylim()[0],
                  pylab.ylim()[1], color='r')
    print opt_flow[1].ridge_param
    pylab.show()
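The loss minimized by the grid search is Oger.utils.nrmse, restricted via Oger.utils.timeslice to the freerun part of the training signal. As a reference, here is a minimal sketch of the NRMSE under one common definition (the RMSE normalized by the standard deviation of the target; we assume Oger's implementation uses a comparable normalization):

import scipy

def nrmse_sketch(prediction, target):
    # Root-mean-square error normalized by the standard deviation of the
    # target; lower is better, 0 means a perfect reproduction of the target.
    error = prediction - target
    return scipy.sqrt(scipy.mean(error ** 2)) / scipy.std(target)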

References

[1] Mantas Lukosevicius and Herbert Jaeger. Reservoir Computing Approaches to Recurrent Neural Network Training. Computer Science Review, 2009, pp. 127-149.

[2] Python Programming Language, 2009. Available from: http://www.python.org

[3] Reservoir computing - Oger toolbox, 2011. Available from: http://organic.elis.ugent.be/printpdf/book/export/html/265
