Monthly Rainfall and Runoff time series analysis

Applied Time Series Analysis Project

Name: Haibin Li

Dept: Environmental Sciences

May/11/2003

1. Introduction

Correctly forecasting rainfall and runoff remains an active and difficult problem in both hydrology and atmospheric science, despite the effort devoted to it. The development of semi-distributed and distributed hydrological models has undoubtedly improved predictability and helped us understand the underlying processes. Unfortunately, these models require many parameterizations and a large amount of observational data, which to some extent limits them to water-balance stations where such data are available. Here, as an attempt, I use time series models in the hope that estimation becomes comparatively simple while preserving considerable precision. I also use time series models to examine whether the interdependence of runoff changes between periods in which the human influence on climate and the water cycle differs. The data are described next. Sections 2 and 3 give the univariate analyses of the rainfall and runoff series, Section 4 explores a multivariate time series model for the rainfall and runoff data, and Section 5 presents several conclusions.

Figa. Schematic map of the Wuding River Basin (37°N~39°N, 108°E~111°E) and the Baijiachuan station

The rainfall and runoff data used here come from the Baijiachuan hydrologic station in the Wuding River Basin (available at http://www.envsci.rutgers.edu/~hli/ts.html). The Wuding River is a tributary of the Yellow River and is characterized by a high sediment load. The data span 1961~1997 and consist of monthly rainfall (units: mm) and runoff (units: m3/s). In this area, over 80% of the precipitation falls in summer and autumn, while the seasonal distribution of runoff is comparatively even because of the high share of groundwater recharge (around 50%). Figb. gives the monthly mean, maximum and minimum values of rainfall (right) and runoff (left).

[Figb. panels: "Basic statistics for Monthly Runoff" (y-axis: Runoff, m3/s) and "Basic statistics for Monthly Rainfall" (y-axis: Rainfall, mm); x-axis: month 1 to 12; series: Mean, Min, Max]

Figb. Left: Monthly statistics for runoff, right: the same as left but for rainfall

2. Rainfall Data Analysis

The original monthly rainfall series is shown in Fig1. To make the data more stationary, that is, less seasonally variable, a cubic transformation was applied. The transformed rainfall loosely follows a normal distribution, although a tail remains at the lower end, and can be treated as approximately stationary.

Fig1. Top-left: the original monthly rainfall data (units: mm). Top-right: the data after the cubic transformation. Middle-left and middle-right: acf and pacf of the transformed data; a seasonal cycle clearly remains after the transformation. Bottom-left: QQ-plot of the transformed data. Bottom-right: spectrum of the transformed data, with a significant peak at a frequency of about 0.08 cycles per month (1/12 ≈ 0.083), i.e., the 12-month seasonal cycle.
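The annual peak read off the spectrum can be located with a raw periodogram. The following is only a minimal Python sketch: the function name `periodogram` and the synthetic cosine series are assumptions made for illustration, not the Baijiachuan record, and the spectrum in Fig1. is a smoothed estimate rather than this raw one.

```python
import cmath
import math


def periodogram(x):
    """Return (frequencies, power) of the raw periodogram of a sequence.

    Frequencies are in cycles per sampling step (here: cycles per month).
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the mean so frequency 0 does not dominate
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        # discrete Fourier transform coefficient at frequency k / n
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coef) ** 2 / n)
    return freqs, power


# Synthetic stand-in: 10 years of monthly values with a pure annual cycle.
series = [50 + 40 * math.cos(2 * math.pi * t / 12) for t in range(120)]
freqs, power = periodogram(series)
peak_freq = freqs[power.index(max(power))]
print("peak frequency:", round(peak_freq, 4), "period (months):", round(1 / peak_freq, 1))
```

The peak falls at 1/12 cycles per month, which is the value that rounds to the 0.08 quoted for the rainfall spectrum.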

The smoothed spectrum in Fig1. shows a significant peak at a frequency of about 0.08 cycles per month (1/12 ≈ 0.083), i.e., a period of 12 months. The rainfall data therefore contain an annual cycle, which is reasonable as a consequence of natural atmospheric forcing. To remove the seasonal cycle, a classical method (Bras et al. 1985) was first used to normalize the data by the monthly values:

$$Z(t) = \frac{X(t) - \mu(t)}{\sigma(t)} \qquad (1)$$

where $\mu(t)$ and $\sigma(t)$ are the estimated monthly mean and standard deviation, respectively.
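The normalization in equation (1) can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `standardize_by_month` is hypothetical, the series is assumed to start in month 1 with no missing values, the sample standard deviation (n - 1 denominator) is assumed, and the input values are made up.

```python
import math


def standardize_by_month(x):
    """Apply Z(t) = (X(t) - mu(t)) / sigma(t) with one mu, sigma per calendar month."""
    stats = {}
    for m in range(12):
        values = x[m::12]  # all observations of calendar month m
        mu = sum(values) / len(values)
        # sample variance, n - 1 in the denominator (an assumption)
        var = sum((v - mu) ** 2 for v in values) / (len(values) - 1)
        stats[m] = (mu, math.sqrt(var))
    z = []
    for t, v in enumerate(x):
        mu, sigma = stats[t % 12]
        z.append((v - mu) / sigma)
    return z


# Made-up 3-year monthly series whose level and spread both grow with the month index.
x = [(10 + year) * (m + 1) for year in range(3) for m in range(12)]
z = standardize_by_month(x)
print(z[0], z[12], z[24])  # the three January values after standardization
```

After this step every calendar month has zero mean and unit sample variance, so the seasonal cycle in both level and spread is removed.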

The diagnostic plots (Fig2.) indicate that the MA(3) process finally selected by AIC fits reasonably well. However, the predictions converge to the global mean after only 3 or 4 points, as expected for an MA(3) process, whose forecasts equal the process mean beyond three steps ahead. The forecasts are therefore of little practical interest.

Fig2. Left: diagnostic plots for the MA(3) model fitted to the normalized data. Right: predictions from the fitted MA(3) model; after a few points the predictions converge to the global mean.
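The quick convergence of the predictions follows from the structure of a moving-average model: a forecast uses only the last q estimated innovations, so beyond q steps it equals the process mean. A minimal Python sketch, assuming the sign convention X(t) = mu + e(t) + theta_1 e(t-1) + ... + theta_q e(t-q); the coefficients and residuals below are illustrative values, not the fitted ones of this report.

```python
def ma_forecast(mu, theta, last_resid, horizon):
    """h-step forecasts of X(t) = mu + e(t) + theta[0] e(t-1) + ... + theta[q-1] e(t-q).

    last_resid holds the q most recent estimated innovations, newest last.
    Future innovations are replaced by their expectation, zero.
    """
    q = len(theta)
    forecasts = []
    for h in range(1, horizon + 1):
        value = mu
        for j in range(h, q + 1):
            # e(t + h - j) is already observed only when j >= h
            value += theta[j - 1] * last_resid[q - 1 - (j - h)]
        forecasts.append(value)
    return forecasts


# Illustrative MA(3): hypothetical coefficients and residuals.
f = ma_forecast(mu=0.0, theta=[0.5, -0.3, 0.2], last_resid=[1.0, -2.0, 0.5], horizon=5)
print([round(v, 2) for v in f])
```

The first three forecasts differ from the mean; from the fourth step on the forecast is exactly the mean.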

Differencing was then tried, both to remove the seasonal cycle and to make predictions. Guided by the spectrum of the transformed data in Fig1., a difference at lag 12 was used to remove the seasonal component. The acf and pacf of the differenced data suggest that an MA process or a mixed ARMA process may be appropriate. After trying different combinations of p and q, an MA(3) process was again selected. The equations and coefficients are as follows:

$$D(t) = \mathrm{Rain}(t) - \mathrm{Rain}(t-12)$$

$$D(t) = e(t) - 0.9054565\,e(t-1) - 0.1425365\,e(t-2) + 0.965601\,e(t-3) \qquad (2)$$
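The lag-12 difference in the first line of equation (2) can be sketched in Python as below. The function name `seasonal_difference` and the monthly pattern are assumptions for illustration only.

```python
def seasonal_difference(x, lag=12):
    """Return D(t) = x(t) - x(t - lag) for t = lag, ..., len(x) - 1."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Made-up series: a fixed monthly pattern plus a drift of 2 units per year.
pattern = [5, 8, 12, 20, 35, 60, 110, 100, 55, 25, 10, 6]
x = [pattern[t % 12] + 2 * (t // 12) for t in range(48)]
d = seasonal_difference(x)
print(len(x), len(d), set(d))
```

The repeating monthly pattern cancels exactly and only the year-to-year change remains. The price is that the first 12 observations are lost.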

The diagnostic plots and the residual spectrum are given in Fig3. They indicate that the selected model is good and that the residuals are approximately white noise. Fig4. then shows the fitted values against the original process, together with some predictions from the fitted MA(3) model. The model generally reproduces the seasonal cycle of the original data (Fig4.), although it is less variable than the measurements; that is, it still cannot represent the extreme values well. Such extremes are hard to simulate because they are usually caused by short-lived external forcing, for example unusual changes in temperature or water vapor. Some improvement could certainly be made with other related forcing variables, such as the corresponding radiation and temperature data; unfortunately these were not available for my experiment. The predictions also seem reasonable and fairly good.

Fig3. Left: diagnostic plots of the MA(3) model fitted to the differenced transformed data. Right: spectrum of the residuals of the MA(3) model.

Fig4. The fitted process and the predictions from the MA(3) model given by equation (2)

3. Runoff Data

Fig5. gives the runoff time series. At first sight there appears to be a change point. A Mann-Kendall test spotted a change point around 1972 (Wang, 2003), after which the monthly mean runoff is nearly 15 m3/s lower than before. To simulate the runoff process, the original data were divided into two periods: the period before 1972 (the base period) and the period from 1972 to 1997 (the change period). The spectrum is interesting: unexpectedly, a 6-month cycle is more prominent than the 12-month annual cycle, which is hard to explain. There is also a 3-month peak, which is a harmonic. I first tried a 6-month difference, which proved insufficient to remove the seasonal cycle (Fig6. left panel). The usual 12-month differencing was then used, and the acf and pacf (Fig6. right panel) imply that it is adequate to remove the seasonal cycle. I also noticed that, on closer inspection, the spectrum of the change-period series (runoff2) looks much like that of a white noise process apart from the few dominant peaks. The acf and pacf suggest that an AR process might apply to the base period, while the change period behaves as pure white noise. After differencing, runoff2 does appear to be white noise, so there is no point in modeling it. I therefore concentrate on modeling the base period. An AR(13) process was finally selected as the model with the minimum AIC value:

$$D(t) = \mathrm{Runoff1}(t) - \mathrm{Runoff1}(t-12)$$

$$\begin{aligned} D(t) = {} & -0.61 + 0.38\,D(t-1) - 0.02\,D(t-2) + 0.03\,D(t-3) + 0.01\,D(t-4) + 0.06\,D(t-5) \\ & - 0.04\,D(t-6) + 0.07\,D(t-7) - 0.07\,D(t-8) - 0.07\,D(t-9) - 0.11\,D(t-10) \\ & + 0.04\,D(t-11) - 0.66\,D(t-12) + 0.45\,D(t-13) + e(t) \end{aligned} \qquad (3)$$

In equation (3), only the values at lag 1, lag 12 and lag 13 contribute much to the fitted or predicted values; the coefficients at the other lags are not significant. The diagnostic plots in Fig7. (left) indicate that the model is adequate, and the right panel of Fig7. plots the fitted values against the observations. The model reproduces the correct seasonal cycle but cannot capture the very high peaks, and after a very high peak it usually overestimates the observations.
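The dominance of lags 1, 12 and 13 can be made concrete by splitting a one-step prediction into its per-lag terms. The Python sketch below uses the coefficients printed in equation (3); the function names and the input history (13 equal values of the differenced runoff) are hypothetical and serve only to compare the lags on an equal footing.

```python
# Coefficients of equation (3): intercept and lags 1..13 of the differenced runoff.
INTERCEPT = -0.61
PHI = [0.38, -0.02, 0.03, 0.01, 0.06, -0.04, 0.07,
       -0.07, -0.07, -0.11, 0.04, -0.66, 0.45]


def ar_one_step(history, intercept=INTERCEPT, phi=PHI):
    """Predict D(t) from the last len(phi) values of D, newest last."""
    p = len(phi)
    recent = history[-p:]
    # phi[k] multiplies D(t - k - 1), which is recent[p - 1 - k]
    return intercept + sum(phi[k] * recent[p - 1 - k] for k in range(p))


def lag_contributions(history, phi=PHI):
    """Return {lag: phi_lag * D(t - lag)} to show which lags drive the prediction."""
    p = len(phi)
    recent = history[-p:]
    return {k + 1: phi[k] * recent[p - 1 - k] for k in range(p)}


history = [1.0] * 13  # hypothetical inputs, not observed data
prediction = ar_one_step(history)
contrib = lag_contributions(history)
top3 = sorted(contrib, key=lambda lag: abs(contrib[lag]), reverse=True)[:3]
share = sum(abs(contrib[lag]) for lag in top3) / sum(abs(v) for v in contrib.values())
print(round(prediction, 2), sorted(top3), round(share, 2))
```

With equal-sized inputs, lags 1, 12 and 13 account for roughly three quarters of the total absolute contribution.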

Fig5. Top: time series of runoff (units: m3/s). Bottom-left and bottom-right: spectra for the base period and the change period

Fig6. The top panel is the acf and pacf for base period, the bottom panel is the acf and pacf for change period

Fig7. Left: Diagnostic plots for the AR(13) model, right: fitted values vs. observations

4. Bivariate Analysis of the rainfall and runoff

In the hope of improving the runoff simulation by exploiting the possible interdependence of rainfall and runoff, the rainfall data were also divided into two parts corresponding to the two runoff periods. The ccf plots (Fig8.) imply that, for both the base period and the change period, runoff and rainfall are related only at lag 0. Bootstrap methods lead to the same conclusion (not shown here). The coherence and phase plots give no further hints.

Fig8. Top-left: acf and ccf for Runoff (RO) and Rain (R) for the base period. Top-right: the same for the change period. Bottom-left: coherence and phase of RO and R for the base period. Bottom-right: the same for the change period.
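The cross-correlation function behind such plots can be sketched as follows. This is a minimal Python illustration under assumptions: the function name `cross_correlation` is hypothetical, and the two series are synthetic, built so that "runoff" responds to "rain" in the same time step only. They are not the Baijiachuan data.

```python
import math
import random


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x(t + lag) and y(t).

    A positive lag means x follows y by `lag` steps.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    if lag >= 0:
        pairs = [(x[t + lag], y[t]) for t in range(n - lag)]
    else:
        pairs = [(x[t], y[t - lag]) for t in range(n + lag)]
    cov = sum((a - mx) * (b - my) for a, b in pairs) / n
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the example is reproducible
rain = [rng.gauss(0, 1) for _ in range(200)]
runoff = [2 * r + 0.5 * rng.gauss(0, 1) for r in rain]  # same-step response only

ccf = {lag: cross_correlation(runoff, rain, lag) for lag in range(-3, 4)}
best_lag = max(ccf, key=lambda lag: abs(ccf[lag]))
print(best_lag, round(ccf[0], 2))
```

Only lag 0 stands out in this synthetic case, which is the pattern the ccf plots show for both periods.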

Different models were tried to model the runoff with the rainfall data. For the base period, the following ARMAX model was finally fitted:

$$D(t) = \mathrm{Runoff1}(t) - \mathrm{Runoff1}(t-12)$$

$$D_1(t) = \mathrm{Rain1}(t) - \mathrm{Rain1}(t-12)$$

$$\begin{aligned} D(t) = {} & -0.26 + 0.38\,D(t-1) + 0.01\,D(t-2) - 0.004\,D(t-3) + 0.01\,D(t-4) + 0.02\,D(t-5) \\ & - 0.08\,D(t-6) + 0.07\,D(t-7) - 0.02\,D(t-8) - 0.05\,D(t-9) - 0.04\,D(t-10) \\ & - 0.02\,D(t-11) - 0.68\,D(t-12) + 0.46\,D(t-13) + 9.08\,D_1(t) + e(t) \end{aligned} \qquad (4)$$

By t-test, the coefficient for rain is significantly different from zero. Among the runoff terms, only the coefficients at lag 1, lag 12 and lag 13 are significantly different from zero, which suggests that runoff at these lags has much more influence on current runoff than runoff at other lags. Fig9. gives the diagnostic plots and the fitted values against the observations; including the rainfall data greatly improves our ability to predict the runoff. However, the base period has only 132 points, of which 120 take part in the fit, so I did not hold out any data points for a prediction test. For the change period, a simple regression model was fitted to the runoff and rain data:

$$D(t) = \mathrm{Runoff2}(t) - \mathrm{Runoff2}(t-12)$$

$$D_2(t) = \mathrm{Rain2}(t) - \mathrm{Rain2}(t-12)$$

$$D(t) = -0.30 + 5.87\,D_2(t) + e(t) \qquad (5)$$

The fitted values against the observations for the change period are given in Fig10. As expected, this kind of simple regression model can hardly reproduce the runoff process. The output of summary.lm shows that the adjusted R-squared is only 0.1217, i.e., rainfall explains at most about 12% of the variation in runoff. Other variables are clearly needed to improve the simulation.
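For reference, the adjusted R-squared of a one-predictor regression such as equation (5) can be computed by hand. The Python sketch below is only an illustration: the function name `simple_regression` and the five data points are made up, and it is not the summary.lm routine used for the reported value.

```python
def simple_regression(x, y):
    """Least-squares fit y = a + b * x; returns (a, b, r2, adjusted_r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(x, y))
    ss_tot = sum((v - my) ** 2 for v in y)
    r2 = 1 - ss_res / ss_tot
    # one predictor plus an intercept, so the residual degrees of freedom are n - 2
    adjusted = 1 - (1 - r2) * (n - 1) / (n - 2)
    return a, b, r2, adjusted


# Made-up data, chosen so that the arithmetic is easy to follow.
a, b, r2, adj = simple_regression([0, 1, 2, 3, 4], [1, 3, 2, 5, 4])
print(round(a, 2), round(b, 2), round(r2, 2), round(adj, 2))
```

The adjustment penalizes the number of fitted parameters, so the adjusted value is always somewhat below the plain R-squared.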

Fig9. Left: diagnostic plots for the ARMAX model (4). Right: fitted values vs. the observed series for model (4)

Fig10. Left: Fitted values vs. observations by equation(5), right: QQ-plot for residuals

5. Conclusion

In this attempt to analyze and predict the runoff and rainfall processes for the particular data selected for my project, an MA(3) process was adopted to simulate the rainfall data and proved effective. Because the runoff series contains a change point, the whole period was divided into two parts, the base period and the change period, split at the change point in 1972. For the base period, which represents a natural climate influence, an AR(13) model could be fitted to the data. This implies that in a natural environment, where human interference is weak enough to be neglected, runoff generally depends on previous runoff values. For the change period, when human influence was frequent and greatly changed the internal dependence structure of runoff, only white noise remained after the seasonal cycle was removed. To improve the predictability of runoff, the possible relationship between runoff and rainfall was explored. For both periods, runoff and rainfall are related at the same time step, i.e., at lag 0 in my experiment. For the base period, introducing the rainfall data clearly improves the prediction of runoff. For the change period, however, a simple regression model gives no better estimate. The climate system is very complex and many of its processes are highly nonlinear; for example, when rainfall finds its way to ground flow through the low-pass filter of the soil (Entekhabi et al, 1996), various feedbacks and mutual interactions take place.
All of this makes climate system prediction a tough task. The simplest model currently used to predict runoff is a polynomial model with 4 variables (Richard, et al, 1999). Consequently, to better predict runoff or rainfall, or to explore the physical laws underlying them, other variables such as evaporation and temperature will be needed to improve our understanding of these fundamental and important processes.

Reference:

Bras, R.L. and Rodriguez-Iturbe, I., 1985: Random Functions and Hydrology, New York, Dover.

Richard, M. V., Ian, W., Chris, D., 1999: Regional Regression Models of Annual Streamflow for United States, J. Irrigation and Drainage Engineering, May/June, 148-157.

Entekhabi, D., Rodriguez-Iturbe, I., Castelli, F., 1996: Mutual interaction of soil moisture and atmospheric processes, J. Hydrology, 184, 3-17.

Wang, J., 2003: The spatial and temporal variability of hydrological factors of Wuding River Basin, Master Thesis, Chinese Academy of Sciences.