Modern Methods of Time-series Analysis for Environmental Systems Zhulu Lin Dept of Crop and Soil Sciences University of Georgia Athens, Georgia October 19, 2005
-1
PO4-P(µg⋅L )
Algal blooms in the Whitehall pond 1500
1000
500
0 05/23 -1
Chlorophyll a (µg⋅ L )
Orthophosphate
(a)
07/11
Chlorophyll a
(b)
200
100
0 05/23
07/11
08/30
10/19
Dissolved oxygen
(c) -1
10/19
300
15 DO (mg⋅ L )
08/30
10
5
05/23
07/11
08/30
Date (May 23 ~ October 16, 2000)
10/19
Pond ecosystem
Topics • Signal processing: – Missing data, outliers; – Signal extraction;
• Data-based Mechanistic (DBM) modeling: – Transfer function (TF) models; – State-dependent Parameter (SDP) modeling;
• Recursive parameter estimation – Time-varying parameter estimation – Model structure identification
1. Signal processing
Time series with missing data 16 Missing Raw data 14
−1
DO (mg⋅ L )
12
10
8
6
4
2 07/01
07/06
07/11
07/16
07/21 Date
07/26
07/31
08/05
Dynamic harmonic regression model
y(k) = T (k) + S(k) + e(k); R
R
i =1
i =1
k = 1,2,..., N
S (k ) = ∑ Si (k ) = ∑ {ai (k ) cos(ωi t ) + bi (k ) sin(ωi t )}
Harmonics of time series 8
Actual AR(14) Zero line
7
6
log10(P)
5
4
3
2
1
0 20
10
6.67
5 4 3.33 Period (samples/cycle)
2.86
2.5
2.22
Interpolation and smoothing 16 Smoothed Raw data 14
−1
DO (mg⋅ L )
12
10
8
6
4
2 07/01
07/06
07/11
07/16
07/21 Date
07/26
07/31
08/05
Signal extraction 15 Trend Time series Trend S.E.
DO (mg⋅ L−1)
(a) 10
5
0 07/01
07/06
07/11
07/16
07/21
07/26
07/31
08/05
07/26
07/31
08/05
Date 4 2
−1
DO (mg⋅ L )
(b)
0 −2 −4 07/01
07/06
07/11
07/16
07/21 Date
Signal extraction of the time-series of the Whitehall pond system
10 8 6 4
Chla pH*2 NH3-N NOx/3 TIN(nh3+nox)
DO Temp PO4-P TOC/100 Par
2 0 0:00 -2
Variable
Peak
Trough
20
PAR
14
2
15
Temp
20
8
10
pH
18-20
6-8
5
DO
18
6
0
NH3-N
6-8
18-20
TON
16
4
TIN
14
2
PO4-P
12
0
TOC
12
0
Chla
22-2
10-14
-5 4:00
8:00
12:00
16:00
20:00
-10
-4
-15
-6
-20
Solar radiation driving pond ecosystem 100
PAR Temp×10 DO×10 pH×100 Chlorophyll−a
Harmonics
50
0
−50 7/5
7/6
7/7
7/8
7/9 Date
7/10
7/11
7/12
Chlorophyll-a vs. nutrients 10 8
4
PO4-P (µg/L)
TOC/100 (µg/L)
TIN (µg/L)
Chla (µg/L)
3
6
2
4
1
2
0
0 8/15
8/16
8/17
8/18
8/19
-1 8/20
-2
-2
-4
-3
-6
-4
2. Data-based Mechanistic (DBM) modeling
General transfer function (TF) model B ( k , z −1 ) y (k ) = u (k − δ ) + ν (k ) −1 A( k , z ) A(k , z −1 ) = 1 + a1 (k ) z −1 + ⋅⋅⋅ + an ( k ) z − n B (k , z −1 ) = b0 (k ) + b1 (k ) z −1 + ⋅⋅⋅ + bm (k ) z − m
To determine: 1. values of n, m, δ; 2. estimates of ai(k), bi(k), i = 0, 1, 2, …, (n, m); 3. nonlinearity exhibited by ai(k), bi(k).
DBM modeling (1)
I/O data sets
Identification of linear constant parameter TF model using SRIV
Nonlinearity?
No
Best linear TF model
Yes
(2)
TVP estimation of simplest TF model using SRIV-FIS State-dependent Parameter (SDP) modeling
Select a variable from NMSS No
(3)
Correlated with TVP? Yes
(4)
Estimation of the relationship using WLS/check physical interpretation
Final nonlinear TF model
15
-1
y(k): Algal Biomass (mg⋅ L )
Nonlinearity of the pond system Output: algal biomass
(a)
10
5
0
u (k ) =
⎛ I (k ) ⎞ I (k ) exp ⎜ 1 − ⎟ IS IS ⎠ ⎝
(b)
Input: light efficiency
u(k): Light Efficiency
1
0.8
0.6
0.4
0.2
0
7/5
7/10
7/20
7/15 Date
7/25
7/30
8/4
TF w/ constant parameters Mode l
Denominato r
Numerato r
Delay
YIC
RT2
AIC
1
1
1
3
-4.9362
0.07941
8.2184
2
2
1
1
-0.6287
0.12888
8.1678
3
2
1
2
-0.4271
0.13313
8.1629
4
1
3
1
0.0996
0.11418
8.1892
5
1
2
2
0.3461
0.08168
8.2206
6
2
1
0
1.7126
0.13852
8.1567
y (k ) =
0.2475(0.4074) u (k − 3) −1 1 − 0.9989(0.0000) z
TF w/ time-varying parameter 0.8
TVP & algal biomass
TVP vs. algal biomass
b0(k|N) y(k)/15 S.E.
0.8
0.7
0.6
Estimated b0(k|N)
0.6
0.4
0.5
0.4
0.3
0.2 0.2
0 0.1
-0.2
7/5
7/10
7/20
7/15
7/25
7/30
8/4
0
0
2
4
Date
8
6 -1
y(k) (mg⋅ L )
bˆ0 (k | N ) y (k ) = u (k − 2) + v(k ) 1 − 0.9650
10
12
14
State-dependent Parameter (SDP) modeling procedure I/O data sets
Identification of linear constant parameter TF model using SRIV
Nonlinearity?
No
Best linear TF model
Yes TVP estimation of simplest TF model using SRIV-FIS State-dependent Parameter (SDP) modeling
Select a variable from NMSS No Correlated with TVP? Yes Estimation of the relationship using WLS/check physical interpretation
Final nonlinear TF model
SDP modeling 0.8
TVP vs. algal biomass
TVP vs. algal biomass
0.7 0.8
FIS Estimate of SDP
Estimated b0(k|N)
0.6
0.5
0.6
Sorted in ascending order
0.4
0.3
0.4
0.2
0.2 0
0.1
0
-0.2
0
2
4
8
6 -1
y(k) (mg⋅ L )
10
12
14
0
2
4
6 *
8 -1
y (k) (mg⋅ L )
10
12
14
SDP modeling (cont’d) FIS Estimate of SDP
0.8
0.8
0.6
0.4
Model I
0.2
-0.2
0.4
*
14
12
10
8
6
4
2
0
-1
y (k) (mg⋅ L )
0.2
1.5
b0(1-82|N)
(a)
0
-0.2
0
2
4
6 *
8
10
12
14
1
b0 =0.1167y
0.5
0
-1
y (k) (mg⋅ L )
4
3
2
1
7
6
5
8
10
9
11
y(1-82) (mg/L) 1.5
(b)
b0(82-432|N)
FIS Estimate of SDP
0
0.6
1
0.5
b0 =0.0806y
0
-0.5
0
2
4
6
8
y(82-432) (mg/L)
10
12
14
Model II
Nonlinear TF models 14
Algal biomass
12
-1
u '(k ) = µθ T ( k ) − 20 y (k ) f ( I )
aˆ1 = −1.720, aˆ2 = 0.755; bˆ0 = 1.581 µˆ = 0.056,θˆ = 1.0166
8
6
4
2
0
7/5
7/10
u '(k ) = µ2θ
T ( k ) − 20
y (k ) f ( I ), k ≥ 82.
7/20
7/25
Algal biomass
12
7/30
-1
8
6
4
µˆ1 = 0.117, µˆ 2 = 0.0806,θˆ = 1.012
2
0
8/4
Observations Forecasting Simulation
10 Algal Biomass(mg⋅L )
Model II 9
7/15
Date
14
1.903 y (k ) = u '(k − 2) −1 1 − 0.925 z u '(k ) = µ1θ T ( k ) − 20 y (k ) f ( I ), k < 82,
Observations Forecasting Simulation
10
Algal Biomass (mg⋅ L )
Model I
b0 y (k ) = u '(k − 1) 1 + a1 z −1 + a2 z −2
07/03
07/08
07/13
07/18 Date
07/23
07/28
08/02
3. Recursive parameter estimation
Parameter estimation schemes Batch (off-line) scheme
ˆ0 α t0
t1
t2
tN ˆ1 α
Parameter Estimate Update
Recursive (on-line) scheme ˆ 0( t0 ) α
t0
ˆ 0( t1 ) α
t1
ˆ ( t2 ) α 0
t2
ˆ 0( t N ) α
tN ˆ 1( t0 ) α
Recursive estimation αˆ (tk ) = αˆ (tk −1 ) + K (tk )[ y (tk ) − yˆ (tk −1 )] Update Estimate
Previous Estimate
Weighting Factor
Prediction Error
Two applications: 1. Model structure identification 2. Time-varying parameter (TVP) estimation
1. Model structure identification A model is an approximate of real system! x1(t)
Model:
u(t)
x2(t)
α
x3(t)
y(t)
x1(t) System:
u(t) x2(t)
y(t) x3(t)
Structural difference
Model:
x1(t) u(t)
⍺α(t) x3(t)
x2(t)
x1(t) System:
x2(t)
x4(t)
x3(t)
y(t)
Recursive estimation identifies the structural difference Recursive estimation algorithm
x1(t) [u(t), y(t)] Observations
Model x2(t)
α
x3(t)
Pond system conceptualization (1) (15)
(2) (3)
DO
Algae
(5)
Duckweed
(4)
CO2
(6)
(7) (9)
PO4−P (8) (13)
16
14
HCO3−
(11) (10)
(12)
Simulated Observed
(15) (16)
OP
12
-1
OH− Alk
(14) DO(mg⋅L )
CO32−
10
8
k6
6
4
Duckweed 2
07/03
07/08
07/13
07/18
Date
07/23
07/28
k5(t)
DO
08/02
k4
k1 k2
Algae
k3(t)
RPE estimation of states 15
Algal biomass
(a)
Prediction Observations
x1(t)(mg/L)
10
5
0 07/01
07/06
07/11
07/21
07/16
07/26
07/31
08/05
20
DO
(b)
Date
x2(t)(mg/L)
15
Prediction Observations
10
5
0 07/01
07/6
07/11
07/16
07/21 Date
07/26
07/31
08/05
RPE estimation of parameters 1.5
3
1.5 07/01 3
2
-1
07/21
1
Grazing/settling
07/11
07/21
Aerobic reaction/SOD
07/11
07/21
07/31
Other sources of DO
0
-0.5 07/01
07/11
07/21
Duckweed
-1
-1
k5(mg⋅L ⋅d )
4
Date
2
0
-2 07/01
07/31
0.5
0
6
1
0 07/01
07/31
-1
-1 -1
07/11
1
-1 07/01
Respiration/death rate
0.5
2
k4(d )
-1
k1(d )
2.5
k3(mg⋅L ⋅d )
k2(d )
Maximum specific growth rate
07/11
07/21
07/31
07/31
Improved model structure Recursive estimation (1)
(2) (3)
(15)
DO
(5) Algae
Duckweed
CO2
(4)
(6)
(7) HCO3−
(9)
PO4−P (8)
(11) (10)
(12)
CO32−
(13) OP
(15) OH−
(16) (14)
Alk
Simulation results Algal biomass
10
-1
5
0
07/03
07/08
07/13
07/18
07/23
07/28
DO
15
Simulated Observed
DO(mg⋅L )
-1
Algae(mg⋅L )
15
10
5
0
08/02
07/03
07/08
07/13
Date 11
Duckweed biomass
8
07/18
07/23
07/28
08/02
Date
pH
Simulated
Simulated Observed
10
pH
6
9
4 -3
2
07/13
07/18
07/23
07/28 -1
07/08
OP(mg⋅L )
07/03
Date
7
8
x 10
Organic-P
Simulated
7
08/02
07/03
5 4 3
07/03
07/08
07/13
07/18
07/23
07/28
08/02
Date 2
PO4-P
1.5
Simulated Observed
1 0.5 0 -0.5
07/08
07/13
07/18
Date
6
-1
0
8
PO4-P(mg⋅L )
-1
Duckweed(mg⋅L )
10
Simulated Observed
07/03
07/08
07/13
07/18
Date
07/23
07/28
08/02
07/23
07/28
08/02
2. Time-varying parameter estimation Involving 2 steps, 2 models: Step 1: predicting parameter estimates from parameter model, usually GRW
αˆ (tk | tk −1 ) = Fαˆ (tk −1 | tk −1 ) Step 2: correcting parameter prediction from Step 1 by state variable model, DBM or TBM
αˆ (tk | tk ) = αˆ (tk | tk −1 ) + K (tk )[ y (tk ) − yˆ (tk | tk −1 )]
Signal processing revisited: Interpolation and smoothing of time series 4500 4000 3500
PO3− (µ g/l) 4
3000 2500 2000 1500 1000 500 0 0
5
10
15
20 Day
25
30
35
Dynamic harmonic regression model State model:
y (k ) = T (k ) + S (k ) + e(k ); R
R
i =1
i =1
k = 1, 2,..., N
S (k ) = ∑ Si (k ) = ∑ {ai (k ) cos(ωi t ) + bi (k ) sin(ωi t )}
TVPs:
T (k ), ai (k ) 's, bi (k ) 's
TVP model: ⎛ T (k ) ⎞ ⎛ T (k − 1) ⎞ ⎜ ⎟ ⎜ ⎟ = − a ( k ) a ( k 1) F i i ⎜ ⎟ ⎜ ⎟ + Gξ, ξ ~ (0, Σ) ⎜ b (k ) ⎟ ⎜ b (k − 1) ⎟ ⎝ i ⎠ ⎝ i ⎠
Harmonics (determine R) Actual AR(27) Zero line
6
5
log10(P)
4
3
2
1
0 20
10
6.67
5 4 3.33 Period (samples/cycle)
2.86
2.5
2.22
Interpolation and smoothing 4500 4000 3500
2500
4
PO3− (µ g/l)
3000
2000 1500 1000 500 0 0
5
10
15
20 Day
25
30
35
Signal extraction 1000
Trend 500
0
0
5
10
15
20
25
30
35
25
30
35
25
30
35
500
Double daily oscillation 0
−500
0
5
10
15
20
400
Daily oscillation
200 0 −200 −400
0
5
10
15
20
Day