Analyzing Periodic Data: Statistical Perspective
C.N.R. Rao Lecture
By Debasis Kundu
Department of Mathematics & Statistics, Indian Institute of Technology Kanpur
March 06, 2013
Outline
1. Introduction
2. Basic Formulation
3. Preliminaries
4. Different Estimation Procedures
5. Higher Dimensional Model
1. Introduction
We observe periodic phenomena every day in our lives. For example, the number of tourists visiting the famous Taj Mahal, the daily temperature of Delhi, and the ECG data of a normal human being all clearly follow a periodic pattern. Sometimes the data are not exactly periodic but only nearly periodic. Our aim is to analyze such periodic or nearly periodic data.
Questions
1. What is periodic data?
2. Why do we care to analyze it?
What is periodic data?
We do not give a formal definition. Informally speaking, periodic data
1. shows a repeated (periodic) pattern in one dimension;
2. shows a symmetric (periodic) pattern in higher dimensions.
Why do we want to analyze it?
1. Theoretical reasons.
2. Prediction purposes.
3. Compression purposes.
Example: Airline Passengers Data
[Figure: airline passengers data, x(t) plotted against t.]
Example: Brightness of Variable Star Data
[Figure: brightness of a variable star, y(t) plotted against t.]
Example: Vowel Sound Data 'uuu'
[Figure: vowel sound 'uuu', y(t) plotted against t.]
Example: ECG Data of a Normal Human
[Figure: original ECG signal, y(m) plotted against m.]
Example: Two-Dimensional Periodic Data
[Figure: example of two-dimensional periodic data.]
Example: Three-Dimensional Periodic Data
[Figures: examples of three-dimensional periodic data.]
2. Basic Formulation
Simplest Periodic Function
The simplest periodic function is the sinusoidal function, which can be written in the following form:
\[ y(t) = A \cos(\omega t) + B \sin(\omega t). \]
The period of $y(t)$ is the shortest time taken for $y(t)$ to repeat itself, and it is $2\pi/\omega$.
Smooth Periodic Function
In general, a smooth (mean adjusted) periodic function with period $2\pi/\omega$ can be written in the form
\[ y(t) = \sum_{k=1}^{\infty} \left[ A_k \cos(\omega k t) + B_k \sin(\omega k t) \right], \]
which is well known as the Fourier expansion of $y(t)$.
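Extracting Parameters
From $y(t)$, $A_k$ and $B_k$ can be obtained uniquely:
\[ \int_0^{2\pi/\omega} \cos(j\omega t)\, y(t)\, dt = \begin{cases} \dfrac{\pi A_j}{\omega} & \text{if } j \ge 1, \\[1ex] \dfrac{2\pi A_0}{\omega} & \text{if } j = 0, \end{cases} \]
and
\[ \int_0^{2\pi/\omega} \sin(j\omega t)\, y(t)\, dt = \frac{\pi B_j}{\omega}. \]
Here $A_0$ denotes the constant term of the expansion, which is zero when $y(t)$ is mean adjusted.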
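The extraction formulas can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name `fourier_coefficient`, the test signal and the value of `omega` are all assumptions chosen for illustration, and the integral is approximated with a midpoint rule.

```python
import math

def fourier_coefficient(y, omega, j, kind, steps=20000):
    """Approximate (omega/pi) * integral over one period of trig(j*omega*t) * y(t).

    For j >= 1 this recovers A_j (kind="cos") or B_j (kind="sin").
    """
    period = 2.0 * math.pi / omega
    h = period / steps
    trig = math.cos if kind == "cos" else math.sin
    # Midpoint rule over one full period [0, 2*pi/omega].
    total = sum(trig(j * omega * (i + 0.5) * h) * y((i + 0.5) * h)
                for i in range(steps))
    return (omega / math.pi) * total * h

# Hypothetical mean-adjusted signal with A_1 = 2, B_1 = -1.5, A_3 = 0.7.
omega = 0.5

def y(t):
    return (2.0 * math.cos(omega * t) - 1.5 * math.sin(omega * t)
            + 0.7 * math.cos(3 * omega * t))

A1 = fourier_coefficient(y, omega, 1, "cos")
B1 = fourier_coefficient(y, omega, 1, "sin")
A3 = fourier_coefficient(y, omega, 3, "cos")
print(A1, B1, A3)
```

The recovered values should agree with the coefficients used to build the signal, up to numerical integration error.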
Noisy Periodic Function
Most of the time $y(t)$ is corrupted with noise, so we observe the following:
\[ y(t) = \sum_{k=1}^{\infty} \left[ A_k \cos(\omega k t) + B_k \sin(\omega k t) \right] + X(t), \]
where $X(t)$ is the noise component.
Practical Model
It is impossible to estimate an infinite number of parameters. Hence the model is approximated by the following model:
\[ y(t) = \sum_{k=1}^{p} \left[ A_k \cos(\omega_k t) + B_k \sin(\omega_k t) \right] + X(t), \]
for some $p < \infty$.
Model
The model has two components:
1. Deterministic component
2. Random component
Aim
The aim is to extract (estimate) the deterministic component $\mu(t)$, where
\[ \mu(t) = \sum_{k=1}^{p} \left[ A_k \cos(\omega_k t) + B_k \sin(\omega_k t) \right], \]
in the presence of the random error component $X(t)$, based on the available data $y(t)$, $t = 1, \ldots, N$.
Problem Formulation
Based on the available data $\{y(t);\ t = 1, \ldots, N\}$:
1. Deterministic component
   - Determine (estimate) $p$.
   - Determine (estimate) $A_1, \ldots, A_p, B_1, \ldots, B_p$.
   - Determine (estimate) $\omega_1, \ldots, \omega_p$.
2. Random component
   - Estimate $X(t)$.
Procedure
1. Assume a certain structure on $X(t)$.
2. Estimate the deterministic component $\mu(t)$.
3. Estimate the error $X(t)$.
4. Verify the assumption.
5. If the assumption is satisfied, stop the process; otherwise go back to step 1.
3. Preliminaries
Random Variable
X is called a (continuous) random variable if it takes values in a given range with a certain probability, which can be written as follows:
\[ P(a < X < b) = \int_a^b f(x)\, dx. \]
The function $f(x)$ is known as the probability density function of X.
Gaussian Random Variable
X is called a Gaussian random variable with mean 0 and variance 1 if
\[ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad -\infty < x < \infty. \]
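As a quick numerical illustration of the two definitions above, the sketch below integrates the standard Gaussian density with a midpoint rule. The helper names `gaussian_pdf` and `probability` are assumptions made for this example only.

```python
import math

def gaussian_pdf(x):
    """Density of a Gaussian random variable with mean 0 and variance 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def probability(a, b, steps=100000):
    """Midpoint-rule approximation of P(a < X < b), the integral of f over (a, b)."""
    h = (b - a) / steps
    return sum(gaussian_pdf(a + (i + 0.5) * h) for i in range(steps)) * h

total_mass = probability(-10.0, 10.0)   # should be very close to 1
central = probability(-1.96, 1.96)      # should be close to 0.95
print(total_mass, central)
```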
Gaussian PDF
[Figure: probability density function of the standard Gaussian, plotted for x between -4 and 4.]
Linear Equations
Suppose we want to solve the following linear equation:
\[ A x = b, \]
where $A$ is an $m \times n$ matrix (with $m > n$), $x$ is an $n \times 1$ vector and $b$ is an $m \times 1$ vector. When $A$ has full column rank, the least squares solution is
\[ \hat{x} = \left( A^T A \right)^{-1} A^T b. \]
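A small worked example may help. The sketch below solves the normal equations by hand for the special case of two columns; the function name `least_squares_2col` and the data are hypothetical, and in practice a library routine such as `numpy.linalg.lstsq` would normally be preferred.

```python
def least_squares_2col(A, b):
    """Least squares solution (A^T A)^{-1} A^T b for an m x 2 matrix A (list of rows)."""
    # Entries of the symmetric 2 x 2 matrix A^T A.
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    # Entries of the 2 x 1 vector A^T b.
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("A does not have full column rank")
    # Explicit inverse of the 2 x 2 matrix applied to A^T b.
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical overdetermined system: fit b ~ x0 + x1 * s at s = 0, 1, 2, 3.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.1, 2.9, 5.2, 6.8]
x0, x1 = least_squares_2col(A, b)
print(x0, x1)
```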
Non-Linear Equation: Newton-Raphson
We want to solve the following non-linear equation:
\[ f(x) = 0. \]
Suppose $\hat{x}$ is a solution, i.e. $f(\hat{x}) = 0$. Starting from an initial guess $x^{(0)}$, the Newton-Raphson iteration is
\[ x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}. \]
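The iteration can be sketched in a few lines. The function name `newton_raphson`, the stopping rule and the example equation $x^2 - 2 = 0$ are illustrative assumptions, not taken from the lecture.

```python
def newton_raphson(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        d = fprime(x)
        if d == 0:
            raise ZeroDivisionError("derivative vanished; Newton-Raphson step undefined")
        x_next = x - f(x) / d
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical example: solve x^2 - 2 = 0 starting from x = 1.
root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```

The method depends on the starting value: a poor initial guess can lead to divergence or to a different root.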
4. Different Estimation Procedures
Periodogram Estimators
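The most widely used estimation procedure is the periodogram estimator. The periodogram at a particular frequency $\theta$ is defined as
\[ I(\theta) = \left( \frac{1}{N} \sum_{t=1}^{N} y(t) \cos(\theta t) \right)^2 + \left( \frac{1}{N} \sum_{t=1}^{N} y(t) \sin(\theta t) \right)^2 \approx \left( \frac{1}{N} \sum_{t=1}^{N} \mu(t) \cos(\theta t) \right)^2 + \left( \frac{1}{N} \sum_{t=1}^{N} \mu(t) \sin(\theta t) \right)^2. \]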
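A minimal sketch of the periodogram estimator follows, assuming a single sinusoid in Gaussian noise. The simulated signal, the seed, the noise level and the grid over $(0, \pi)$ are all assumptions made for illustration; the frequency estimate is simply the grid point where $I(\theta)$ is largest.

```python
import math
import random

def periodogram(y, theta):
    """I(theta) for data y(1), ..., y(N), following the definition above."""
    N = len(y)
    c = sum(v * math.cos(theta * t) for t, v in enumerate(y, start=1)) / N
    s = sum(v * math.sin(theta * t) for t, v in enumerate(y, start=1)) / N
    return c * c + s * s

# Hypothetical data: one sinusoid at frequency 0.2*pi plus Gaussian noise.
random.seed(0)
N = 100
true_freq = 0.2 * math.pi
y = [3.0 * (math.cos(true_freq * t) + math.sin(true_freq * t)) + random.gauss(0.0, 0.7)
     for t in range(1, N + 1)]

# Evaluate I(theta) on a grid over (0, pi) and take the location of the peak.
grid = [math.pi * i / 2000 for i in range(1, 2000)]
peak_freq = max(grid, key=lambda th: periodogram(y, th))
print(peak_freq, true_freq)
```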
Periodogram Estimator
Consider the following sinusoidal signal (Sinusoidal Example 1):
\[ y(t) = 3.0\,(\cos(0.2\pi t) + \sin(0.2\pi t)) + 3.0\,(\cos(0.5\pi t) + \sin(0.5\pi t)) + X(t). \]
Here the $X(t)$'s are i.i.d. $N(0, 0.5)$.
Examples: Sinusoidal Signal
[Figure: plot for Sinusoidal Example 1.]
Periodogram Estimator
Consider the following sinusoidal signal (Sinusoidal Example 2):
\[ y(t) = 3.0\,(\cos(0.2\pi t) + \sin(0.2\pi t)) + 0.25\,(\cos(0.5\pi t) + \sin(0.5\pi t)) + X(t). \]
Here the $X(t)$'s are i.i.d. $N(0, 2.0)$.
Examples: Sinusoidal Signal
[Figure: plot for Sinusoidal Example 2.]
Least Squares Estimators
Assuming $p$ is known, the most natural estimators are the least squares estimators. They are obtained by minimizing
\[ \sum_{t=1}^{N} \left( y(t) - \sum_{k=1}^{p} \left[ A_k \cos(\omega_k t) + B_k \sin(\omega_k t) \right] \right)^2 \]
with respect to the unknown parameters.
Numerical Issues
1. It is a highly non-linear problem. The least squares surface has several local minima.
2. The standard Newton-Raphson algorithm often fails to converge.
3. Even when it converges, it often converges to a local minimum rather than the global minimum.
4. If p is large, it becomes a high-dimensional optimization problem, and extremely accurate initial guesses are required for any iterative procedure to work well.
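The first issue can be illustrated for p = 1. For a fixed frequency the amplitudes enter linearly, so the least squares criterion can be reduced to a function of the frequency alone; the sketch below evaluates that function on a grid and counts its local minima. The simulated data, the seed and the helper name `profile_rss` are assumptions for illustration only.

```python
import math
import random

def profile_rss(y, w):
    """Residual sum of squares after fitting the best (A, B) for a fixed frequency w."""
    ts = range(1, len(y) + 1)
    c = [math.cos(w * t) for t in ts]
    s = [math.sin(w * t) for t in ts]
    cc = sum(u * u for u in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    cy = sum(u * v for u, v in zip(c, y))
    sy = sum(u * v for u, v in zip(s, y))
    det = cc * ss - cs * cs
    A = (ss * cy - cs * sy) / det   # solution of the 2 x 2 normal equations
    B = (cc * sy - cs * cy) / det
    return sum((v - A * u - B * x) ** 2 for v, u, x in zip(y, c, s))

# Hypothetical data: one sinusoid at frequency 0.2*pi plus Gaussian noise.
random.seed(1)
N = 100
true_freq = 0.2 * math.pi
y = [3.0 * (math.cos(true_freq * t) + math.sin(true_freq * t)) + random.gauss(0.0, 0.7)
     for t in range(1, N + 1)]

grid = [math.pi * i / 1000 for i in range(1, 1000)]
values = [profile_rss(y, w) for w in grid]
local_minima = [grid[i] for i in range(1, len(grid) - 1)
                if values[i] < values[i - 1] and values[i] < values[i + 1]]
global_min = grid[values.index(min(values))]
print(len(local_minima), global_min)
```

Even with a single component the criterion has many local minima, so an iterative method started far from the true frequency can easily stop at the wrong one.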
Least Absolute Deviation Estimators
Assuming $p$ is known, the LAD estimators can be obtained by minimizing
\[ \sum_{t=1}^{N} \left| y(t) - \sum_{k=1}^{p} \left[ A_k \cos(\omega_k t) + B_k \sin(\omega_k t) \right] \right| \]
with respect to the unknown parameters.
Sequential Estimation Procedures
The procedure is based on the fact that the components are orthogonal, and it works as follows. First minimize
\[ \sum_{t=1}^{N} \left( y(t) - A \cos(\omega t) - B \sin(\omega t) \right)^2 \]
with respect to $A$, $B$ and $\omega$. Then take out their effect from $y(t)$, i.e. consider
\[ \tilde{y}(t) = y(t) - \hat{A} \cos(\hat{\omega} t) - \hat{B} \sin(\hat{\omega} t). \]
Repeat the procedure p times, each time replacing $y(t)$ by $\tilde{y}(t)$.
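The sequential procedure can be sketched as below. This is an illustrative implementation under stated assumptions, not the lecture's own code: each one-component fit uses a grid search over the frequency with successive refinement (a choice made here for simplicity), the helper names are hypothetical, and the data imitate Sinusoidal Example 1, reading 0.5 as the noise variance.

```python
import math
import random

def fit_one_sinusoid(y, grid_size=500, refinements=3):
    """Least squares fit of A*cos(w*t) + B*sin(w*t) to y(1..N); returns (A, B, w)."""
    ts = range(1, len(y) + 1)

    def amplitudes(w):
        # For fixed w the problem is linear in (A, B): solve the 2 x 2 normal equations.
        c = [math.cos(w * t) for t in ts]
        s = [math.sin(w * t) for t in ts]
        cc = sum(u * u for u in c)
        ss = sum(v * v for v in s)
        cs = sum(u * v for u, v in zip(c, s))
        cy = sum(u * v for u, v in zip(c, y))
        sy = sum(u * v for u, v in zip(s, y))
        det = cc * ss - cs * cs
        A = (ss * cy - cs * sy) / det
        B = (cc * sy - cs * cy) / det
        rss = sum((v - A * u - B * x) ** 2 for v, u, x in zip(y, c, s))
        return A, B, rss

    lo, hi = 0.0, math.pi
    w = None
    for _ in range(refinements):
        # Grid search on the open interval (lo, hi), then zoom in around the best point.
        step = (hi - lo) / grid_size
        w = min((lo + step * i for i in range(1, grid_size)),
                key=lambda f: amplitudes(f)[2])
        lo, hi = w - step, w + step
    A, B, _ = amplitudes(w)
    return A, B, w

def sequential_fit(y, p):
    """Fit one sinusoid, subtract it, and repeat p times."""
    residual = list(y)
    estimates = []
    for _ in range(p):
        A, B, w = fit_one_sinusoid(residual)
        estimates.append((A, B, w))
        residual = [v - A * math.cos(w * t) - B * math.sin(w * t)
                    for t, v in enumerate(residual, start=1)]
    return estimates, residual

# Data imitating Sinusoidal Example 1 (assumption: 0.5 is the noise variance).
random.seed(2)
N = 100
y = [3.0 * (math.cos(0.2 * math.pi * t) + math.sin(0.2 * math.pi * t))
     + 3.0 * (math.cos(0.5 * math.pi * t) + math.sin(0.5 * math.pi * t))
     + random.gauss(0.0, math.sqrt(0.5))
     for t in range(1, N + 1)]
estimates, residual = sequential_fit(y, p=2)
for A, B, w in estimates:
    print(A, B, w)
```

Each stage is a search over a single frequency, which is what keeps the computation light compared with a joint search over all frequencies.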
Advantage
It reduces the computational burden significantly. For example, if p = 25, instead of solving a 25-dimensional optimization problem, we need to solve 25 one-dimensional optimization problems.
It does not suffer from problems with the initial guess or with convergence.
It produces the same accuracy as the least squares estimators.
Super Efficient Estimators
When p = 1, the Newton-Raphson algorithm is of the following form, where $Q$ is the criterion function being optimized:
\[ \omega^{(j+1)} = \omega^{(j)} - \frac{Q'(\omega^{(j)})}{Q''(\omega^{(j)})}. \]
It has been suggested to use instead
\[ \omega^{(j+1)} = \omega^{(j)} - \frac{1}{4}\, \frac{Q'(\omega^{(j)})}{Q''(\omega^{(j)})}. \]
Not only does it converge, it also produces estimators which are better than the least squares estimators.
Main Theoretical Results
1. Least squares estimators are consistent under mild assumptions on the errors.
2. Least squares estimators have the convergence rate $N^{-3/2}$.
3. Sequential estimators have the same convergence rate as the least squares estimators.
4. Asymptotic variances of the super efficient estimators are smaller than those of the least squares estimators.
5. Periodogram estimators are consistent, but they have the convergence rate $N^{-1/2}$.
Estimation of p
It is a difficult problem. We still do not have a satisfactory solution.
Possible approaches:
1. Consider the number of peaks of the periodogram function. This can be very misleading.
2. In the least squares procedure, consider the residual sums of squares. This can be very misleading too.
3. Use an information theoretic criterion.
Information Theoretic Criterion
\[ \mathrm{AIC}(k) = N \ln R_k + 2(3k), \]
\[ \mathrm{BIC}(k) = N \ln R_k + \frac{1}{2} (\ln N)(3k), \]
\[ \mathrm{EDC}(k) = N \ln R_k + C_N k. \]
Here
\[ R_k = \sum_{t=1}^{N} \left( y(t) - \sum_{j=1}^{k} \left[ \hat{A}_j \cos(\hat{\omega}_j t) + \hat{B}_j \sin(\hat{\omega}_j t) \right] \right)^2. \]
Choose the model for which AIC(k), BIC(k) or EDC(k) is minimum.
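Once the residual sums of squares are available for each candidate number of components, the criteria are straightforward to evaluate. In the sketch below the function name `information_criteria`, the residual values in `rss_by_k` and the EDC penalty `c_n` are all hypothetical, chosen only to show the mechanics of the selection.

```python
import math

def information_criteria(rss, k, n_obs, c_n):
    """Return (AIC, BIC, EDC) for a k-component model with residual sum of squares rss."""
    base = n_obs * math.log(rss)
    aic = base + 2 * (3 * k)                       # 3 parameters per component
    bic = base + 0.5 * math.log(n_obs) * (3 * k)
    edc = base + c_n * k                           # c_n is a user-chosen penalty
    return aic, bic, edc

# Hypothetical residual sums of squares for k = 1, ..., 4 components, sample size 100.
n_obs = 100
rss_by_k = {1: 950.0, 2: 48.0, 3: 46.5, 4: 45.9}
c_n = math.sqrt(n_obs)  # illustrative choice of the EDC penalty

scores = {k: information_criteria(r, k, n_obs, c_n) for k, r in rss_by_k.items()}
best_aic = min(scores, key=lambda k: scores[k][0])
best_bic = min(scores, key=lambda k: scores[k][1])
best_edc = min(scores, key=lambda k: scores[k][2])
print(best_aic, best_bic, best_edc)
```

With these illustrative numbers the large drop in the residual sum of squares from one to two components outweighs the penalty, while the small further drops do not.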
5. Higher Dimensional Model
Two-Dimensional Model
\[ y(m, n) = \sum_{k=1}^{p} \left[ A_k \cos(\theta_k m + \omega_k n) + B_k \sin(\theta_k m + \omega_k n) \right] + X(m, n), \]
for some $p < \infty$.
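To make the two-dimensional model concrete, the sketch below generates data from it. The function name `simulate_two_dim`, the parameter values and the choice of i.i.d. Gaussian noise for $X(m, n)$ are assumptions made for illustration.

```python
import math
import random

def simulate_two_dim(M, N, components, noise_sd, rng):
    """Return an M x N nested list with entries y(m, n), m = 1..M, n = 1..N.

    components is a list of (A, B, theta, omega) tuples, one per sinusoidal term.
    """
    data = []
    for m in range(1, M + 1):
        row = []
        for n in range(1, N + 1):
            mu = sum(A * math.cos(th * m + om * n) + B * math.sin(th * m + om * n)
                     for A, B, th, om in components)
            row.append(mu + rng.gauss(0.0, noise_sd))  # assumed i.i.d. Gaussian noise
        data.append(row)
    return data

# Hypothetical two-component example on a 20 x 30 grid.
rng = random.Random(0)
components = [(2.0, 1.0, 0.4, 1.1), (0.5, -0.5, 1.3, 0.2)]
y = simulate_two_dim(20, 30, components, noise_sd=0.5, rng=rng)
print(len(y), len(y[0]))
```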
Three-Dimensional Model
\[ y(m, n, s) = \sum_{k=1}^{p} \mu_k(m, n, s) + X(m, n, s), \]
for some $p < \infty$, where
\[ \mu_k(m, n, s) = A_k \cos(\theta_k m + \omega_k n + \lambda_k s) + B_k \sin(\theta_k m + \omega_k n + \lambda_k s). \]
Thank You