Moment Generating Functions and Their Properties

NCSSM Statistics Leadership Institute Notes

The Theory of Inference

The $i$th moment of a random variable $Y$ is defined to be $E(Y^i) = \mu_i'$. So the expected value, or mean, of $Y$, $E(Y)$, is the first moment $E(Y^1)$. The expected value of $Y^2$, $E(Y^2)$, which can be used to find the variance of $Y$, is the second moment. The moment generating function $m(t)$ for a random variable $Y$ is defined to be $E(e^{tY})$, where $t$ is in a small neighborhood of zero. So

$$m(t) = E(e^{tY}) = E\left(1 + tY + \frac{(tY)^2}{2!} + \frac{(tY)^3}{3!} + \cdots\right),$$

since the series expansion for $e^{tY}$ is $e^{tY} = 1 + tY + \dfrac{(tY)^2}{2!} + \dfrac{(tY)^3}{3!} + \cdots$. Also,

$$m'(t) = E\left(Y + \frac{2tY^2}{2!} + \frac{3t^2Y^3}{3!} + \cdots\right).$$

Setting $t = 0$, we have $m'(0) = E(Y + 0 + 0 + \cdots) = E(Y)$. So the first derivative of the moment generating function evaluated at $t = 0$ is the expected value of $Y$. That is, $m'(0) = E(Y) = \mu$. Thus, if you have the moment generating function for a random variable, you can find $\mu$ by taking the first derivative and evaluating it at zero.

Now look at the second derivative:

$$m''(t) = E\left(Y^2 + \frac{3 \cdot 2\, tY^3}{3!} + \cdots\right), \qquad \text{and} \qquad m''(0) = E(Y^2).$$

Since $V(Y) = E(Y^2) - [E(Y)]^2$, we can use the first and second derivatives of the moment generating function to find the variance of a random variable. Before proceeding we will verify that $V(Y) = E(Y^2) - [E(Y)]^2$:

$$\begin{aligned}
V(Y) &= E(Y - \mu)^2 = E(Y^2 - 2Y\mu + \mu^2) \\
&= E(Y^2) - E(2\mu Y) + E(\mu^2) = E(Y^2) - 2\mu[E(Y)] + \mu^2 \\
&= E(Y^2) - 2\mu(\mu) + \mu^2 = E(Y^2) - 2\mu^2 + \mu^2 \\
&= E(Y^2) - \mu^2 = E(Y^2) - [E(Y)]^2.
\end{aligned}$$

The argument given above essentially repeats the calculation for $\mathrm{Cov}(Y_i, Y_j)$ on page 7. Covariance is more general, since $\mathrm{Cov}(Y, Y) = \mathrm{Var}(Y)$. Note that if $Y$ is a discrete random variable,

$$\begin{aligned}
m_Y(t) = E(e^{tY}) = \sum_y e^{ty} p(y) &= \sum_y p(y)\left[1 + ty + \frac{(ty)^2}{2!} + \cdots\right] \\
&= \sum_y p(y) + \sum_y (ty)\, p(y) + \frac{t^2}{2!} \sum_y y^2 p(y) + \cdots \\
&= 1 + t \sum_y y\, p(y) + \frac{t^2}{2!} \sum_y y^2 p(y) + \cdots \qquad \text{since } t \text{ is constant in each sum} \\
&= 1 + t\, E(Y) + \frac{t^2}{2!}\, E(Y^2) + \cdots \\
&= 1 + t\, \mu_1' + \frac{t^2}{2!}\, \mu_2' + \cdots,
\end{aligned}$$

as we would expect, since this is a moment generating function.

Example 1

Suppose $Y$ is a Poisson random variable with parameter $\lambda$. Then $p(y) = \dfrac{e^{-\lambda}\lambda^y}{y!}$ for $y = 0, 1, 2, \ldots$, where $\lambda$ represents the rate at which something happens. Find the moment generating function and use it to find the mean and variance of $Y$.

Solution:

First find

$$m_Y(t) = E(e^{tY}) = \sum_{y=0}^{\infty} e^{ty}\, \frac{e^{-\lambda}\lambda^y}{y!} = e^{-\lambda} \sum_{y=0}^{\infty} \frac{(e^t\lambda)^y}{y!} = e^{-\lambda}\, e^{e^t\lambda} \qquad \text{since } \sum_{y=0}^{\infty} \frac{(e^t\lambda)^y}{y!} \text{ is the power series for } e^{e^t\lambda}$$

$$= e^{e^t\lambda - \lambda} = e^{\lambda(e^t - 1)}.$$

$$\therefore\; m_Y(t) = e^{\lambda(e^t - 1)}.$$

Find derivatives of $m_Y(t)$ to use for computing $\mu$ and $\sigma^2$:

$$m'(t) = e^{\lambda(e^t - 1)} \cdot \lambda e^t, \qquad E(Y) = \mu = m'(0) = e^{\lambda \cdot 0}\, \lambda e^0 = \lambda$$

$$m''(t) = e^{\lambda(e^t - 1)}\lambda e^t + \lambda e^t\, e^{\lambda(e^t - 1)}\lambda e^t$$

$$E(Y^2) = m''(0) = e^0 \lambda e^0 + \lambda e^0 e^0 \lambda e^0 = \lambda + \lambda^2$$

$$V(Y) = \sigma^2 = E(Y^2) - [E(Y)]^2 = (\lambda + \lambda^2) - \lambda^2 = \lambda.$$

The Poisson distribution has mean and variance of $\lambda$.
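A hedged computational check (an aside assuming Python with sympy, not part of the original notes): differentiating $e^{\lambda(e^t-1)}$ twice and evaluating at zero should reproduce the mean and variance found above.

```python
# Sketch: differentiate the Poisson mgf m(t) = exp(lambda*(e^t - 1))
# and confirm that the mean and variance are both lambda.
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
m = sp.exp(lam * (sp.exp(t) - 1))       # Poisson moment generating function

mean = sp.simplify(sp.diff(m, t).subs(t, 0))              # lambda
var = sp.simplify(sp.diff(m, t, 2).subs(t, 0) - mean**2)  # lambda
print(mean, var)
```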

Example 2

Suppose $Y$ has a geometric distribution with parameter $p$. Show that the moment generating function for $Y$ is $m(t) = \dfrac{pe^t}{1 - qe^t}$, where $q = 1 - p$.


Solution: $p(y) = q^{y-1} p$ for $y = 1, 2, \ldots$

$$\begin{aligned}
m_Y(t) = E(e^{tY}) = \sum_{y=1}^{\infty} e^{ty}\, p\, q^{y-1} &= pq^{-1} \sum_{y=1}^{\infty} (e^t q)^y \qquad \text{(a geometric series with common ratio } e^t q < 1 \text{ if } t < -\ln q) \\
&= pq^{-1} \cdot \frac{qe^t}{1 - qe^t} = \frac{pe^t}{1 - qe^t}.
\end{aligned}$$

Now that we know the moment generating function, it is a simple matter to find the mean and variance of the geometric distribution. For the mean we have

$$m'(t) = \frac{(1 - qe^t)\, pe^t - pe^t(-qe^t)}{(1 - qe^t)^2} = \frac{pe^t}{(1 - qe^t)^2}, \qquad \text{so } m'(0) = \frac{pe^0}{(1 - qe^0)^2} = \frac{p}{p^2} = \frac{1}{p}.$$

For the variance we have

$$m''(t) = \frac{(1 - qe^t)^2\, pe^t - pe^t(2)(1 - qe^t)(-qe^t)}{(1 - qe^t)^4} = \frac{pe^t(1 + qe^t)}{(1 - qe^t)^3}$$

and

$$m''(0) = \frac{pe^0(1 + qe^0)}{(1 - qe^0)^3} = \frac{p(1 + q)}{p^3} = \frac{1 + q}{p^2},$$

so

$$\mathrm{Var}(Y) = m''(0) - \left(m'(0)\right)^2 = \frac{1 + q}{p^2} - \frac{1}{p^2} = \frac{q}{p^2} = \frac{1 - p}{p^2}.$$
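The same algebra can be checked symbolically. This is an aside (assuming Python with sympy), not part of the original notes.

```python
# Sketch: confirm m'(0) = 1/p and Var(Y) = (1-p)/p^2 from the geometric mgf
# m(t) = p e^t / (1 - q e^t), with q = 1 - p.
import sympy as sp

t, p = sp.symbols('t p', positive=True)
q = 1 - p
m = p * sp.exp(t) / (1 - q * sp.exp(t))

mean = sp.simplify(sp.diff(m, t).subs(t, 0))              # 1/p
var = sp.simplify(sp.diff(m, t, 2).subs(t, 0) - mean**2)  # (1-p)/p**2
print(mean, var)
```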

Example 3

Find the moment generating function for a random variable with a standard normal distribution. That is, find the moment generating function for $Z \sim N(0, 1)$. Note that $Z \sim N(0, 1)$ is read: $Z$ is distributed as a normal random variable with $\mu = 0$ and $\sigma^2 = 1$.

Solution:

$$m_Z(t) = E(e^{tZ}) = \int_{-\infty}^{\infty} e^{tz}\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\left(\frac{z^2}{2} - tz + \frac{t^2}{2}\right) + \frac{t^2}{2}}\, dz$$

$$= e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(z - t)^2/2}\, dz \qquad \text{(normal density with } \mu = t,\ \sigma^2 = 1\text{)}$$

$$= e^{t^2/2} \cdot 1 = e^{t^2/2}.$$

It is straightforward to verify that the mean and variance are 0 and 1, respectively.
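A rough numerical sanity check (an aside assuming Python with numpy, not part of the original notes): a Monte Carlo estimate of $E(e^{tZ})$ should track the closed form $e^{t^2/2}$.

```python
# Sketch: estimate E(e^{tZ}) by simulation and compare with e^{t^2/2}.
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)

for t in (0.5, 1.0, 1.5):
    estimate = np.exp(t * z).mean()   # Monte Carlo estimate of the mgf at t
    exact = np.exp(t**2 / 2)          # m_Z(t) = e^{t^2/2}
    print(t, round(estimate, 3), round(exact, 3))
```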

Example 4

Show that the moment generating function for a random variable $Y \sim N(\mu, \sigma^2)$ is $m_Y(t) = e^{\mu t + \frac{t^2\sigma^2}{2}}$. Use the moment generating function to show that $E(Y) = \mu$.

Solution: Following the outline of Example 3, we have

$$m_Y(t) = E(e^{tY}) = \int_{-\infty}^{\infty} e^{ty}\, \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}\, dy = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-\mu)^2 - 2\sigma^2 ty}{2\sigma^2}}\, dy$$

Consider the exponent:

$$-\frac{1}{2\sigma^2}\left[(y-\mu)^2 - 2\sigma^2 ty\right] = -\frac{1}{2\sigma^2}\left[\left((y-\mu) - \sigma^2 t\right)^2 - 2\sigma^2\mu t - \sigma^4 t^2\right].$$

So

$$m_Y(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}\left[(y-\mu) - \sigma^2 t\right]^2 + \mu t + \frac{\sigma^2 t^2}{2}}\, dy$$

$$= e^{\mu t + \frac{\sigma^2 t^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}\left[(y-\mu) - \sigma^2 t\right]^2}\, dy \qquad \text{(the integrand is a normal density with mean } \mu + \sigma^2 t \text{ and variance } \sigma^2\text{)}$$

$$= e^{\mu t + \frac{\sigma^2 t^2}{2}} \cdot 1 = e^{\mu t + \frac{\sigma^2 t^2}{2}}.$$

Now to find the expected value of $Y$:

$$m'(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}\,(\mu + \sigma^2 t), \qquad m'(0) = e^0(\mu + 0) = \mu.$$

The variance of $Y$ is found by computing

$$m''(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}\,(\sigma^2) + e^{\mu t + \frac{\sigma^2 t^2}{2}}\,(\mu + \sigma^2 t)^2,$$

$$m''(0) = e^0(\sigma^2) + e^0(\mu)^2 = \sigma^2 + \mu^2.$$

Then $\mathrm{Var}(Y) = \sigma^2 + \mu^2 - \mu^2 = \sigma^2$.

A more direct derivation can also be given:

$$m_Y(t) = E(e^{tY}) = E\!\left(e^{t(\mu + \sigma Z)}\right) = E\!\left(e^{\mu t} \cdot e^{(\sigma t)Z}\right) = e^{\mu t}\, m_Z(\sigma t) = e^{\mu t}\, e^{\frac{1}{2}(\sigma t)^2} = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$$

Example 5

Find the moment generating function for a random variable with a gamma distribution.

Solution:

$$m_Y(t) = E(e^{tY}) = \int_0^{\infty} \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\, y^{\alpha - 1} e^{-y/\beta}\, e^{ty}\, dy = \frac{1}{\beta^{\alpha}\Gamma(\alpha)} \int_0^{\infty} y^{\alpha - 1} e^{-y\left(\frac{1}{\beta} - t\right)}\, dy \qquad \text{(recall that } \beta^{\alpha} \text{ and } \Gamma(\alpha) \text{ are constants)}$$

Rewrite the expression

$$\frac{1}{\beta} - t = \frac{1 - \beta t}{\beta} = \frac{1}{\left(\dfrac{\beta}{1 - \beta t}\right)},$$

so

$$m_Y(t) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)} \int_0^{\infty} y^{\alpha - 1}\, e^{-y \big/ \left(\frac{\beta}{1 - \beta t}\right)}\, dy.$$

Since we have previously shown that $\displaystyle\int_0^{\infty} y^{\alpha - 1} e^{-y/\beta}\, dy = \beta^{\alpha}\Gamma(\alpha)$,

$$m_Y(t) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)} \left(\frac{\beta}{1 - \beta t}\right)^{\alpha} \Gamma(\alpha) = \left(\frac{1}{1 - \beta t}\right)^{\alpha}, \qquad \text{if } t < \frac{1}{\beta}.$$
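A numerical spot check of this result (an aside assuming Python with numpy and scipy, using arbitrary illustrative values of $\alpha$, $\beta$, and $t$):

```python
# Sketch: integrate e^{ty} against a gamma density numerically and compare
# with the closed form (1 - beta*t)^(-alpha).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn

alpha, beta, t = 3.0, 2.0, 0.1            # any t < 1/beta works

def integrand(y):
    # e^{ty} times the gamma(alpha, beta) density
    return np.exp(t * y) * y**(alpha - 1) * np.exp(-y / beta) / (beta**alpha * gamma_fn(alpha))

numeric, _ = quad(integrand, 0, np.inf)
closed_form = (1 - beta * t) ** (-alpha)
print(numeric, closed_form)               # both approximately 1.953
```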

Recall that the chi-square distribution is a special case of the gamma distribution with $\beta = 2$ and $\alpha = \nu/2$. So it follows from Example 5 that if $Y \sim \chi^2(\nu)$, then

$$m_Y(t) = \left(\frac{1}{1 - 2t}\right)^{\nu/2}.$$

From this moment generating function we can find the mean and variance of the chi-square distribution with $\nu$ degrees of freedom.


We have

$$m_Y'(t) = -\frac{\nu}{2}(1 - 2t)^{-\frac{\nu}{2} - 1}(-2) = \frac{\nu}{(1 - 2t)^{\frac{\nu}{2} + 1}},$$

so $m'(0) = \nu$. The mean of a chi-square distribution with $\nu$ degrees of freedom is $\nu$, the degrees of freedom. Also

$$m''(t) = \frac{\nu(\nu + 2)}{(1 - 2t)^{\frac{\nu}{2} + 2}}, \qquad \text{so } m''(0) = \left.\frac{\nu(\nu + 2)}{(1 - 2t)^{\frac{\nu}{2} + 2}}\right|_{t=0} = \nu^2 + 2\nu,$$

and

$$\mathrm{Var}(Y) = \nu^2 + 2\nu - \nu^2 = 2\nu.$$

The variance is twice the degrees of freedom.

Theorem: Consider $Y_1, Y_2, \ldots, Y_n$ independent random variables. Let $W = Y_1 + Y_2 + \cdots + Y_n$. Then

$$m_W(t) = \prod_{i=1}^{n} m_{Y_i}(t).$$

Proof:

$$m_W(t) = E\!\left(e^{t\sum Y_i}\right) = E\!\left(\prod_{i=1}^{n} e^{tY_i}\right) \qquad \text{by laws of exponents}$$

$$= \prod_{i=1}^{n} E\!\left(e^{tY_i}\right) \qquad \text{since the } Y_i \text{ are independent}$$

$$= \prod_{i=1}^{n} m_{Y_i}(t).$$

Example 6

If $Y$ has a binomial distribution with parameters $(n, p)$, show that the moment generating function for $Y$ is $m(t) = \left(pe^t + q\right)^n$, where $q = 1 - p$. Then use the moment generating function to find $E(Y)$ and $V(Y)$.

Solution:

$$Y = \sum_{i=1}^{n} X_i, \quad \text{where the } X_i \text{ are independent and } X_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } q \end{cases}$$

$$m_{X_i}(t) = E(e^{tX_i}) = e^{t \cdot 1}p + e^{t \cdot 0}q = pe^t + q$$

$$m_Y(t) = m_{X_1}(t) \cdot m_{X_2}(t) \cdots m_{X_n}(t) = \prod_{i=1}^{n} (pe^t + q) = \left[pe^t + q\right]^n$$

Now we can take derivatives to find $E(Y)$ and $V(Y)$:

$$m'(t) = n(pe^t + q)^{n-1}\, pe^t, \qquad E(Y) = m'(0) = n(p + q)^{n-1}\, p = np$$

$$m''(t) = n(n-1)(pe^t + q)^{n-2}(pe^t)^2 + n(pe^t + q)^{n-1}\, pe^t$$

$$E(Y^2) = m''(0) = n(n-1)(p + q)^{n-2}p^2 + n(p + q)^{n-1}p = (n^2 - n)p^2 + np = n^2p^2 + np(1 - p)$$

$$\therefore\; V(Y) = E(Y^2) - [E(Y)]^2 = n^2p^2 + np(1 - p) - (np)^2 = np(1 - p).$$
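A symbolic cross-check of the binomial result (an aside assuming Python with sympy; the specific $n = 5$ is chosen only to keep the algebra concrete):

```python
# Sketch: build the binomial mgf as a product of identical Bernoulli mgfs and
# confirm E(Y) = np and V(Y) = np(1-p) by differentiation.
import sympy as sp

t, p = sp.symbols('t p', positive=True)
n = 5
q = 1 - p
m = (p * sp.exp(t) + q) ** n              # product of n identical Bernoulli mgfs

mean = sp.simplify(sp.diff(m, t).subs(t, 0))              # n*p
var = sp.simplify(sp.diff(m, t, 2).subs(t, 0) - mean**2)  # n*p*(1-p), up to rearrangement
print(mean, sp.factor(var))
```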

2

The Uniqueness Theorem is very important in using moment generating functions to find the probability distribution of a function of random variables. Uniqueness Theorem: Suppose that random variables X and Y have moment generating functions given by mX (t ) and mY (t ) respectively. If mX (t ) = mY (t ) for all values of t , then X and Y have the same probability distribution. The proof of the Uniqueness Theorem is quite difficult and is not given here. Example 7 Find the distribution of the random variable Y for each of the following momentgenerating functions:

F 1 IJ (a) m(t ) = G H1 − 2t K

(b) m(t ) = 1 3 et + 2 3

et (c) m(t ) = 2 − et

(d) m(t ) = e

8

b g b g

5

d i

2 et − 1

Solution:

(a) Since $m(t)$ can be written in the form $\left(\dfrac{1}{1 - 2t}\right)^{\nu/2}$ where $\nu = 16$, $Y$ must be a chi-square random variable with 16 degrees of freedom.

(b) Since $m(t)$ can be written in the form $\left(pe^t + q\right)^n$ where $p = 1/3$ and $n = 5$, $Y$ must be a binomial random variable with $n = 5$ and $p = 1/3$.

(c) Since $m(t)$ can be rewritten as $\dfrac{\frac{1}{2}e^t}{1 - \frac{1}{2}e^t}$, $Y$ must be a geometric random variable with $p = 1/2$.

(d) Since $m(t)$ can be written in the form $e^{\lambda(e^t - 1)}$, $Y$ must be a Poisson random variable with $\lambda = 2$.
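The algebraic step in part (c) can be confirmed symbolically; this is a small aside assuming Python with sympy.

```python
# Sketch: check that e^t / (2 - e^t) equals the geometric mgf with p = q = 1/2.
import sympy as sp

t = sp.symbols('t')
given = sp.exp(t) / (2 - sp.exp(t))
geometric = sp.Rational(1, 2) * sp.exp(t) / (1 - sp.Rational(1, 2) * sp.exp(t))
print(sp.simplify(given - geometric))   # 0
```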

Theorem: If $W = aY + b$, then $m_W(t) = e^{tb}\, m_Y(at)$.

Proof:

$$m_W(t) = E(e^{tW}) = E\!\left(e^{t(aY + b)}\right) = e^{bt}\, E(e^{atY}) = e^{bt}\, m_Y(at).$$

Theorem: If $W = aY + b$, then $E(W) = aE(Y) + b$ and $V(W) = a^2V(Y)$.

Proof:

$$m_W(t) = e^{bt}\, m_Y(at)$$

$$m_W'(t) = e^{bt}\, m_Y'(at) \cdot a + m_Y(at)\, e^{bt} \cdot b$$

$$m_W'(0) = m_Y'(0) \cdot a + m_Y(0) \cdot b = aE(Y) + E(e^{0 \cdot Y}) \cdot b = aE(Y) + E(1) \cdot b = aE(Y) + b$$

$$m_W''(t) = a\, e^{bt}\, m_Y''(at) \cdot a + m_Y'(at) \cdot a\, e^{bt} \cdot b + m_Y(at)\, e^{bt}\, b^2 + b\, e^{bt}\, m_Y'(at) \cdot a$$

$$m_W''(0) = a^2 m_Y''(0) + ab\, m_Y'(0) + b^2 m_Y(0) + ab\, m_Y'(0) = a^2E(Y^2) + 2abE(Y) + b^2$$

So

$$\begin{aligned}
V(W) = E(W^2) - [E(W)]^2 &= a^2E(Y^2) + 2abE(Y) + b^2 - [aE(Y) + b]^2 \\
&= a^2E(Y^2) + 2abE(Y) + b^2 - \left[a^2(E(Y))^2 + 2abE(Y) + b^2\right] \\
&= a^2E(Y^2) - a^2[E(Y)]^2 \\
&= a^2\left\{E(Y^2) - [E(Y)]^2\right\} = a^2V(Y).
\end{aligned}$$
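A small symbolic illustration of this theorem (an aside assuming Python with sympy), using the standard normal mgf as the base distribution:

```python
# Sketch: check m_W(t) = e^{bt} m_Y(at) and the resulting mean/variance,
# taking Y = Z ~ N(0,1) so that E(W) = b and V(W) = a^2.
import sympy as sp

t, a, b = sp.symbols('t a b', real=True)
mZ = sp.exp(t**2 / 2)                      # mgf of Z ~ N(0,1)
mW = sp.exp(b * t) * mZ.subs(t, a * t)     # mgf of W = aZ + b

mean = sp.simplify(sp.diff(mW, t).subs(t, 0))              # b   (= a*E(Z) + b)
var = sp.simplify(sp.diff(mW, t, 2).subs(t, 0) - mean**2)  # a**2 (= a^2*V(Z))
print(mean, var)
```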

Of course, this theorem can be proven without resorting to moment generating functions, but we present the proof to show that mgf's give the familiar results.

Theorem: If $Z \sim N(0, 1)$ and $Y = \sigma Z + \mu$, then $Y \sim N(\mu, \sigma^2)$.

Proof: We know that $m_Z(t) = e^{t^2/2}$.

$$m_Y(t) = e^{t\mu}\, m_Z(\sigma t) \qquad \text{(applying the theorem proved above)}$$

$$= e^{t\mu}\, e^{\sigma^2t^2/2} = e^{t\mu + \frac{\sigma^2 t^2}{2}} = \text{moment generating function of } N(\mu, \sigma^2).$$

$\therefore\; Y \sim N(\mu, \sigma^2)$ by the uniqueness theorem.

Theorem: If $Y_1 \sim \text{Poisson}(\lambda_1)$ and $Y_2 \sim \text{Poisson}(\lambda_2)$ and $Y_1$ and $Y_2$ are independent, then $W = Y_1 + Y_2$ is $\text{Poisson}(\lambda_1 + \lambda_2)$.

Proof:

$$m_W(t) = m_{Y_1}(t) \cdot m_{Y_2}(t) \qquad \text{since } Y_1 \text{ and } Y_2 \text{ are independent}$$

$$= e^{\lambda_1(e^t - 1)}\, e^{\lambda_2(e^t - 1)} = e^{(\lambda_1 + \lambda_2)(e^t - 1)} = \text{moment generating function of Poisson}(\lambda_1 + \lambda_2).$$

$\therefore\; W \sim \text{Poisson}(\lambda_1 + \lambda_2)$ by the uniqueness theorem.

Theorem: Let $W_1, W_2, \ldots, W_n$ be independent chi-square random variables, each with one degree of freedom. Let $W = \sum_{i=1}^{n} W_i$. Then $W \sim \chi^2_n$.

Proof:

$$m_W(t) = m_{\sum W_i}(t) = \prod_{i=1}^{n} m_{W_i}(t) = \prod_{i=1}^{n} (1 - 2t)^{-1/2} = (1 - 2t)^{-n/2} = \text{moment generating function of } \chi^2 \text{ with } n \text{ degrees of freedom}.$$

$\therefore\; W \sim \chi^2_n$ by the uniqueness theorem.

We have already considered the mean and variance of a chi-square distribution with $n$ degrees of freedom as a special case of the gamma distribution (see page 14). Now we will give a more direct proof using the chi-square moment generating function.

Theorem: If $W \sim \chi^2_n$, then $E(W) = n$ and $V(W) = 2n$.

Proof: $m_W(t) = (1 - 2t)^{-n/2}$.

$$m_W'(t) = -\frac{n}{2}(1 - 2t)^{-\frac{n}{2} - 1}(-2) = n(1 - 2t)^{-\frac{n}{2} - 1}, \qquad m'(0) = n(1 - 0)^{-\frac{n}{2} - 1} = n.$$

So $E(W) = n$.

$$m_W''(t) = \left(-\frac{n}{2}\right)\left(-\frac{n}{2} - 1\right)(1 - 2t)^{-\frac{n}{2} - 2}(-2)(-2)$$

$$m_W''(0) = 4\left(\frac{n}{2}\right)\left(\frac{n}{2} + 1\right) = 4\left(\frac{n^2}{4} + \frac{n}{2}\right) = n^2 + 2n$$

So $V(W) = E(W^2) - [E(W)]^2 = n^2 + 2n - n^2 = 2n$.

Now consider $Z \sim N(0, 1)$ and $W = Z^2$. We know the distribution of $Z$, and now we want to show that the distribution of $W$ is chi-square with one degree of freedom. There are at least two ways to do this. The method of density functions is the harder way, and the method of moment generating functions is the easier way. We will show both in the sections that follow.


Method of Density Functions: Let $F_W(w)$ represent the cumulative distribution function of $W$ for $w > 0$.

$$F_W(w) = P(Z^2 \le w) = P(-\sqrt{w} \le Z \le \sqrt{w}) = \int_{-\sqrt{w}}^{\sqrt{w}} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = \frac{2}{\sqrt{2\pi}} \int_0^{\sqrt{w}} e^{-z^2/2}\, dz$$

$$f_W(w) = \frac{d}{dw} F_W(w) = \frac{d}{dw}\left[\frac{2}{\sqrt{2\pi}} \int_0^{\sqrt{w}} e^{-z^2/2}\, dz\right]$$

Let $u = \sqrt{w}$. Then $\dfrac{du}{dw} = \dfrac{1}{2\sqrt{w}}$ and $F(u) = \dfrac{2}{\sqrt{2\pi}} \displaystyle\int_0^{u} e^{-z^2/2}\, dz$, so

$$\frac{dF}{du} = \frac{2}{\sqrt{2\pi}}\, \frac{d}{du} \int_0^{u} e^{-z^2/2}\, dz = \frac{2}{\sqrt{2\pi}}\, e^{-u^2/2}.$$

By the chain rule,

$$f_W(w) = \frac{dF}{dw} = \frac{dF}{du} \cdot \frac{du}{dw} = \frac{2}{\sqrt{2\pi}}\, e^{-u^2/2} \cdot \frac{1}{2\sqrt{w}} = \frac{1}{\sqrt{2\pi}}\, e^{-(\sqrt{w})^2/2}\, w^{-1/2}$$

$$= \frac{1}{\sqrt{2\pi}}\, w^{-1/2} e^{-w/2} \qquad \left(\text{note that } \sqrt{\pi} = \Gamma\!\left(\tfrac{1}{2}\right)\right)$$

$$= \frac{1}{2^{1/2}\,\Gamma\!\left(\tfrac{1}{2}\right)}\, w^{\frac{1}{2} - 1} e^{-w/2}, \qquad \text{for } w > 0.$$

Note that $f_W(w)$ is the density function for a gamma random variable with $\alpha = \tfrac{1}{2}$ and $\beta = 2$. Therefore $W$ is a chi-square random variable with one degree of freedom.
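A simulation-based check of this conclusion (an aside assuming Python with numpy and scipy, not part of the original notes): squared standard normal draws should behave like a chi-square variable with one degree of freedom.

```python
# Sketch: compare the empirical distribution of Z^2, Z ~ N(0,1), with chi-square(1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
w = rng.standard_normal(200_000) ** 2

print(w.mean(), w.var())                        # approximately 1 and 2
print(stats.kstest(w, stats.chi2(df=1).cdf))    # p-value should not be small
```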

Method of Moment Generating Functions:

$$m_W(t) = E(e^{tW}) = E\!\left(e^{tZ^2}\right) = \int_{-\infty}^{\infty} e^{tz^2}\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$

$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}(1 - 2t)}\, dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2(1 - 2t)^{-1}}}\, dz$$

$$= \frac{1}{(1 - 2t)^{1/2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,(1 - 2t)^{-1/2}}\, e^{-\frac{z^2}{2(1 - 2t)^{-1}}}\, dz$$

$$= \frac{1}{(1 - 2t)^{1/2}} \int_{-\infty}^{\infty} \left(\text{normal pdf with } \mu = 0 \text{ and } \sigma^2 = (1 - 2t)^{-1}\right) dz$$

$$= \frac{1}{(1 - 2t)^{1/2}}, \qquad \text{for } t < \tfrac{1}{2}.$$

By the uniqueness theorem, this is the moment generating function of a chi-square random variable with one degree of freedom.

Now we know that if $Z \sim N(0,1)$, then $Z^2 \sim \chi^2_1$. We have previously shown that the sum of $n$ independent chi-square random variables with one degree of freedom is a chi-square random variable with $n$ degrees of freedom. These results lead to the following: Given $Y_1, Y_2, \ldots, Y_n$, independent random variables with $Y_i \sim N(\mu_i, \sigma_i^2)$ and $Z_i = \dfrac{Y_i - \mu_i}{\sigma_i} \sim N(0,1)$, then $\sum Z_i^2 \sim \chi^2_n$.
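A quick Monte Carlo illustration of this last result (an aside assuming Python with numpy; the particular means and standard deviations below are arbitrary):

```python
# Sketch: the sum of n squared standardized normals should have mean n and
# variance 2n, matching a chi-square with n degrees of freedom.
import numpy as np

rng = np.random.default_rng(2)
n = 7
mu = np.array([1.0, -2.0, 0.5, 3.0, 0.0, -1.0, 2.5])     # arbitrary means
sigma = np.array([1.0, 2.0, 0.5, 1.5, 3.0, 0.7, 1.2])    # arbitrary standard deviations

y = rng.normal(mu, sigma, size=(100_000, n))
z = (y - mu) / sigma              # standardize each column
w = (z ** 2).sum(axis=1)          # sum of n squared standard normals

print(w.mean(), w.var())          # approximately n = 7 and 2n = 14
```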

The beauty of moment generating functions is that they give many results with relative ease. Proofs using moment generating functions are often much easier than showing the same results using density functions (or in some other way).

Multivariate Moment Generating Functions

Following are several results concerning multivariate moment generating functions:

$$m_{U,V}(s, t) = E(e^{sU + tV})$$

$$m_{U,V}(0, t) = E(e^{tV}) = m_V(t)$$

$$m_{U,V}(s, 0) = E(e^{sU}) = m_U(s)$$

The following theorem will be important for proving results in sections that follow:

Theorem: $U$ and $V$ are independent if and only if $m_{U,V}(s, t) = m_U(s) \cdot m_V(t)$.
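The factorization in this theorem can be illustrated numerically for a simple independent pair; this aside assumes Python with numpy and uses independent standard normals only as an example.

```python
# Sketch: for independent U and V, the joint mgf factors,
# E(e^{sU+tV}) = E(e^{sU}) E(e^{tV}); here U and V are independent N(0,1).
import numpy as np

rng = np.random.default_rng(3)
u = rng.standard_normal(500_000)
v = rng.standard_normal(500_000)
s, t = 0.4, 0.7

joint = np.exp(s * u + t * v).mean()
product = np.exp(s * u).mean() * np.exp(t * v).mean()
print(joint, product)      # both approximately e^{(s^2 + t^2)/2} ≈ 1.38
```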
