Maximum Likelihood & Method of Moments Estimation

Patrick Zheng, 01/30/14


Introduction
Goal: Find a good POINT estimate of a population parameter.
Data: We begin with a random sample of size n taken from the totality of a population. We shall estimate the parameter based on this sample.

Distribution: The initial step is to identify the probability distribution of the sample, which is characterized by the parameter.
The distribution is always easy to identify.
The parameter is unknown.

Notation
Sample: X₁, X₂, …, Xₙ
Distribution: Xᵢ ~ f(x, θ), iid
Parameter: θ

Example
e.g., the distribution is normal (f = Normal) with unknown parameters μ and σ² (θ = (μ, σ²)).
e.g., the distribution is binomial (f = binomial) with unknown parameter p (θ = p).

It's important to have a good estimate!
The importance of point estimates lies in the fact that many statistical formulas are based on them, such as confidence intervals and formulas for hypothesis testing. A good estimate should:
1. Be unbiased
2. Have small variance
3. Be efficient
4. Be consistent

Unbiasedness
An estimator is unbiased if its mean equals the parameter: it does not systematically overestimate or underestimate the target parameter.
The sample mean (x̄) and the sample proportion (p̂) are unbiased estimators of the population mean and proportion, respectively.
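As a quick illustration of this idea (not from the original slides; the population values below are arbitrary), a short R simulation shows that the average of many sample means sits essentially on the true mean:

```r
# Simulate many samples and check that the sample mean is centered on the true mean.
set.seed(1)                                   # arbitrary seed for reproducibility
true.mean <- 5                                # assumed population mean (illustrative)
xbar <- replicate(10000, mean(rnorm(n = 30, mean = true.mean, sd = 2)))
mean(xbar)                                    # close to 5: no systematic bias
```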


Small variance
We also prefer that the sampling distribution of the estimator have a small spread or variability, i.e., a small standard deviation.


Efficiency
An estimator θ̂ is said to be efficient if its Mean Square Error (MSE) is the minimum among all competitors.
MSE(θ̂) = E(θ̂ − θ)² = Bias²(θ̂) + Var(θ̂), where Bias(θ̂) = E(θ̂) − θ.
Relative Efficiency(θ̂₁, θ̂₂) = MSE(θ̂₁) / MSE(θ̂₂)
If this ratio is greater than 1, θ̂₂ is more efficient than θ̂₁; if it is less than 1, θ̂₁ is more efficient.
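As an illustration of these definitions (not from the original slides), the following R sketch estimates the MSE of two competing estimators of a normal mean, the sample mean and the sample median, and forms their relative efficiency. The distribution, sample size, and number of replications are arbitrary choices:

```r
# Monte Carlo MSEs of the sample mean (theta1) and sample median (theta2) of a normal mean.
set.seed(1)
mu <- 0; n <- 25                              # illustrative true mean and sample size
theta1 <- replicate(10000, mean(rnorm(n, mu, 1)))
theta2 <- replicate(10000, median(rnorm(n, mu, 1)))
mse1 <- mean((theta1 - mu)^2)                 # estimates Bias^2 + Var directly
mse2 <- mean((theta2 - mu)^2)
mse1 / mse2                                   # less than 1 here: the sample mean is more efficient
```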

Pros of Method of ML
If the sample size is large (n > 30), the MLE is unbiased, consistent, normally distributed, and efficient ("regularity conditions").
"Efficient" means it produces a smaller MSE than other methods, including the Method of Moments.
More useful in statistical inference.


Cons of Method of ML
MLE can be highly biased for small samples.
Sometimes the MLE has no closed-form solution.
MLE can be sensitive to starting values, which might not give a global optimum; this is common when θ is of high dimension.


How to maximize the likelihood
1. Take the derivative and solve analytically (as shown above).
2. Apply numerical maximization techniques such as Newton's method, quasi-Newton methods (Broyden 1970), and direct search methods (Nelder and Mead 1965). These can be implemented with the R functions optimize() and optim().

Newton's Method
A method for finding successively better approximations to the roots (or zeroes) of a real-valued function.
Pick an x₀ close to the root of a continuous function f(x).
Take the derivative of f(x) to get f′(x).
Plug into xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ), where f′(xₙ) ≠ 0.
Repeat until it converges, i.e., until xₙ₊₁ ≈ xₙ.
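A minimal R sketch of this update rule; the helper name newton and its arguments (including the tolerance and iteration cap) are illustrative, not from the slides:

```r
# Newton's method: repeat x <- x - f(x)/fprime(x) until successive values agree.
newton <- function(f, fprime, x0, tol = 1e-5, max.iter = 100) {
  x <- x0
  for (i in seq_len(max.iter)) {
    x.new <- x - f(x) / fprime(x)    # Newton update; requires fprime(x) != 0
    if (abs(x.new - x) < tol) return(x.new)
    x <- x.new
  }
  x                                   # last iterate if convergence was not reached
}
```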

Example
Solve e^x − 1 = 0.
Denote f(x) = e^x − 1 and let the starting point be x₀ = 0.1; then f′(x) = e^x.
Iterate xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ):
x₁ = x₀ − f(x₀)/f′(x₀) = 0.1 − (e^0.1 − 1)/e^0.1 = 0.0048374
x₂ = x₁ − f(x₁)/f′(x₁) = …
Repeat until |xₙ₊₁ − xₙ| < 0.00001, giving x = 7.106 ∗ 10^(−…), essentially the true root x = 0.
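Using the newton() helper sketched above (this snippet assumes that definition has already been run), the worked example can be reproduced:

```r
# Root of f(x) = exp(x) - 1 starting from x0 = 0.1; the first iterate is 0.0048374.
newton(f = function(x) exp(x) - 1,
       fprime = function(x) exp(x),
       x0 = 0.1)                      # converges to a value essentially equal to 0
```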

Example: find the MLE by Newton's Method
For the Poisson distribution, finding the MLE λ̂ (i.e., maximizing ln L(λ)) is equivalent to finding the root of
d ln L(λ)/dλ = Σᵢ xᵢ/λ − n.
To implement Newton's method here, define
f(λ) = d ln L(λ)/dλ = Σᵢ xᵢ/λ − n,
f′(λ) = −Σᵢ xᵢ/λ²,
and iterate λₖ₊₁ = λₖ − f(λₖ)/f′(λₖ).
Given x₁, x₂, …, xₙ and a starting value λ₀, we can find λ̂.

Example cont'd
Suppose we collected a sample from Poi(λ):
18, 10, 8, 13, 7, 17, 11, 6, 7, 7, 10, 10, 12, 4, 12, 4, 12, 10, 7, 14, 13, 7
Implement Newton's method in R:
λₖ₊₁ = λₖ − f(λₖ)/f′(λₖ)
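A minimal, self-contained R sketch of this iteration for the sample above; the starting value and variable names are illustrative:

```r
# Newton's method for the Poisson MLE: f(lambda) = sum(x)/lambda - n,
# f'(lambda) = -sum(x)/lambda^2.  The iteration converges to the sample mean.
x <- c(18, 10, 8, 13, 7, 17, 11, 6, 7, 7, 10, 10, 12, 4, 12, 4, 12, 10, 7, 14, 13, 7)
n <- length(x)
lambda <- 1                                   # illustrative starting value
repeat {
  f  <- sum(x) / lambda - n                   # score function
  fp <- -sum(x) / lambda^2                    # its derivative
  lambda.new <- lambda - f / fp               # Newton update
  if (abs(lambda.new - lambda) < 1e-5) break
  lambda <- lambda.new
}
lambda.new                                    # equals mean(x) = 9.954545, the MLE
```

For the Poisson model this simply recovers λ̂ = x̄, the closed-form MLE, but the same loop applies when no closed form exists.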

Use the R function optim()
f(λ) = Σᵢ xᵢ/λ − n
Typo! The function passed to optim() should be −ln L(λ).
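A minimal sketch of that correction, reusing the sample above: optim() (or optimize(), since λ is one-dimensional) minimizes the negative log-likelihood, with the constant term Σ ln(xᵢ!) dropped because it does not involve λ. The starting value and bounds are illustrative:

```r
# Minimize -ln L(lambda) = -(sum(x) * log(lambda) - n * lambda) + constant.
x <- c(18, 10, 8, 13, 7, 17, 11, 6, 7, 7, 10, 10, 12, 4, 12, 4, 12, 10, 7, 14, 13, 7)
negloglik <- function(lambda) -(sum(x) * log(lambda) - length(x) * lambda)
optim(par = 1, fn = negloglik, method = "Brent", lower = 1e-6, upper = 100)$par
# Equivalently: optimize(negloglik, interval = c(1e-6, 100))$minimum
```

Both calls return the maximizer of ln L(λ), which again matches the sample mean.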


The End! Thank you!
