Properties of the Sample Mean

Consider X1, . . . , Xn independent and identically distributed (iid) with mean µ and variance σ².

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \qquad \text{(sample mean)}

Then

E(\bar{X}) = \frac{1}{n} \sum_{i=1}^{n} \mu = \mu

\mathrm{var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}

Remarks:
◦ The sample mean is an unbiased estimate of the true mean.
◦ The variance of the sample mean decreases as the sample size increases.
◦ Law of Large Numbers: It can be shown that for n → ∞,

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \to \mu.

Question: ◦ How close to µ is the sample mean for finite n? ◦ Can we answer this without knowing the distribution of X?

Central Limit Theorem, Feb 4, 2003

-1-
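The two identities above, E(X̄) = µ and var(X̄) = σ²/n, are easy to check by simulation. A minimal sketch (standard library only); the choice of exponential summands with mean 2 (so σ² = 4), and the sample sizes, are arbitrary illustrative assumptions:

```python
import random
import statistics

random.seed(0)

MU = 2.0       # exponential with mean 2, so sigma^2 = 4
N = 25         # observations per sample mean
REPS = 20_000  # number of simulated sample means

means = [
    statistics.fmean(random.expovariate(1 / MU) for _ in range(N))
    for _ in range(REPS)
]

print(statistics.fmean(means))     # close to mu = 2
print(statistics.variance(means))  # close to sigma^2 / n = 4 / 25 = 0.16
```

The average of the simulated sample means settles near µ (unbiasedness), and their spread matches σ²/n rather than σ².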

Properties of the Sample Mean

Chebyshev's inequality: Let X be a random variable with mean µ and variance σ². Then for any ε > 0,

P(|X - \mu| > \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}.

Proof (discrete case): Define the indicator

1\{|x_i - \mu| > \varepsilon\} =
\begin{cases}
1 & \text{if } |x_i - \mu| > \varepsilon \\
0 & \text{otherwise}
\end{cases}

Whenever the indicator equals 1, we have (x_i - \mu)^2/\varepsilon^2 > 1. Therefore

P(|X - \mu| > \varepsilon)
= \sum_{i=1}^{n} 1\{|x_i - \mu| > \varepsilon\}\, p(x_i)
\le \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{\varepsilon^2}\, p(x_i)
= \frac{\sigma^2}{\varepsilon^2}.

Application to the sample mean: taking ε = 3σ/√n gives

P\left(\mu - \frac{3\sigma}{\sqrt{n}} \le \bar{X} \le \mu + \frac{3\sigma}{\sqrt{n}}\right) \ge 1 - \frac{1}{9} \approx 0.889.

However, Chebyshev's bound is known to be not very precise.

Example: If X_i \overset{iid}{\sim} N(0, 1), then

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \sim N\left(0, \tfrac{1}{n}\right),

and therefore

P\left(-\frac{3}{\sqrt{n}} \le \bar{X} \le \frac{3}{\sqrt{n}}\right) = 0.997.
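The slide's comparison (Chebyshev guarantees at least 0.889, while the exact normal probability is 0.997) can be reproduced numerically. A short sketch; `erf` is used only because the standard library has no normal CDF:

```python
import math

def chebyshev_lower_bound(k):
    """Chebyshev: P(|X - mu| <= k * sigma) >= 1 - 1/k^2, for ANY distribution."""
    return 1 - 1 / k**2

def normal_central_prob(k):
    """Exact P(|Z| <= k) for Z ~ N(0, 1), via the error function."""
    return math.erf(k / math.sqrt(2))

print(chebyshev_lower_bound(3))  # 0.888...
print(normal_central_prob(3))    # 0.997...
```

The gap between the two numbers is the price of Chebyshev's generality: it holds for every distribution with finite variance, so it cannot be tight for any particular one.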

Central Limit Theorem

Let X1, X2, . . . be a sequence of random variables
◦ independent and identically distributed,
◦ with mean µ and variance σ².

For n ∈ ℕ define

Z_n = \sqrt{n}\, \frac{\bar{X} - \mu}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma}.

Z_n has mean 0 and variance 1.

Central Limit Theorem: For large n, the distribution of Z_n can be approximated by the standard normal distribution N(0, 1). More precisely,

\lim_{n \to \infty} P\left(a \le \sqrt{n}\, \frac{\bar{X} - \mu}{\sigma} \le b\right) = \Phi(b) - \Phi(a),

where Φ(z) is the standard normal cumulative distribution function,

\Phi(z) = \int_{-\infty}^{z} f(x)\, dx,

that is, the area under the standard normal density curve to the left of z.

Example:
◦ U1, . . . , U12 uniformly distributed on [0, 12), so µ = 6 and σ² = 12.
◦ What is the probability that the sample mean exceeds 9?

P(\bar{U} > 9) = P\left(\sqrt{12}\, \frac{\bar{U} - 6}{\sqrt{12}} > 3\right) \approx 1 - \Phi(3) = 0.0013.

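The 0.0013 figure for the uniform example can be sanity-checked by Monte Carlo. A sketch under the setup above (12 draws from Uniform[0, 12)); the seed and trial count are arbitrary choices:

```python
import math
import random

random.seed(1)

def estimate_p(trials=100_000):
    """Monte Carlo estimate of P(mean of 12 Uniform[0,12) draws exceeds 9)."""
    hits = 0
    for _ in range(trials):
        xbar = sum(random.uniform(0.0, 12.0) for _ in range(12)) / 12
        if xbar > 9:
            hits += 1
    return hits / trials

est = estimate_p()
clt = 1 - 0.5 * (1 + math.erf(3 / math.sqrt(2)))  # 1 - Phi(3)

print(est, clt)  # both near 0.0013
```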

Central Limit Theorem

[Figure: densities f(x) of the standardized sample mean for U[0,1] summands (left column) and Exp(1) summands (right column), plotted on x ∈ [−3, 3] for n = 1, 2, 6, 12, 100. The symmetric uniform case looks normal already for small n, while the skewed exponential case approaches the normal shape more slowly.]

Central Limit Theorem

Example: Shipping packages

Suppose a company ships packages that vary in weight:
◦ Packages have mean 15 lb and standard deviation 10 lb.
◦ They come from a large number of customers, i.e. packages are independent.

Question: What is the probability that 100 packages will have a total weight exceeding 1700 lb?

Let X_i be the weight of the i-th package and T = \sum_{i=1}^{100} X_i. Then T has mean 100 · 15 lb = 1500 lb and standard deviation √100 · 10 lb = 100 lb, so

P(T > 1700\ \mathrm{lb})
= P\left(\frac{T - 1500\ \mathrm{lb}}{\sqrt{100} \cdot 10\ \mathrm{lb}} > \frac{1700\ \mathrm{lb} - 1500\ \mathrm{lb}}{\sqrt{100} \cdot 10\ \mathrm{lb}}\right)
= P\left(\frac{T - 1500\ \mathrm{lb}}{\sqrt{100} \cdot 10\ \mathrm{lb}} > 2\right)
\approx 1 - \Phi(2) = 0.023.

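The shipping calculation is a three-line computation. A sketch with the standard library:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean_total = 100 * 15.0             # E(T) = 1500 lb
sd_total = math.sqrt(100) * 10.0    # sd(T) = 100 lb

z = (1700 - mean_total) / sd_total  # standardized threshold: 2.0
p_exceed = 1 - phi(z)

print(z, p_exceed)  # 2.0 and about 0.023
```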

Central Limit Theorem

Remarks
• How fast the approximation becomes good depends on the distribution of the X_i's:
  ◦ If it is symmetric and has tails that die off rapidly, n can be relatively small.
    Example: If X_i \overset{iid}{\sim} U[0, 1], the approximation is good for n = 12.
  ◦ If it is very skewed or if its tails die down very slowly, a larger value of n is needed.
    Example: Exponential distribution.
• Central limit theorems are very important in statistics.
• There are many central limit theorems covering many situations, e.g.
  ◦ for not identically distributed random variables, or
  ◦ for dependent, but not "too" dependent random variables.
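The remark about skewed summands can be made quantitative: for iid summands the third central moment of the sum is n times that of a single term, while the standard deviation grows like √n, so the skewness of Z_n decays like skew(X)/√n. A small sketch (the skewness values 0 for the uniform and 2 for Exp(1) are standard facts):

```python
import math

def zn_skewness(skew_x, n):
    """Skewness of Z_n = sqrt(n) * (Xbar - mu) / sigma for n iid summands."""
    return skew_x / math.sqrt(n)

# Uniform[0,1] is symmetric (skewness 0); Exp(1) has skewness 2.
print(zn_skewness(0.0, 12))   # 0.0: normal approximation good early
print(zn_skewness(2.0, 12))   # about 0.58: still visibly skewed
print(zn_skewness(2.0, 100))  # 0.2: much closer to normal
```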


The Normal Approximation to the Binomial

Let X be binomially distributed with parameters n and p. Recall that X is the sum of n iid Bernoulli random variables,

X = \sum_{i=1}^{n} X_i, \qquad X_i \overset{iid}{\sim} \mathrm{Bin}(1, p).

Therefore we can apply the Central Limit Theorem:

Normal Approximation to the Binomial Distribution: For n large enough, X is approximately N(np, np(1 − p)) distributed:

P(a \le X \le b) \approx P\left(a - \tfrac{1}{2} \le Z \le b + \tfrac{1}{2}\right), \qquad Z \sim N\bigl(np,\, np(1-p)\bigr).

Rule of thumb for n: np > 5 and n(1 − p) > 5.

In terms of the standard normal distribution we get

P(a \le X \le b) \approx P\left(\frac{a - \frac{1}{2} - np}{\sqrt{np(1-p)}} \le Z' \le \frac{b + \frac{1}{2} - np}{\sqrt{np(1-p)}}\right)
= \Phi\left(\frac{b + \frac{1}{2} - np}{\sqrt{np(1-p)}}\right) - \Phi\left(\frac{a - \frac{1}{2} - np}{\sqrt{np(1-p)}}\right),

where Z' ∼ N(0, 1).

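The continuity-corrected formula is easy to wrap in a function and compare against the exact binomial probability. A sketch; the test case n = 20, p = 0.5, a = 8, b = 12 is an arbitrary choice satisfying the rule of thumb (np = n(1 − p) = 10 > 5):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_normal_approx(a, b, n, p):
    """P(a <= X <= b), X ~ Bin(n, p): normal approx with continuity correction."""
    mu, sd = n * p, math.sqrt(n * p * (1 - p))
    return phi((b + 0.5 - mu) / sd) - phi((a - 0.5 - mu) / sd)

def binom_exact(a, b, n, p):
    """Exact P(a <= X <= b) by summing the binomial pmf."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

print(binom_exact(8, 12, 20, 0.5))          # about 0.737
print(binom_normal_approx(8, 12, 20, 0.5))  # very close to the exact value
```

With the rule of thumb satisfied, the approximation and the exact probability typically agree to two or three decimal places.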

The Normal Approximation to the Binomial

[Figure: probability mass functions p(x), x = 0, …, 20, for Bin(n, 0.5) with n = 1, 2, 5, 10, 20 (left column) and Bin(n, 0.1) with n = 1, 5, 10, 20, 50 (right column). The symmetric p = 0.5 pmfs become bell-shaped quickly, while the skewed p = 0.1 pmfs require larger n.]

The Normal Approximation to the Binomial

Example: The random walk of a drunkard

Suppose a drunkard executes a "random" walk in the following way:
◦ Each minute he takes a step north or south, with probability 1/2 each.
◦ His successive step directions are independent.
◦ His step length is 50 cm.

How likely is he to have advanced 10 m north after one hour?
◦ Position after one hour: X · 1 m − 30 m, where X is the number of northward steps among the 60 taken (each step changes his position by ±0.5 m).
◦ X binomially distributed with parameters n = 60 and p = 1/2.
◦ X is approximately normal with mean 30 and variance 15:

P(X \cdot 1\,\mathrm{m} - 30\,\mathrm{m} \ge 10\,\mathrm{m}) = P(X \ge 40)
\approx P(Z > 39.5), \qquad Z \sim N(30, 15)
= P\left(\frac{Z - 30}{\sqrt{15}} > \frac{9.5}{\sqrt{15}}\right)
= 1 - \Phi(2.452) = 0.007.

How does the probability change if he has some idea of where he wants to go and steps north with probability p = 2/3 and south with probability 1/3?

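The drunkard's tail probability can be checked against the exact binomial tail, and the same helper addresses the closing question about p = 2/3. A sketch (the roughly-0.55 figure for p = 2/3 is my own computation, not from the slides):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def tail_approx(n, p, threshold):
    """P(X >= threshold), X ~ Bin(n, p): normal approx with continuity correction."""
    mu, var = n * p, n * p * (1 - p)
    return 1 - phi((threshold - 0.5 - mu) / math.sqrt(var))

# Fair steps: X ~ Bin(60, 1/2); "advanced 10 m" means X >= 40.
exact = sum(math.comb(60, k) for k in range(40, 61)) / 2**60
print(exact, tail_approx(60, 0.5, 40))  # both close to 0.007

# With drift north (p = 2/3) the mean number of north steps becomes 40,
# so reaching 10 m is no longer a tail event.
print(tail_approx(60, 2 / 3, 40))       # roughly 0.55
```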
