Normal Approximation to Binomial Distributions

Charlie Vollmer
Department of Statistics
Colorado State University
Fort Collins, CO
[email protected]
September 18, 2015

Abstract

This document is a supplement to class lectures for STAT 307-003, Fall 2015. It details how the Normal Distribution can approximate the Binomial Distribution as the number of trials, n, gets large. How large does n need to be? How well does the Normal Distribution approximate a Binomial Distribution?

Let us find out....

1 Setup: Defining some terms

1.1 Expected Value

If we go to Wikipedia, the following is the very first sentence that we’ll see:

“In probability theory, the expected value of a random variable is intuitively the long-run average value of repetitions of the experiment it represents.”

Great! It is simply the long-run average of what we’d see if we did something over and over and over again! And if we go down a few more sentences on the Wikipedia page, we find something even more useful:

“The expected value is also known as the expectation, mathematical expectation, EV, mean, or first moment.”

Bam! Look at that fourth synonym: the mean! That is exactly the average I would expect to see if I did an experiment over and over and over, lots of times!

Note: if a Random Variable is Binomially Distributed, its mean is np.
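To make this concrete, here is a minimal Python sketch (my own addition; the notes themselves show no code) that simulates many binomial outcomes and compares the empirical average to np:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 0.5  # 100 tosses of a fair coin
samples = rng.binomial(n, p, size=100_000)  # heads count in each of 100,000 runs

print("empirical mean:", samples.mean())  # close to 50
print("theoretical mean, np:", n * p)     # exactly 50
```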


1.2 Standard Error, SE

If we go to Wikipedia, the following are the first two sentences that we’ll see:

“The standard error (SE) is the standard deviation of the sampling distribution of a statistic, most commonly of the mean. The term may also be used to refer to an estimate of that standard deviation, derived from a particular sample used to compute the estimate.”

Ok, this is wordy, but it’s actually very accurate and descriptive. It’s just saying that the standard error, SE, is the standard deviation of our statistic. So... if our statistic is:

$$S_n = \sum_{i=1}^{n} X_i$$

where $X_i$ is simply a 1 or a 0, as in the case of a coin toss (heads or tails), then the SE of this statistic is its standard deviation. Now, we know -from class- that this statistic, $S_n$, is a Binomially Distributed Random Variable (it follows a Binomial Distribution). In the case of a binomial, we (humans... and now you, too!) know that the variance of a Binomially Distributed Random Variable is simply npq. Do you remember how to find the standard deviation from the variance? Take the square root! And if you need the SE, it’s just the standard deviation. So, now that we know the variance of a binomial, we have the standard deviation or -in other words- the standard error, SE:

$$\mathrm{SE}(S_n) = \sqrt{npq}$$
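As another quick sanity check (again a sketch of mine, not from the notes), we can compare the empirical standard deviation of simulated binomial counts to $\sqrt{npq}$:

```python
import numpy as np

rng = np.random.default_rng(1)

n, p = 100, 0.5
q = 1 - p
samples = rng.binomial(n, p, size=100_000)

print("empirical standard deviation:", samples.std())   # close to 5
print("theoretical SE, sqrt(npq):", np.sqrt(n * p * q))  # exactly 5
```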


2 The Approximation: Toss a coin 100 times

The object of this section is to illustrate that, if we plot a histogram of the number of heads from repeated coin-toss experiments, it will be well approximated by a Normal Distribution as the number of tosses, n, gets “large.” For instance, say that we toss a fair coin 100 times and count how many times we get heads. We could do this and get 88 heads. We could also do it and get 45 heads. Let’s say we do the entire experiment (toss the coin 100 times) 50 times. Thus, we’ll get 50 numbers, one count of heads per experiment. Let’s see what that plot looks like:

[Figure: histogram of the Number of Heads across 50 experiments of 100 tosses; x-axis 40-65 heads, y-axis count 0-6]

It looks like in one experiment we got 40 heads, and one time we got 41 heads. In another experiment we got 61 heads. In 6 experiments we got 48 heads, and in 6 more we got 59 heads. You get the picture. So, in this situation, we only did this experiment (toss a coin 100 times) 50 times, and the plot above shows our results from those 50 experiments.
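For readers who want to recreate this experiment, here is a minimal Python sketch (my own; the original figures were produced by tooling the notes don’t show, so the names and plotting choices here are assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

n_tosses = 100      # tosses per experiment
n_experiments = 50  # how many times we repeat the whole experiment

# Number of heads in each experiment of 100 fair tosses.
heads = rng.binomial(n_tosses, 0.5, size=n_experiments)

plt.hist(heads, bins=range(heads.min(), heads.max() + 2))
plt.xlabel("Number of Heads")
plt.ylabel("count")
plt.show()
```

Increasing n_experiments or n_tosses in this sketch reproduces the later figures in these notes.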

What happens if we do this experiment 500 times? Or a thousand times?? Or five thousand?? Let us see...


Below, we see what happens when we do this experiment 500, 1000, and 5000 times (with the original 50-experiment plot shown for comparison):

[Figure: density histograms of the Number of Heads in 100 tosses, for 50 (fifty), 500 (five_hundred), 1000 (thousand), and 5000 (five_thousand) repetitions of the experiment]

Whoa! We see that our histograms start to look like a bell curve! Clearly, this is no coincidence! This is because a Binomial Random Variable begins to look like a Normally Distributed Random Variable as the number of trials, n, grows large! Careful!! Take notice that we did NOT increase n yet, only the number of times that we did the experiment! So, if we now increase n, we would expect this bell-shaped curve to actually look more and more like a Normal Distribution. As of now, you can notice that it doesn’t quite look like a Normal Distribution, but rather just a similar-looking curve.


3 The Approximation: Toss a coin 1000 times

Now, we do the same thing as above, but each experiment is tossing the coin 1000 times. What do you think this does to the Expected Value? Ponder this question: Is it easier to get all heads if I only toss the coin 10 times? Would it be harder to get all heads if I tossed the coin 1000 times? These questions make us think about the expected value and the standard error. As we do more and more trials, do we expect the mean of our sample to get closer to the “true mean” more often? So, let’s do the experiment where we toss the coin 1,000 times. And let’s do this experiment 50 times, as we did before. These are our results:

[Figure: histogram of the Number of Heads across 50 experiments of 1000 tosses; x-axis 460-520 heads, y-axis count 0-6]

And we see that it’s centered around 500 heads, as per our intuition of the outcome, and goes from around 450 heads in some experiments to about 550 in others. Does it look -at a quick glance- about the same as our first plot??
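The same sketch from Section 2 covers this case by changing the number of tosses per experiment (again my own illustration, not code from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)

# Same experiment as before, but 1000 tosses per experiment, repeated 50 times.
heads = rng.binomial(1000, 0.5, size=50)
print(heads.min(), heads.max(), round(heads.mean()))  # spread near 500, as in the plot
```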


4 Examine the difference between n = 100 and n = 1000:

As per the first section of these notes, we know the variance of a Binomially Distributed Random Variable: npq. So, if we look at our two different situations, we see that our variances are:

$$\mathrm{Var}(S_n) = npq = 100 \times \frac{1}{2} \times \frac{1}{2} = 25$$

in our first context of n = 100, and we have:

$$\mathrm{Var}(S_n) = npq = 1000 \times \frac{1}{2} \times \frac{1}{2} = 250$$

in our second context of n = 1000.

Careful! What we care about is our standard error, SE! Our standard errors are actually:

$$\sqrt{\mathrm{Var}(S_n)} = \sqrt{npq} = \sqrt{100 \times \frac{1}{2} \times \frac{1}{2}} = \sqrt{25} = 5$$

in our first context of n = 100, and:

$$\sqrt{\mathrm{Var}(S_n)} = \sqrt{npq} = \sqrt{1000 \times \frac{1}{2} \times \frac{1}{2}} = \sqrt{250} \approx 15.8$$

in our second context of n = 1000.

Take a second to examine this further... this is actually striking! We know that, in the context of a Normal Distribution, most (95%) of our data will lie within 2 standard deviations (standard errors) of the mean. Here, that means in our first context most experiments will land between 40 and 60 heads, while in the second it will be between about 470 and 530. However!... An interval of length 20 is 20% of the possible values in the first context (since we could get anywhere between 0 and 100 heads in 100 coin tosses), while an interval of length 60 is only 6% of the possible values in the second context (since we could get anywhere between 0 and 1000 heads if we flip a coin 1000 times). That means our distribution is MUCH tighter about the mean, relative to the range of possible values, when we made 1000 tosses (as n got larger) than when we only made 100 tosses.
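A few lines of Python (my own check, just re-applying the formulas above) reproduce this comparison:

```python
import numpy as np

p = 0.5
for n in (100, 1000):
    se = np.sqrt(n * p * (1 - p))            # SE = sqrt(npq)
    lo, hi = n * p - 2 * se, n * p + 2 * se  # ~95% interval: mean +/- 2 SE
    width_pct = 100 * (hi - lo) / n          # interval width relative to 0..n
    print(f"n={n}: SE={se:.1f}, ~95% interval ({lo:.0f}, {hi:.0f}), "
          f"width = {width_pct:.0f}% of possible values")
```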


5 Visualize 1000 tosses:

Let’s see what it looks like when we do the 1000-toss experiment many times. Below is for 50, 500, 1000, and 5000 experiments of 1000 tosses:

[Figure: density histograms of the Number of Heads in 1000 tosses, for 50 (fifty), 500 (five_hundred), 1000 (thousand), and 5000 (five_thousand) experiments]

The important thing to look at is the five-thousand-experiment plot in the lower-right corner. If we compare this to the corresponding plot in the previous 100-toss experiment, it should look more similar to a Normal Distribution. Let’s see as n gets even larger...


6 As n gets larger and larger

We see what happens when n = 10,000 below:

[Figure: histogram of the Number of Heads (count on the y-axis)]

And again for n = 100,000:

[Figure: histogram of the Number of Heads (count on the y-axis)]

And this looks pretty Normal to me!

Note: In fact... we can check that this is extremely close to a Normal Curve.
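One way to make “extremely close” precise (a sketch I’m adding, assuming scipy is available) is to compute the largest gap between the Binomial CDF and the CDF of a Normal with the same mean and variance:

```python
import numpy as np
from scipy import stats

n, p = 10_000, 0.5
mean, sd = n * p, np.sqrt(n * p * (1 - p))

k = np.arange(n + 1)
binom_cdf = stats.binom.cdf(k, n, p)
# Continuity correction: compare P(S_n <= k) to the Normal CDF at k + 0.5.
normal_cdf = stats.norm.cdf(k + 0.5, loc=mean, scale=sd)

print("largest CDF gap:", np.abs(binom_cdf - normal_cdf).max())  # tiny for large n
```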


7 Is n = 1000 a good Approximation?

If we perform the n = 1, 000 experiment many, many times, we can actually get a good idea of how well it is approximated by a Normal Distribution. We plot the 1000-toss experiment done 100,000 times below:

[Figure: density histogram of heads from 100,000 experiments of 1000 tosses; x-axis 450-550 heads]

What does this show us? Well... if we have a random variable that follows a Binomial Distribution where n is at least 1,000, we find that it is almost a Normal Distribution! This is a very important discovery of ours! Careful! Recall that a Normal Distribution is defined by two things: its mean and its variance. If that’s all we need, the mean and variance... well, we’re golden! We have both of those things: the mean is np and the variance is npq.
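As a final sketch (mine, not from the notes): overlaying the Normal density with mean np and variance npq on top of simulated 1000-toss counts shows how close the match is:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(7)

n, p = 1000, 0.5
heads = rng.binomial(n, p, size=100_000)  # 100,000 experiments of 1000 tosses

plt.hist(heads, bins=60, density=True, label="simulated heads")

x = np.linspace(heads.min(), heads.max(), 400)
plt.plot(x, stats.norm.pdf(x, loc=n * p, scale=np.sqrt(n * p * (1 - p))),
         label="Normal(np, npq)")
plt.xlabel("heads")
plt.ylabel("density")
plt.legend()
plt.show()
```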
