Probability Distributions and Statistics

Chapter 8

• Distributions of Random Variables
• Expected Value
• Variance and Standard Deviation
• Binomial Distribution
• Normal Distribution
• Applications of the Normal Distribution

Copyright © 2006 Brooks/Cole, a division of Thomson Learning, Inc.

Random Variable

A random variable is a rule that assigns a number to each outcome of a chance experiment.
• Finite discrete – the variable can assume only finitely many values.
• Infinite discrete – the variable can assume infinitely many values that may be arranged in a sequence.
• Continuous – the variable can assume values that make up an interval of real numbers.

Probability Distribution for the Random Variable X

A probability distribution for a random variable X:

x           –8     –3     –1      0      1      4      6
P(X = x)   0.13   0.15   0.17   0.20   0.15   0.11   0.09

Find

a. P(X ≤ 0) = 0.13 + 0.15 + 0.17 + 0.20 = 0.65

b. P(−3 ≤ X ≤ 1) = 0.15 + 0.17 + 0.20 + 0.15 = 0.67
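The two answers can be checked by adding the probabilities of the values that fall in each range. The following is a minimal Python sketch; the names `dist` and `prob_between` are illustrative and do not come from the slides.

```python
# Probability distribution from the table: value of X -> P(X = x).
dist = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}

def prob_between(dist, lo=float("-inf"), hi=float("inf")):
    """Return P(lo <= X <= hi) by adding the probabilities of qualifying values."""
    return sum(p for x, p in dist.items() if lo <= x <= hi)

p_a = round(prob_between(dist, hi=0), 2)         # P(X <= 0)
p_b = round(prob_between(dist, lo=-3, hi=1), 2)  # P(-3 <= X <= 1)
print(p_a, p_b)  # 0.65 0.67
```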

Ex. Students from a small college were asked how many charge cards they carry. X is the random variable representing the number of cards, and the results are below. Each probability is the number of people divided by the 150 students surveyed, rounded to two decimal places.

Probability Distribution

x          0      1      2      3      4      5      6
# people   12     42     57     24     9      4      2
P(X = x)   0.08   0.28   0.38   0.16   0.06   0.03   0.01
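The probability row is a table of relative frequencies. A short Python sketch of that step follows, assuming the counts in the table; the variable names are illustrative.

```python
# Survey counts: number of cards -> number of students.
counts = {0: 12, 1: 42, 2: 57, 3: 24, 4: 9, 5: 4, 6: 2}

total = sum(counts.values())  # total number of students surveyed

# Relative frequency of each value, rounded to two decimals as on the slide.
probs = {x: round(c / total, 2) for x, c in counts.items()}
print(total, probs)
```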

Histograms

A histogram is a way to represent the probability distribution of a random variable graphically.

[Figure: histogram of the credit card results. Horizontal axis: Number of Cards (0 to 6). Vertical axis: P(X = x), from 0 to 0.4.]

Mean

The average (mean) of the n numbers $x_1, x_2, \ldots, x_n$ is $\bar{x}$, where

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]

Median

The median is the middle value in a set of data that is arranged in increasing or decreasing order. For an even number of data points, the median is the average of the middle two.

Mode

The mode is the number that occurs most frequently in a set of data.

Ex. The quiz scores for a particular student are given below:

22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18

Find the mean, median, and mode.

Mean: (sum of entries) / (number of data points) = 273 / 13 = 21

Median: arrange the scores in increasing order:

12, 18, 18, 20, 20, 20, 20, 22, 24, 24, 25, 25, 25

The middle number (the 7th of 13) is 20.

Mode (most frequent): 20 (occurs 4 times)
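The same three summary values can be reproduced with Python's standard `statistics` module. This is only a sketch for checking the arithmetic; the variable names are illustrative.

```python
import statistics

scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]

mean = statistics.mean(scores)      # sum of entries / number of entries
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)
```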

Expected Value of a Random Variable X

Let X denote a random variable that assumes the values $x_1, x_2, \ldots, x_n$ with associated probabilities $p_1, p_2, \ldots, p_n$, respectively. Then the expected value of X, E(X), is given by

\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n \]

Ex. Use the data below to find the expected number of credit cards that a student will possess. Here x is the number of credit cards.

x          0      1      2      3      4      5      6
P(X = x)   0.08   0.28   0.38   0.16   0.06   0.03   0.01

\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = 0(0.08) + 1(0.28) + 2(0.38) + 3(0.16) + 4(0.06) + 5(0.03) + 6(0.01) = 1.97 \]

So a student is expected to carry about 2 credit cards.
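The weighted sum for E(X) translates directly into code. A minimal Python sketch, with the hypothetical helper name `expected_value`:

```python
def expected_value(dist):
    """Return E(X) = sum of x * P(X = x) for a finite discrete distribution."""
    return sum(x * p for x, p in dist.items())

# Credit card distribution: number of cards -> probability.
cards = {0: 0.08, 1: 0.28, 2: 0.38, 3: 0.16, 4: 0.06, 5: 0.03, 6: 0.01}

e_cards = expected_value(cards)
print(round(e_cards, 2))  # 1.97
```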

Ex. Jackson and Max are playing a dice game in which a single die is rolled. Jackson pays Max $2 if the roll is a 1, 2, 3, or 4, and Max pays Jackson $D if the roll is a 5 or 6. Determine the value of D if the game is to be fair.

P(Jackson loses) = 4/6 = 2/3, so P(Jackson wins) = 1/3.

For the game to be fair, the expected value of Jackson's winnings must be zero:

\[ (-2)\left(\frac{2}{3}\right) + (D)\left(\frac{1}{3}\right) = 0 \]

Multiplying through by 3 gives −4 + D = 0, so D = 4.

D should be $4.
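The fair-game condition can be solved with exact fractions. The sketch below uses illustrative names (`stake`, `p_lose`, `p_win`) that are not from the slides, and solves the equation from Jackson's point of view.

```python
from fractions import Fraction

p_lose = Fraction(4, 6)   # Jackson loses on a 1, 2, 3, or 4
p_win = 1 - p_lose        # Jackson wins on a 5 or 6
stake = 2                 # dollars Jackson pays when he loses

# Fair game: -stake * p_lose + D * p_win = 0, so D = stake * p_lose / p_win.
D = stake * p_lose / p_win
expected = -stake * p_lose + D * p_win
print(D, expected)  # 4 0
```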

Odds

If P(E) is the probability of an event E occurring, then

1. The odds in favor of E occurring are given by the ratio of the probability that E occurs to the probability that E doesn’t occur:

\[ \frac{P(E)}{1 - P(E)} = \frac{P(E)}{P(E^C)} \]

2. The odds against E occurring are given by the ratio of the probability that E doesn’t occur to the probability that E occurs:

\[ \frac{1 - P(E)}{P(E)} = \frac{P(E^C)}{P(E)} \]

Ex. If the news has just announced that the probability of rain is 0.65 (65%), find

a. the odds in favor of rain

\[ \frac{P(E)}{1 - P(E)} = \frac{0.65}{1 - 0.65} = \frac{0.65}{0.35} = \frac{13}{7} \]

that is, 13 to 7.

b. the odds against rain

\[ \frac{1 - P(E)}{P(E)} = \frac{0.35}{0.65} = \frac{7}{13} \]

that is, 7 to 13.

Probability of an Event (Given Odds)

If the odds in favor of an event E occurring are a to b, then the probability of E occurring is

\[ P(E) = \frac{a}{a + b} \]

Ex. The odds that the horse Gluebound will win a particular race are 2 to 16. Find the probability that Gluebound wins the race.

\[ P(\text{win}) = \frac{a}{a + b} = \frac{2}{2 + 16} = \frac{2}{18} = \frac{1}{9} \]
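Both conversions, probability to odds and odds to probability, can be sketched in Python with exact fractions. The function names `odds_in_favor` and `probability_from_odds` are illustrative, not from the slides; the inputs are the rain and Gluebound examples.

```python
from fractions import Fraction

def odds_in_favor(p):
    """Return the odds in favor of an event with probability p as a reduced pair (a, b)."""
    ratio = p / (1 - p)
    return ratio.numerator, ratio.denominator

def probability_from_odds(a, b):
    """Return P(E) when the odds in favor of E are a to b."""
    return Fraction(a, a + b)

p_rain = Fraction(65, 100)
print(odds_in_favor(p_rain))         # (13, 7)
print(odds_in_favor(1 - p_rain))     # (7, 13), the odds against rain
print(probability_from_odds(2, 16))  # 1/9
```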

Variance

Variance is a measure of the spread of the data. The larger the variance, the larger the spread. Suppose a random variable has the probability distribution

x          x_1   x_2   x_3   …   x_n
P(X = x)   p_1   p_2   p_3   …   p_n

and expected value E(X) = µ. The variance of the random variable X is defined by

\[ \mathrm{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]

Standard Deviation

Standard deviation is a measure of the spread of the data using the same units as the data. The standard deviation of a random variable X is defined by

\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2} \]

where each $x_i$ denotes a value assumed by the random variable X and $p_i$ is the probability associated with $x_i$.

Ex. The quiz scores for a particular student are given below:

22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18

Find the variance and standard deviation.

Value         12     18     20     22     24     25
Frequency     1      2      4      1      2      3
Probability   0.08   0.15   0.31   0.08   0.15   0.23

The expected value is µ ≈ 21.

\[ \mathrm{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]

\[ \mathrm{Var}(X) = 0.08(12 - 21)^2 + 0.15(18 - 21)^2 + 0.31(20 - 21)^2 + 0.08(22 - 21)^2 + 0.15(24 - 21)^2 + 0.23(25 - 21)^2 = 13.25 \]

\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{13.25} \approx 3.64 \]
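The calculation can be reproduced in Python. The sketch below follows the slide by using probabilities rounded to two decimals and µ = 21, which gives 13.25. As a side check it also uses the unrounded probabilities (frequency / 13), which give 170/13 ≈ 13.08 and σ ≈ 3.62. All names are illustrative.

```python
import math
from collections import Counter

scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]
n = len(scores)
freq = Counter(scores)  # value -> frequency

# As on the slide: probabilities rounded to two decimals, mu taken as 21.
rounded = {x: round(c / n, 2) for x, c in freq.items()}
var_slide = sum(p * (x - 21) ** 2 for x, p in rounded.items())
sigma_slide = math.sqrt(var_slide)

# Side check with unrounded probabilities.
exact = {x: c / n for x, c in freq.items()}
mu_exact = sum(x * p for x, p in exact.items())
var_exact = sum(p * (x - mu_exact) ** 2 for x, p in exact.items())

print(round(var_slide, 2), round(sigma_slide, 2))            # 13.25 3.64
print(round(var_exact, 2), round(math.sqrt(var_exact), 2))   # 13.08 3.62
```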

Chebychev’s Inequality

Let X be a random variable with expected value µ and standard deviation σ. Then, for any positive number k,

\[ P(\mu - k\sigma \le X \le \mu + k\sigma) \ge 1 - \frac{1}{k^2} \]

Ex. A probability distribution has a mean of 40 and a standard deviation of 12. Use Chebychev’s inequality to estimate the probability that an outcome of the experiment lies between 10 and 70.

We want P(10 ≤ X ≤ 70). Notice that µ − kσ = 10 and µ + kσ = 70, that is, 40 − 12k = 10 and 40 + 12k = 70, so k = 2.5.

\[ P(10 \le X \le 70) \ge 1 - \frac{1}{k^2} = 1 - \frac{1}{(5/2)^2} = \frac{21}{25} \]

So the probability is at least 0.84 (84%).
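The bound can be computed for any interval that is symmetric about the mean. A minimal Python sketch follows, using the interval from 10 to 70 that appears in the worked solution; the function name `chebychev_lower_bound` is hypothetical.

```python
def chebychev_lower_bound(mu, sigma, lo, hi):
    """Lower bound on P(lo <= X <= hi) for an interval symmetric about mu."""
    if abs((mu - lo) - (hi - mu)) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    k = (hi - mu) / sigma  # number of standard deviations on each side
    return 1 - 1 / k ** 2

bound = chebychev_lower_bound(mu=40, sigma=12, lo=10, hi=70)
print(round(bound, 2))  # 0.84
```

The bound is only informative when k > 1; for k ≤ 1 the right-hand side is zero or negative.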

Binomial (Bernoulli) Trials

A binomial experiment has the following properties:
1. The number of trials in the experiment is fixed.
2. The only outcomes are “success” and “failure.”
3. The probability of success in each trial is the same.
4. The trials are independent of each other.

Probabilities in Bernoulli Trials

In a binomial experiment in which the probability of success in any trial is p, the probability of exactly x successes in n independent trials is given by

\[ C(n, x)\, p^x q^{n - x} \]

where q = 1 − p is the probability of failure in any trial.

Ex. A card is drawn from a standard 52-card deck. If drawing a club is considered a success, find the probability of

a. exactly one success in 4 draws (with replacement).

Use $C(n, x)\, p^x q^{n - x}$, where p = 1/4 and q = 1 − 1/4 = 3/4:

\[ C(4, 1) \left(\frac{1}{4}\right)^1 \left(\frac{3}{4}\right)^3 \approx 0.422 \]

b. no successes in 5 draws (with replacement).

\[ C(5, 0) \left(\frac{1}{4}\right)^0 \left(\frac{3}{4}\right)^5 \approx 0.237 \]
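The binomial formula can be sketched in Python with `math.comb` for C(n, x). The helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

p_club = 13 / 52  # probability of drawing a club

print(round(binomial_pmf(4, 1, p_club), 3))  # 0.422
print(round(binomial_pmf(5, 0, p_club), 3))  # 0.237
```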

Mean, Variance, and Standard Deviation of a Random Variable X

If X is a binomial random variable associated with a binomial experiment consisting of n trials with probability of success p and probability of failure q, then the mean, variance, and standard deviation of X are

\[ \mu = E(X) = np \]

\[ \mathrm{Var}(X) = npq \]

\[ \sigma_X = \sqrt{npq} \]

Ex. Five cards are drawn, with replacement, from a standard 52-card deck. If drawing a club is considered a success, find the mean, variance, and standard deviation of X (where X is the number of successes).

Here p = 1/4 and q = 1 − 1/4 = 3/4.

\[ \mu = np = 5\left(\frac{1}{4}\right) = 1.25 \]

\[ \mathrm{Var}(X) = npq = 5\left(\frac{1}{4}\right)\left(\frac{3}{4}\right) = 0.9375 \]

\[ \sigma_X = \sqrt{npq} = \sqrt{0.9375} \approx 0.968 \]
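These three formulas are short enough to check directly. A Python sketch with the hypothetical helper `binomial_summary`:

```python
import math

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial random variable."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, math.sqrt(variance)

mean, variance, sd = binomial_summary(n=5, p=1 / 4)
print(mean, variance, round(sd, 3))  # 1.25 0.9375 0.968
```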

Ex. If the probability of a student successfully passing this course (C or better) is 0.82, find the probability that, given 8 students,

a. all 8 pass.

\[ C(8, 8)(0.82)^8 (0.18)^0 \approx 0.2044 \]

b. none pass.

\[ C(8, 0)(0.82)^0 (0.18)^8 \approx 0.0000011 \]

c. at least 6 pass, that is, 6, 7, or 8 successes.

\[ C(8, 6)(0.82)^6 (0.18)^2 + C(8, 7)(0.82)^7 (0.18)^1 + C(8, 8)(0.82)^8 (0.18)^0 \approx 0.2758 + 0.3590 + 0.2044 = 0.8392 \]
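An "at least" probability is a sum of individual binomial terms. The Python sketch below recomputes all three parts; `binomial_pmf` and `at_least` are illustrative names.

```python
from math import comb

def binomial_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def at_least(n, k, p):
    """Probability of k or more successes in n trials."""
    return sum(binomial_pmf(n, x, p) for x in range(k, n + 1))

n, p = 8, 0.82
all_pass = binomial_pmf(n, 8, p)
none_pass = binomial_pmf(n, 0, p)
six_or_more = at_least(n, 6, p)

print(round(all_pass, 4))     # 0.2044
print(round(none_pass, 7))    # 1.1e-06
print(round(six_or_more, 4))  # 0.8392
```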

Probability Density Function

A probability density function f defines a continuous probability distribution. It is defined on an interval that coincides with the range of values taken on by the random variable associated with an experiment, and it has the following properties:
1. f(x) is nonnegative for all values of x.
2. The area of the region between the graph of f and the x-axis is equal to 1.

[Figure: graph of y = f(x); the region under the curve has Area = 1.]

Probability Density Function

P(a < X < b) is given by the area of the shaded region.

[Figure: graph of y = f(x) with the region under the curve between x = a and x = b shaded.]

Normal Distributions

Normal distributions are a special class of continuous probability density functions. Many phenomena have probability density functions that are normal. The graph of this distribution is called a normal curve. The probability density function associated with the normal curve is

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(1/2)[(x - \mu)/\sigma]^2} \]
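The density formula can be written out in Python and compared with `statistics.NormalDist.pdf` from the standard library. This is a sketch only; the values µ = 80 and σ = 20 are illustrative, and `normal_pdf` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Evaluate the normal density with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 80, 20  # illustrative parameters
library = NormalDist(mu, sigma)

# The hand-written formula and the library density should agree.
for x in (40, 80, 95):
    print(x, normal_pdf(x, mu, sigma), library.pdf(x))
```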

Normal Curve Properties

1. The peak is at x = µ.
2. The curve is symmetric with respect to the line x = µ.
3. The curve lies above the x-axis and approaches it as x extends indefinitely in either direction.
4. The area under the curve is 1.
5. 68.27% of the area lies within 1 standard deviation of the mean, 95.45% within 2, and 99.73% within 3 (see curve on next slide).

Normal Curve

[Figure: normal curve showing the percentage of area within 1, 2, and 3 standard deviations of the mean: 68.27%, 95.45%, and 99.73%, respectively.]
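The three percentages can be reproduced from the standard normal cumulative distribution function. A short Python sketch using `statistics.NormalDist`; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

areas = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(areas)  # {1: 68.27, 2: 95.45, 3: 99.73}
```

Because any normal variable can be rescaled to the standard normal, the same percentages hold for every mean and standard deviation.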

[Figure: normal curves with the same standard deviation but different means.]

[Figure: normal curves with the same mean but different standard deviations.]

Standard Normal Distribution

The standard normal variable is denoted by Z; it has µ = 0 and σ = 1.

Ex. Let Z be the standard normal variable. Find (from the table)

a. P(Z < 0.85). This is the area to the left of 0.85: 0.8023

b. P(Z > 1.32). By symmetry, this area equals P(Z < –1.32): 0.0934

c. P(–2.1 < Z < 1.78). Find the area to the left of 1.78, then subtract the area to the left of –2.1:

P(–2.1 < Z < 1.78) = P(Z < 1.78) – P(Z < –2.1) = 0.9625 – 0.0179 = 0.9446
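In place of a printed table, the three areas can be computed with `statistics.NormalDist` from the Python standard library. A minimal sketch with illustrative variable names:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_a = Z.cdf(0.85)               # area to the left of 0.85
p_b = 1 - Z.cdf(1.32)           # area to the right of 1.32
p_c = Z.cdf(1.78) - Z.cdf(-2.1) # area between -2.1 and 1.78

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))  # 0.8023 0.0934 0.9446
```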

Ex. Let Z be the standard normal variable. Find z if

a. P(Z < z) = 0.9278. Look in the table for an entry equal to 0.9278, then read back to find z = 1.46.

b. P(–z < Z < z) = 0.8132.

P(–z < Z < z) = 2P(0 < Z < z) = 2[P(Z < z) – ½] = 2P(Z < z) – 1 = 0.8132

so P(Z < z) = 0.9066 and z = 1.32.
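Reading the table backwards corresponds to the inverse cumulative distribution function. A Python sketch using `NormalDist.inv_cdf`, with results rounded to two decimals to match table precision; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# a. P(Z < z) = 0.9278
z_a = Z.inv_cdf(0.9278)

# b. P(-z < Z < z) = 0.8132  implies  P(Z < z) = (1 + 0.8132) / 2
left_area = (1 + 0.8132) / 2
z_b = Z.inv_cdf(left_area)

print(round(z_a, 2), round(left_area, 4), round(z_b, 2))  # 1.46 0.9066 1.32
```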

Transforming Other Normal Distributions into a Standard Normal Distribution

Given X, a normal random variable with mean µ and standard deviation σ, we can transform X to Z using

\[ Z = \frac{X - \mu}{\sigma} \]

Ex. Let X be a normal random variable with µ = 80 and σ = 20. Find

a. P(X < 65)
b. P(X > 60)

a. P(X < 65). Convert to standard normal:

\[ P(X < 65) = P\left(Z < \frac{65 - 80}{20}\right) = P(Z < -0.75) = 0.2266 \]

b. P(X > 60). Convert to standard normal:

\[ P(X > 60) = P\left(Z > \frac{60 - 80}{20}\right) = P(Z > -1) = 1 - P(Z < -1) = 1 - 0.1587 = 0.8413 \]
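The standardizing step can be sketched in Python: convert each x to a z-score, then use the standard normal cumulative distribution function. The helper name `z_score` is illustrative.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma = 80, 20
Z = NormalDist()  # standard normal

z_a = z_score(65, mu, sigma)  # -0.75
p_a = Z.cdf(z_a)              # P(X < 65)

z_b = z_score(60, mu, sigma)  # -1.0
p_b = 1 - Z.cdf(z_b)          # P(X > 60)

print(z_a, round(p_a, 4))  # -0.75 0.2266
print(z_b, round(p_b, 4))  # -1.0 0.8413
```

Working directly with `NormalDist(80, 20).cdf(65)` gives the same probability without the explicit conversion.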

Ex. A particular rash has shown up at an elementary school. It has been determined that the length of time that the rash will last is normally distributed with µ = 6 days and σ = 1.5 days.

a. Find the probability that, for a student selected at random, the rash will last for less than 3 days.
b. Find the probability that, for a student selected at random, the rash will last for between 3.75 and 9 days.

a. Find the probability that, for a student selected at random, the rash will last for less than 3 days.

\[ P(X < 3) = P\left(Z < \frac{3 - 6}{1.5}\right) = P(Z < -2) = 0.0228 \]

b. Find the probability that, for a student selected at random, the rash will last for between 3.75 and 9 days. 9−6⎞ ⎛ 3.75 − 6