Empirical and Discrete Distributions
© 2009 Winton
Author: Teresa Snow

Empirical Distributions
• An empirical distribution is one for which each possible event is assigned a probability derived from experimental observation
  – It is assumed that the events are independent and the sum of the probabilities is 1

• An empirical distribution may represent either a continuous or a discrete distribution
  – If it represents a discrete distribution, then sampling is done "on step"
  – If it represents a continuous distribution, then sampling is done via "interpolation"


Discrete vs. Continuous Sampling
• The way the empirical table is described usually determines if an empirical distribution is to be handled discretely or continuously

  discrete description
  value   probability
  10      .1
  20      .15
  35      .4
  40      .3
  60      .05

  continuous description
  value     probability
  0 – 10    .1
  10 – 20   .15
  20 – 35   .4
  35 – 40   .3
  40 – 60   .05


Empirical Distribution Linear Interpolation

  discrete case          continuous case
  value   prob           value     prob
  10      .1             0 – 10    .1
  20      .15            10 – 20   .15
  35      .4             20 – 35   .4
  40      .3             35 – 40   .3
  60      .05            40 – 60   .05

[Graph: rsample (0 to 60) vs. x (0 to 1); cumulative steps shown in blue, connecting line segments in green; x = .55 is marked on the x-axis]

• To use linear interpolation for continuous sampling, the discrete points at the end of each step need to be connected by line segments
  – These are represented in the graph by the green line segments
  – The steps are represented in blue


Sampling On Step

[Same discrete/continuous table and step/interpolation graph as on the previous slide]

• In the discrete case, sampling on step is accomplished by accumulating probabilities from the original table
  – For x = 0.4, accumulate probabilities until the cumulative probability exceeds 0.4
  – rsample is the event value at the point this happens
    • the cumulative probability 0.1 + 0.15 + 0.4 = 0.65 is the first to exceed 0.4
    • the rsample value is 35
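The on-step accumulation described above can be sketched in Python; the table is the one from these slides, while the function name is illustrative:

```python
# Discrete empirical table from the slides: (value, probability) pairs
TABLE = [(10, 0.10), (20, 0.15), (35, 0.40), (40, 0.30), (60, 0.05)]

def sample_on_step(x, table=TABLE):
    """Return the event value whose cumulative probability first exceeds x."""
    cumulative = 0.0
    for value, prob in table:
        cumulative += prob
        if cumulative > x:
            return value
    return table[-1][0]  # guard against x == 1.0 or rounding error
```

For x = 0.4 this walks the cumulative sums 0.1, 0.25, 0.65 and stops at 0.65, returning 35, matching the worked example.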


Sampling By Linear Interpolation

[Same discrete/continuous table and step/interpolation graph as on the previous slides]

• In the continuous case, the end points of the probability accumulation are needed
  – For x = 0.4, the bracketing cumulative values are x = 0.25 and x = 0.65, which represent the points (.25, 20) and (.65, 35) on the graph
  – From basic college algebra, the slope of the line segment is (35 - 20)/(.65 - .25) = 15/.4 = 37.5
  – slope = 37.5 = (35 - rsample)/(.65 - .4), so rsample = 35 - (37.5)(.25) = 35 - 9.375 = 25.625
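The interpolation step can be sketched as follows; the break points are the cumulative probabilities from the slides' continuous table, and the function name is illustrative:

```python
# Cumulative break points from the slides' continuous table:
# (cumulative probability, value) pairs forming the green line segments
BREAKS = [(0.00, 0), (0.10, 10), (0.25, 20), (0.65, 35), (0.95, 40), (1.00, 60)]

def sample_interpolated(x, breaks=BREAKS):
    """Linearly interpolate between the cumulative break points bracketing x."""
    for (x0, v0), (x1, v1) in zip(breaks, breaks[1:]):
        if x0 <= x <= x1:
            slope = (v1 - v0) / (x1 - x0)
            return v1 - slope * (x1 - x)
    raise ValueError("x must lie in [0, 1]")
```

For x = 0.4 the bracketing points are (.25, 20) and (.65, 35), reproducing the slide's rsample = 25.625.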


Discrete Distributions
• Historical perspective behind the names used with discrete distributions
  – James Bernoulli (1654-1705) was a Swiss mathematician whose book Ars Conjectandi (published posthumously in 1713) was the first significant book on probability
    • It gathered together the ideas on counting, and among other things provided a proof of the binomial theorem
  – Siméon-Denis Poisson (1781-1840) was a professor of mathematics at the Faculté des Sciences whose 1837 text Recherches sur la probabilité des jugements en matière criminelle et en matière civile introduced the discrete distribution now called the Poisson distribution

• Keep in mind that scholars such as these evolved their theories with the objective of providing sophisticated abstract models of real-world phenomena
  – An effort which, among other things, gave birth to the calculus as a major modeling tool


I. Bernoulli Distribution
Bernoulli events, trials, and processes
• Bernoulli event
  – One for which the probability the event occurs is p and the probability the event does not occur is 1-p
  • The event has two possible outcomes (usually viewed as success or failure) occurring with probability p and 1-p, respectively

• Bernoulli trial
  – An instantiation of a Bernoulli event

• Bernoulli process
  – A sequence of Bernoulli trials where the probability p of success remains the same from trial to trial
  – This means that for n trials, the probability of n consecutive successes is p^n


Bernoulli Distribution
• Bernoulli distribution
  – Given by the pair of probabilities of a Bernoulli event
  • Too simple to be interesting in isolation
  – Implicitly used in "yes-no" decision processes where the choice occurs with the same probability from trial to trial
    • e.g., the customer chooses to go down aisle 1 with probability p
  – It can be cast in the same kind of mathematical notation used to describe more complex distributions


Bernoulli Distribution pdf

  p(z) = p^z (1-p)^(1-z)  for z = 0, 1
         0                otherwise

  Note: this seemingly bizarre description is chosen since it mimics the case of the binomial distribution when n = 1

[Graph of p(z): bars of height 1-p at z = 0 and p at z = 1]

• The expected value of the distribution is given by
  E(X) = (1-p)·0 + p·1 = p
• The standard deviation is given by
  σ = sqrt( (1-p)(0-p)^2 + p(1-p)^2 ) = sqrt( p(1-p) )
• Notational overkill, but useful for understanding other distributions


Sampling
• Sampling from a discrete distribution requires a function that corresponds to the distribution function of a continuous distribution, given by
  F(x) = integral from -∞ to x of f(z) dz
• This is given by the mass function F(x) of the distribution, which is the step function obtained from the cumulative (discrete) distribution given by the sequence of partial sums
  F(x) = sum of p(z) for z = 0 to x
• For the Bernoulli distribution, F(x) has the construction
  F(x) = 0    for -∞ < x < 0
         1-p  for 0 ≤ x < 1
         1    for x ≥ 1
  – A non-decreasing function (and so can be sampled on step)


Graph of F(x) and Sampling Function

• Distribution function F(x)
[Graph: step function rising from 0 to 1-p at z = 0 and from 1-p to 1 at z = 1]

• Sampling function
  – for random value x drawn from [0,1),
    rsample = 0  if 0 ≤ x < 1-p
    rsample = 1  if 1-p ≤ x < 1
  – demonstrates that sampling from a discrete distribution, even one as simple as the Bernoulli distribution, can be viewed in the same manner as for continuous distributions
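The Bernoulli sampling function above can be sketched directly in Python; the function names are illustrative:

```python
import random

def bernoulli_from_uniform(x, p):
    """The sampling function from the slide: rsample = 0 if x < 1-p, else 1."""
    return 0 if x < 1.0 - p else 1

def bernoulli_sample(p, rng=random):
    """Draw x uniformly from [0, 1) and invert the Bernoulli mass function."""
    return bernoulli_from_uniform(rng.random(), p)
```

With p = 0.3, draws below 0.7 map to failure (0) and draws at or above 0.7 map to success (1), so roughly 30% of samples are successes.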


II. Binomial Distribution
• The Bernoulli distribution represents the success or failure of a single Bernoulli trial
• The binomial distribution represents the number of successes and failures in n independent Bernoulli trials for some given value of n
  – For example, if a manufactured item is defective with probability p, then the binomial distribution represents the number of successes and failures in a lot of n items
    • Unlike a barrel of apples, where one bad apple can cause others to be bad
  – Sampling from this distribution gives a count of the number of defective items in a sample lot
  – Another example is the number of heads obtained in tossing a coin n times


Binomial Theorem (1)
• The binomial distribution gets its name from the binomial theorem
  (a + b)^n = sum over k = 0..n of C(n,k) a^k b^(n-k), where C(n,k) = n!/(k!(n-k)!)
• It is worth pointing out that if a = b = 1, this becomes
  2^n = sum over k = 0..n of C(n,k)
• If S is a set of size n, the number of k-element subsets of S is given by
  C(n,k) = n!/(k!(n-k)!)


Binomial Theorem (2)
• This formula is the result of a simple counting analysis: there are
  n!/(n-k)! = n·(n-1)·...·(n-k+1)
ordered ways to select k elements from n
  – n ways to choose the 1st item, (n-1) the 2nd, and so on
  – Any given selection is a permutation of its k elements, so the underlying subset is counted k! times
  – Dividing by k! eliminates the duplicates
• Consequence: 2^n counts the total number of subsets of an n-element set
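The counting identities on these two slides can be checked numerically; the function names below are illustrative:

```python
from math import factorial

def n_choose_k(n, k):
    """C(n,k) = n!/(k!(n-k)!): the number of k-element subsets of an n-set."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def subset_count(n):
    """Sum C(n,k) over k = 0..n; by the a = b = 1 case of the theorem this is 2^n."""
    return sum(n_choose_k(n, k) for k in range(n + 1))
```

For example, subset_count(10) evaluates to 2^10 = 1024, confirming that the binomial coefficients for n = 10 sum to the total number of subsets.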


Binomial Distribution pdf (1)
• For n independent Bernoulli trials the pdf of the binomial distribution is given by
  p(z) = C(n,z) p^z (1-p)^(n-z)  for z = 0, 1, ..., n
         0                       otherwise
• By the binomial theorem
  sum over z = 0..n of p(z) = (p + (1-p))^n = 1
verifying that p(z) is a pdf


Binomial Distribution pdf (2)
• When choosing z items from among n items with probability p for an item being defective, the term
  C(n,z) p^z (1-p)^(n-z)
represents the probability that z are defective (and concomitantly that (n-z) are not defective)


Expected Value and Variance
• E(X) = np for a binomial distribution on n items where probability of success is p
• The calculation is accomplished by
  E(X) = sum over z = 0..n of z·p(z)
       = sum over z = 1..n of z · [n!/(z!(n-z)!)] · p^z (1-p)^(n-z)      (the z = 0 term vanishes)
       = np · sum over z = 1..n of C(n-1, z-1) p^(z-1) (1-p)^(n-z)      (np is a term in common, present in every summand)
       = np · (p + 1 - p)^(n-1)      (apply the binomial theorem; note that n-z = (n-1) - (z-1))
       = np
• It can be similarly shown that the standard deviation is sqrt( n·p·(1-p) )
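The closed forms np and sqrt(np(1-p)) can be verified by computing the moments directly from the pdf; the function names are illustrative:

```python
from math import comb, sqrt

def binomial_pdf(z, n, p):
    """p(z) = C(n,z) p^z (1-p)^(n-z) for z = 0..n."""
    return comb(n, z) * p**z * (1 - p)**(n - z)

def binomial_moments(n, p):
    """Compute the mean and standard deviation directly from the pdf."""
    mean = sum(z * binomial_pdf(z, n, p) for z in range(n + 1))
    var = sum((z - mean)**2 * binomial_pdf(z, n, p) for z in range(n + 1))
    return mean, sqrt(var)
```

For n = 10, p = 0.7 this returns a mean of 7.0 = np and a standard deviation of sqrt(10·0.7·0.3), matching the closed forms.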


Graph of pdf and Mass Function F(x)
• The binomial distribution with n = 10 and p = 0.7 appears as follows: (mean = np = 7)
[Graph: binomial pdf for n = 10, p = 0.7]
• Its corresponding mass function F(z) is given by
[Graph: the step function F(z)]


Sampling Function

[Graph: rsample (0 to 10) vs. x (0 to 1), a step function]

• A typical sampling tactic is to accumulate the sum
  sum over z = 0..rsample of p(z)
increasing rsample until the sum's value exceeds the random value between 0 and 1 drawn for x
  – The final rsample summation limit is the sample value
  – In contrast to a continuous pdf described by some formula, the function for a finite discrete pdf has to be given in its relational form by a table of pairs, which in turn mandates the kind of "search" algorithm approach used above to obtain rsample


III. Poisson Distribution
(values z = 0, 1, 2, ...)
• The pdf is given by
  p(z) = λ^z e^(-λ) / z!
• With a little work it can be shown that the Poisson distribution is the limiting case of the binomial distribution using p = λ/n → 0 as n → ∞
• The expected value E(X) = λ
• The standard deviation is sqrt(λ)


III. Poisson Distribution: history and use
• This distribution dates back to Poisson's 1837 text regarding civil and criminal matters, in effect scotching the tale that its first use was for modeling deaths from the kicks of horses in the Prussian army
• In addition to modeling the number of arrivals over some interval of time, the distribution has also been used to model the number of defects on a manufactured article
  – Recall the relationship to the exponential distribution; a Poisson process has exponentially distributed interarrival times
• In general the Poisson distribution is used for situations where the probability of an event occurring is very small, but the number of trials is very large (so the event is expected to actually occur a few times)
  – Less cumbersome than the binomial distribution
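The limiting relationship between the binomial and Poisson distributions can be checked numerically; the function names below are illustrative:

```python
from math import comb, exp, factorial

def poisson_pdf(z, lam):
    """Poisson pdf: p(z) = lam^z * e^(-lam) / z!"""
    return lam**z * exp(-lam) / factorial(z)

def binomial_pdf(z, n, p):
    """Binomial pdf: p(z) = C(n,z) p^z (1-p)^(n-z)."""
    return comb(n, z) * p**z * (1 - p)**(n - z)

def max_difference(lam, n, z_max=20):
    """Largest gap between binomial(n, lam/n) and Poisson(lam) over z = 0..z_max."""
    return max(abs(binomial_pdf(z, n, lam / n) - poisson_pdf(z, lam))
               for z in range(z_max + 1))
```

As n grows with p = λ/n held to the same mean, the gap shrinks toward zero, illustrating why the Poisson is the less cumbersome model when p is tiny and n is huge.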


Graph of pdf and Sampling Function
• With λ = 2
[Left: graph of p(z) for z = 0, 1, 2, ..., 10, ..., peaking near the mean λ = 2; right: sampling function, rsample (0 to 9) vs. x (0 to 1)]


IV. Geometric Distribution
• The geometric distribution gets its name from the geometric series: for |r| < 1,
  sum over n = 0..∞ of r^n = 1/(1-r)
  sum over n = 0..∞ of n·r^n = r/(1-r)^2
  sum over n = 0..∞ of (n+1)·r^n = 1/(1-r)^2
  (various flavors of the geometric series)
• The pdf for the geometric distribution is given by
  p(z) = (1-p)^(z-1) · p  for z = 1, 2, ...
         0                otherwise
• The geometric distribution is the discrete analog of the exponential distribution
  – Like the exponential distribution, it is "memoryless"; i.e., P(X > a+b | X > a) = P(X > b)
  – The geometric distribution is the only discrete distribution with this property, just as the exponential distribution is the only continuous one behaving in this manner


Memoryless Property
• Being memoryless is characterized by P(X > a+b | X > a) = P(X > b)
• The interpretation is that if we haven't had an arrival in "a" seconds, the probability of an arrival in the next "b" seconds is the same as if the "a" seconds had not elapsed
  – Only the geometric and exponential distributions have this property
• This is NOT a property associated with independent events, since among other things there are three events involved: X > a+b, X > a, and X > b
  – Recall that A and B are independent if P(A ∩ B) = P(A)P(B)
  – So long as P(B) is non-zero, P(A | B) = P(A ∩ B)/P(B)
    • In effect, if A and B are independent events and B has already occurred, then by this equation P(A) is unchanged; i.e., as one would expect, P(A|B) = P(A) when A and B are independent
  – The events X > a+b and X > a are certainly NOT independent, since independence would imply P(X > a+b | X > a) = P(X > a+b) – NOT!


Expected Value and Standard Deviation
• The expected value is given by
  E(X) = sum over z = 1..∞ of z·(1-p)^(z-1)·p = p · 1/(1 - (1-p))^2 = 1/p
  – by applying the 3rd form of the geometric series
• The standard deviation is given by
  sqrt(1-p) / p


Graph
• A plot of the geometric distribution with p = 0.3 is given by
[Graph: p(z) for z = 1, ..., 10, with p(1) = 0.3 the maximum and the mean marked]


Utilization
• A typical use of the geometric distribution is for modeling the number of trials up to and including the first success in a sequence of independent Bernoulli trials
• This is the scenario for making sales
  – Suppose that the probability of making a sale is 0.3
  – Then
    • p(1) = 0.3 is the probability of success on the 1st try
    • p(2) = (1-p)·p = 0.7·0.3 = 0.21, which is the probability of failing on the 1st try (with probability 1-p) and succeeding on the 2nd (with probability p)
    • p(3) = (1-p)(1-p)·p = 0.7·0.7·0.3 = 0.147 is the probability that the sale takes 3 tries, and so forth
  – A random sample from the distribution represents the number of attempts needed to make the sale
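The sales scenario can be simulated directly by running Bernoulli trials until the first success; the function names are illustrative:

```python
import random

def geometric_pdf(z, p):
    """p(z) = (1-p)^(z-1) * p: probability the first success is on trial z."""
    return (1 - p)**(z - 1) * p

def geometric_sample(p, rng=random):
    """Count independent Bernoulli trials until the first success (the sale)."""
    trials = 1
    while rng.random() >= p:  # a failure occurs with probability 1-p
        trials += 1
    return trials
```

With p = 0.3 this reproduces the slide's values p(1) = 0.3, p(2) = 0.21, p(3) = 0.147, and the long-run average of many samples approaches the expected value 1/p ≈ 3.33 attempts per sale.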
