241A-Probability, Statistics and Econometrics

241A-Probability, Statistics and Econometrics Paulina Oliva

Department of Economics University of California, Santa Barbara

Part 1: Probability theory

1. Introduction
2. Probability
   2.1 Sets
   2.2 Probability
   2.3 Conditional probability and independence
   2.4 Random Variables
3. Common Distributions
4. Transformations and Expectations
5. Multiple random variables

Random variables A random variable is a function from a sample space S into the real numbers. 1. Number of green shirts on a given day in a class of five people: S = {(g , o, o, o, o), (g , g , o, o, o), (g , g , g , o, o), (g , g , g , g , o), (g , g , g , g , g ), (o, g , o, o, o), ...}; the range of the random variable X , associated with this example would be X = {0, 1, 2, 3, 4, 5}

2. Number of heads in three coin tosses.

S = {(HHH), (HHT ), (HTH), (THH), (TTH), (THT ), (HTT ), (TTT )}; X = {0, 1, 2, 3}

3. Give a lottery prize to a random individual at time t, and denote by X the share of the prize spent at time t. X = [0, 1]

Probability function Given a random variable X defined on a sample space S, the probability function PX on X, when X is countable, is defined in the following way: PX (X = xi) = P({sj ∈ S : X(sj) = xi})

Example (Three coin tosses for a dollar): Define X as the number of dollars someone earns in a game that consists of three coin tosses, where heads gives you one dollar and tails takes away one dollar. In this game, the outcome space is just as in example number 2 from the previous slide. The space of the random variable X is X = {−3, −1, 1, 3}, and

x            −3     −1      1      3
PX (X = x)   1/8    3/8    3/8    1/8
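The pmf in the table can be recovered directly by enumerating the sample space, just as in the definition of the probability function above. A small sketch (the helper names are mine, not from the slides):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Enumerate the 8 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

def earnings(outcome):
    # +1 dollar for each head, -1 dollar for each tail
    return sum(1 if toss == "H" else -1 for toss in outcome)

counts = Counter(earnings(o) for o in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in counts.items()}
# pmf matches the table: P(-3) = P(3) = 1/8, P(-1) = P(1) = 3/8
```

Each outcome has probability 1/8, so PX(X = x) is just the count of outcomes mapping to x, divided by 8.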

Probability function (contd) Sometimes, however, a complete listing will not be possible, but we can find a function that maps X into [0, 1]. Example 1.5.3 Tossing a coin until a head appears. S = {H, TH, TTH, TTTH, TTTTH, ...}. Notice that the sample space is infinite. The random variable X is defined as the number of tosses we need to get a head, so in this case X = 1, 2, 3, 4, .... What is the probability of getting the first head on the third toss? PX (X = 3) = (1 − p) × (1 − p) × p,

where p is the probability of getting a head on a given coin toss. More generally,

PX (X = x) = (1 − p)^(x−1) p for x = 1, 2, ..., and PX (X = x) = 0 otherwise

Cumulative distribution function of a random variable

The cumulative distribution function or cdf of a random variable X, denoted FX (x), is defined by FX (x) = PX (X ≤ x), for all x.
Theorem 1.5.3 The function F(x) is a cdf if and only if the following three conditions hold:
a. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1
b. F(x) is a nondecreasing function of x.
c. F(x) is right-continuous; that is, for every number x0, lim_{x↓x0} F(x) = F(x0)

Example (Three coin tosses for a dollar)

From the example of the three coin tosses for a dollar, we can compute the cdf very easily from the probabilities of each outcome:

FX (x) = 0     if x < −3
         1/8   if −3 ≤ x < −1
         4/8   if −1 ≤ x < 1
         7/8   if 1 ≤ x < 3
         1     if 3 ≤ x

Example (Three coin tosses for a dollar) (contd) This is a step function of x.
[Figure: step-function cdf of the three-coin-toss game.]

Example 1.5.3 (Coin Toss)

Probability function:

PX (X = x) = (1 − p)^(x−1) p for x = 1, 2, ..., and PX (X = x) = 0 otherwise

What is the cdf?



Does it meet the properties of a cdf?
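Summing the pmf as a geometric series gives the closed form F(x) = 1 − (1 − p)^⌊x⌋ for x ≥ 1, and the properties of a cdf can then be checked numerically. A small sketch, assuming p = 1/2 (a fair coin; the value of p is an illustrative choice):

```python
import math

p = 0.5  # probability of a head on each toss (illustrative assumption)

def pmf(x):
    # P(X = x): x - 1 tails followed by one head
    return (1 - p) ** (x - 1) * p

def cdf(x):
    # F(x) = P(X <= x) = 1 - (1 - p)^floor(x) for x >= 1, and 0 before that
    return 0.0 if x < 1 else 1.0 - (1 - p) ** math.floor(x)

# F is nondecreasing, starts at 0, and tends to 1 as x grows
checks = [cdf(x / 2) for x in range(0, 40)]
assert all(a <= b for a, b in zip(checks, checks[1:]))
```

Note the cdf is a step function: it jumps by pmf(k) at each integer k and is flat in between, which is exactly the right-continuity condition (c) of Theorem 1.5.3.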

Discrete vs. Continuous

Notice that the cdf’s in both examples are step functions. A random variable X is continuous if FX (x) is a continuous function of x. A random variable is discrete if FX (x) is a step function of x.

Identically distributed random variables

The random variables X and Y are identically distributed if, for every set A ∈ B, P(X ∈ A) = P(Y ∈ A).

This does not mean that these two variables are equal.



E.g. Variable X is the number of heads in 10 coin tosses and variable Y is the number of even numbers in 10 rolls of a die. Assume that both the coin and the die are fair.

Identically distributed random variables (contd) The probability of heads in the case of the coin toss is 1/2; in the case of the die, the probability of an even number is P({2} ∪ {4} ∪ {6}) = 3 × 1/6 = 1/2. Since the probability of a “success” (either a head or an even number) is the same for both experiments, it is true for both random variables that

P(Y = a) = P(X = a) = (10 choose a) p^a (1 − p)^(10−a),

with p = 1/2. Hence the cdf is given by

FX (a) = FY (a) = Σ_{i=0}^{a} (10 choose i) p^i (1 − p)^(10−i),

where a = 0, 1, ..., 10 and it is the same for both variables: i.e. the variables are identically distributed. This distribution is called the binomial distribution, and its cdf is also a step function of a.
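This equality can be verified directly from the binomial formula. A minimal sketch (the `binom_pmf` helper is mine, not from the slides):

```python
import math

def binom_pmf(a, n, p):
    # (n choose a) p^a (1 - p)^(n - a)
    return math.comb(n, a) * p**a * (1 - p) ** (n - a)

p_coin = 1 / 2   # P(head) on a fair coin
p_die = 3 / 6    # P(even) on a fair die: outcomes {2, 4, 6}

pmf_X = [binom_pmf(a, 10, p_coin) for a in range(11)]  # heads in 10 tosses
pmf_Y = [binom_pmf(a, 10, p_die) for a in range(11)]   # evens in 10 rolls
assert pmf_X == pmf_Y  # identically distributed
```

Since p_coin and p_die are both exactly 1/2, the two pmfs agree term by term, which is all "identically distributed" requires.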

Probability mass function (pmf) The probability mass function (pmf) of a discrete random variable X is given by

fX (x) = P(X = x) for all x

There is no distinction between the probability mass function and P(X = x) for a discrete variable, such as the geometric distribution or the binomial distribution from the previous slide. Either the pmf or the cdf is enough to characterize the distribution of a variable. Hence, we could have said that the number of heads in 10 tosses and the number of even numbers in 10 die rolls are identically distributed just by noting that

fX (a) = fY (a) = (10 choose a) p^a (1 − p)^(10−a)

where a = 0, 1, ..., 10

Probability density function (pdf)

pdf is the analogous concept to pmf but for continuous variables



Recall that a continuous random variable is such that its cdf is continuous.



In the continuous case the outcomes associated with the experiment cannot be listed.



An example of a continuous random variable is the cubic centimeters of rain in a certain area.



An example of a continuous cdf is

FX (x) = 1 / (1 + e^(−x))

Does it meet all conditions from Theorem 1.5.3?
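The three conditions of Theorem 1.5.3 can be checked numerically for this logistic cdf. A quick sketch (the grid and tolerances are arbitrary choices of mine):

```python
import math

def F(x):
    # the logistic cdf from the slide
    return 1.0 / (1.0 + math.exp(-x))

# (a) limits: F(x) -> 0 as x -> -infinity and F(x) -> 1 as x -> +infinity
assert F(-50) < 1e-12 and 1 - F(50) < 1e-12
# (b) nondecreasing: check on a grid of points
xs = [x / 10 for x in range(-100, 101)]
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))
# (c) right-continuity holds because F is continuous everywhere
```

Since FX is continuous (not a step function), the random variable it describes is continuous.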

pdf (contd)

In the continuous case it no longer holds that the pdf, denoted fX (x), is equal to the probability of x:

fX (x) ≠ P(X = x)

Why? In the continuous case:

P(X = x) = 0

pdf (contd) How do we define the pdf? The analogous concept to the sum in the continuous case is the integral: The probability density function or pdf, fX (x), of a continuous random variable X is the function that satisfies

FX (x) = ∫_{−∞}^{x} fX (t) dt for all x

If fX (x) is continuous, we can use the Fundamental Theorem of Calculus to claim that

d/dx FX (x) = fX (x)

Notice that, in contrast with P(X = x), fX (x) is positive for all values of x in the support set (why?), x ∈ X, and we can no longer interpret it as the probability of X = x.

pdf (contd)

What is the pdf of X, such that FX (x) = 1 / (1 + e^(−x))?

fX (x) = e^(−x) / (1 + e^(−x))^2
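This derivative can be sanity-checked against a central difference, which is just the Fundamental Theorem of Calculus relation from the previous slide in numerical form. A small sketch (the step size h and the test points are arbitrary choices):

```python
import math

def F(x):
    # logistic cdf
    return 1.0 / (1.0 + math.exp(-x))

def f(x):
    # claimed pdf: e^(-x) / (1 + e^(-x))^2
    return math.exp(-x) / (1.0 + math.exp(-x)) ** 2

h = 1e-6
for x in (-2.0, 0.0, 1.5):
    numeric = (F(x + h) - F(x - h)) / (2 * h)  # numerical dF/dx
    assert abs(numeric - f(x)) < 1e-6          # matches the claimed pdf
```

At x = 0 the density peaks at f(0) = 1/4, and f(x) > 0 everywhere, consistent with the remark that the pdf is positive on the whole support.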

pdf (contd)

Let's consider another simple example of a continuous distribution: the rectangular (or continuous uniform). The continuous uniform gives the same density to each value of X in the interval [a, b]. Hence,

f(x) = 1 / (b − a) for a ≤ x ≤ b,

with f(x) = 0 elsewhere.

pdf (contd)
[Figure: pdf of the continuous uniform distribution on [a, b].]

pdf (contd)

What is the cdf of the continuous uniform distribution?

FX (x) = ∫_{a}^{x} 1 / (b − a) dt = (x − a) / (b − a) for a ≤ x ≤ b
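The closed form above can be double-checked with a crude Riemann sum of the pdf. A sketch, with a = 2 and b = 6 as illustrative endpoints (not values from the slides):

```python
a, b = 2.0, 6.0  # illustrative interval endpoints

def pdf(t):
    # uniform density on [a, b]
    return 1.0 / (b - a) if a <= t <= b else 0.0

def cdf(x):
    # closed form from the slide, clipped to [0, 1] outside [a, b]
    return min(max((x - a) / (b - a), 0.0), 1.0)

# midpoint Riemann-sum approximation of the integral of the pdf up to x = 4
n = 100_000
x = 4.0
approx = sum(pdf(a + (x - a) * (i + 0.5) / n) * (x - a) / n for i in range(n))
# approx agrees with cdf(4.0) = 0.5
```

At the midpoint of [2, 6] both the sum and the formula give 1/2, as expected for a uniform variable.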

Properties of pdf

Theorem 1.6.5 A function fX (x) is a pdf (or pmf) of a random variable X if and only if
a. fX (x) ≥ 0 for all x
b. Σ_x fX (x) = 1 (pmf) or ∫_{−∞}^{∞} fX (x) dx = 1 (pdf)

Properties of pdf
a. fX (x) ≥ 0 for all x

This follows from the second property of a cdf: F(x) is a nondecreasing function of x. That property and the definition of f(x) imply (a). The converse is also true (a non-negative derivative produces a function that is nondecreasing in x).

b. Σ_x fX (x) = 1 (pmf) or ∫_{−∞}^{∞} fX (x) dx = 1 (pdf)

This conclusion follows from the first property of cdf's and the definition of a pdf. Since lim_{x→∞} FX (x) = 1 and FX (x) = ∫_{−∞}^{x} fX (t) dt, we can take limits on both sides and get the implication from cdf to pdf. The reverse implication is also quite easy to prove.

Example: Becker Model for Discrimination
Utility of an employer:

U = Π − d

where Π are monetary profits



d ∈ [0, 100] is the “disutility” or distaste for minority workers of this employer

U = pG(NA + NB) − wA NA − wB NB − dNB



where NA and NB are the number of non-minority and minority workers (e.g. white and black; male and female; white and hispanic; etc.)



G (.) is a production function that only depends on labor (all workers have the same productivity)



wA is the market wage for non-minority workers and wB is the market wage for minority workers

Example: Becker Model for Discrimination

max_{NA, NB} U = pG(NA + NB) − wA NA − wB NB − dNB

First-order conditions:

pG′(NA + NB) = wA
pG′(NA + NB) − d = wB

What would this employer pay for one more minority worker?

Example: Becker Model for Discrimination
[Figure: pdf of d across employers.]

Example: Becker Model for Discrimination



Denote the proportion of minority workers as Pr(B)



What is the proportion of employers with taste preference less than d ?


Example: Becker Model for Discrimination

wA = wB

[Figure.]

Example: Becker Model for Discrimination

wA − wB = d∗

[Figure.]
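The hiring rule implied by the first-order conditions can be illustrated with a small numerical sketch. Everything below is an assumption for illustration, not part of the slides: d is taken to be Uniform[0, 100] across employers, so the share of employers willing to employ minority workers at a given wage gap is the uniform cdf evaluated at wA − wB.

```python
D_MAX = 100.0  # upper bound of the distaste distribution (assumed Uniform[0, 100])

def share_willing(w_gap):
    # From the FOCs, an employer with distaste d hires minority workers only
    # if the wage gap compensates the distaste: d <= wA - wB. With
    # d ~ Uniform[0, D_MAX], the share of such employers is the uniform cdf.
    return min(max(w_gap / D_MAX, 0.0), 1.0)

print(share_willing(0.0))    # wA = wB: only employers with d = 0 hire
print(share_willing(25.0))   # a gap of 25 attracts a quarter of employers
print(share_willing(150.0))  # the gap exceeds every employer's distaste
```

This mirrors the two cases on the slides: at wA = wB essentially no employer with d > 0 hires minority workers, and the marginal employer is pinned down by wA − wB = d∗.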

Part 1: Probability

1. Introduction
2. Probability
3. Common Distributions
   3.1 Notation
   3.2 Discrete
   3.3 Continuous
4. Transformations and Expectations
5. Multiple random variables

Distributions



In the previous chapter we derived some pmf’s, pdf’s and cdf’s. In these cases the characteristics of the experiment allowed us to determine the exact functional form of those distributions.



Sometimes, exact knowledge of the distribution's functional form is not available. E.g. what is the distribution of the change in consumption from t0 to t1 resulting from a randomly assigned cash transfer?



Many unknown distributions of random variables like the one above are well approximated by common distributional forms.

Distribution Notation Most distribution functions depend on parameters that determine some aspects of their shape. I will denote by fX (x|θ) the pmf or pdf for random variable X given the parameter values θ. Similarly, the corresponding cdf will be denoted FX (x|θ). Whenever the random variable we refer to is unambiguous, we will omit the subscript notation and write f(x|θ) and F(x|θ).

Discrete distributions The Bernoulli pmf with parameter p, where 0 ≤ p ≤ 1:

f(x|p) = p^x (1 − p)^(1−x) for x = 0, 1

The distribution of a random variable that takes the value of one if a fair coin toss results in a head would be:

f(x|p = 1/2) = (1/2)^x (1 − 1/2)^(1−x) = 1/2 for x = 0, 1,

assuming the probability of a head is 1/2. In an experiment where x = 1 if an adult picked at random from the population of U.S. adults this month is unemployed, then p would be equal to the population's rate of unemployment.

Discrete Uniform A Discrete Uniform pmf with parameter N, where N is a positive integer, is

f(x|N) = 1/N for x = 1, 2, ..., N

and f(x|N) = 0 elsewhere.









Binomial A binomial distribution with parameters n, p, where n is a positive integer and 0 ≤ p ≤ 1, is given by

f(x|n, p) = n! / (x!(n − x)!) p^x (1 − p)^(n−x) = (n choose x) p^x (1 − p)^(n−x) for x = 0, 1, ..., n

and f(x|n, p) = 0 elsewhere. This probability function was exemplified in the previous lecture with the number of even numbers in 10 rolls of a die. In that case, n = 10 and p = 1/2. If we instead defined the random variable x as the number of fours in 10 rolls of a fair die, then n = 10 and p = 1/6.
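As a quick check on the formula, a sketch computing the pmf for the number of fours in 10 rolls of a fair die (the `binom_pmf` helper name is mine):

```python
import math

def binom_pmf(x, n, p):
    # n!/(x!(n - x)!) * p^x * (1 - p)^(n - x)
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# number of fours in 10 rolls of a fair die: n = 10, p = 1/6
probs = [binom_pmf(x, 10, 1 / 6) for x in range(11)]
assert abs(sum(probs) - 1) < 1e-12            # the pmf sums to one
mode = max(range(11), key=probs.__getitem__)  # the most likely count of fours
```

The mode works out to 1, matching the intuition that with p = 1/6 one expects roughly 10/6 ≈ 1.7 fours, and 1 is the most probable single count.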

Example of Binomial

E.g. If the probability of getting one job offer is the same for every month, what is the probability that a person receives at least one job offer in a period of 6 months?
n = 6
p = probability of a job offer in a given month
X = number of job offers
P(X ≥ 1) = 1 − P(X = 0) = 1 − (1 − p)^6
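The complement rule gives the answer directly. A sketch, with p = 0.2 as a purely illustrative value (the slide leaves p unspecified):

```python
p = 0.2  # assumed monthly offer probability, for illustration only
n = 6    # months

# P(at least one offer) = 1 - P(no offers in n months) = 1 - (1 - p)^n
p_at_least_one = 1 - (1 - p) ** n
print(round(p_at_least_one, 4))  # 0.7379
```

Even with a modest 20% monthly chance, the probability of at least one offer over six months is about 74%, which is why the complement trick is so much easier than summing P(X = 1) through P(X = 6).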
