A Likelihood Approach to Estimating Market Equilibrium Models

Michaela Draganska Stanford University Graduate School of Business Stanford, CA 94305-5015 draganska [email protected] Dipak Jain Kellogg School of Management Northwestern University Evanston, IL 60208-2001 [email protected]


Abstract This paper develops a new likelihood-based method for the simultaneous estimation of structural demand-and-supply models for markets with differentiated products. We specify an individual-level discrete choice model of demand and derive the supply side assuming manufacturers compete in prices. The proposed estimation method accounts for price endogeneity through the simultaneous estimation of demand and supply, allows for consumer heterogeneity, and incorporates a pricing rule consistent with economic theory. The basic idea behind the proposed estimation procedure is to simulate prices and choice probabilities by solving for the market equilibrium. By repeating this many times, we obtain an empirical distribution of equilibrium prices and probabilities. This empirical distribution is then smoothed and used in a likelihood procedure to estimate the parameters of the model. The advantage of this method is that it avoids the need to perform a transformation of variables. If consumers' tastes are independent across market periods, our approach yields maximum-likelihood estimates; otherwise it yields consistent but not fully efficient partial likelihood estimates.

Key Words: price endogeneity, competitive strategy, maximum likelihood.

1 Introduction

In recent years marketers have become increasingly interested in estimating structural market equilibrium models, where demand is derived from utility maximization on the part of consumers, and the supply side is obtained by assuming that firms maximize profits given the characteristics of the market. Because the competitive environment (i.e., market structure) and the policy variables (i.e., the marketing mix) are specified explicitly, we can identify separate demand, cost, and competitive effects. Estimating a market equilibrium model enables us to analyze questions pertaining to firms' strategies in the marketplace through "what-if" type analyses that take into account all interdependencies between the demand and supply sides of the market.

The simultaneous estimation of demand and supply is also motivated by the so-called endogeneity problem. In short, endogeneity arises because marketing variables not only affect consumer choice, but consumer choice in turn affects marketing mix decisions. It has been well documented that ignoring endogeneity leads to biased coefficient estimates of the marketing mix variables and therefore to suboptimal decisions (Besanko, Gupta and Jain 1998, Villas-Boas and Winer 1999). It is often argued that the use of individual-level data solves the endogeneity problem, since individuals are price takers. However, even though price is exogenous in a microeconomic sense, there still might be important correlations between the price and the error term in the demand equation, thus leading to econometric endogeneity (Kennan 1989). Product attributes that are unobservable to the researcher, such as coupon availability, national advertising, and shelf space allocation, have an impact on consumer utility as well as on the price-setting decisions of firms (Villas-Boas and Winer 1999, Besanko, Dubé and Gupta 2003). Prices should thus be viewed as endogenous independent of the aggregation level of the data used in the analysis.

In this research, we focus on developing a new likelihood-based method for the estimation of structural demand-and-supply models. Our demand model falls into the broad class of discrete choice models of markets for differentiated products (Anderson, de Palma and Thisse 1992). The supply model is derived from the profit maximization behavior of the firms, assuming Bertrand-Nash competition in prices between manufacturers. Market equilibrium is determined jointly by the demand and supply specifications, and our estimation procedure accordingly considers the equilibrium equations simultaneously.

Once the presence of unobserved product attributes is acknowledged, it is no longer possible to estimate a discrete choice model using traditional maximum likelihood methods, because prices will then be correlated with the unobservables due to the strategic price-setting behavior of firms (Berry 1994). Choice probabilities therefore depend on the unobserved product attributes not only directly but also indirectly via prices. Hence, one cannot integrate the unobserved product attributes out of the choice probabilities without taking this latter dependency into account. Berry (1994) proposed a technique for the estimation of discrete choice models using instrumental variables to account for the endogeneity of prices. His approach is easy to implement and has been widely applied to the analysis of aggregate data (Berry, Levinsohn and Pakes 1995, Besanko et al. 1998, Nevo 2001).

Marketing researchers, however, have long recognized the advantages of data describing the purchase behavior of individual consumers. Such disaggregate scanner panel data provide detailed information that can be used to learn about consumers' preferences; for example, they enable us to understand the source of behaviors such as variety seeking or deal proneness. Given the richness of scanner panel data, a large literature has evolved that uses them to estimate discrete choice models of consumer behavior (Guadagni and Little 1983, Kamakura and Russell 1989, Chintagunta, Jain and Vilcassim 1991, Gönül and Srinivasan 1993, Fader and Hardie 1996). These models have focused on estimating the demand side and have not considered the possible presence of endogeneity. Recently, Goolsbee and Petrin (2003) and Chintagunta, Dubé and Goh (2003) apply variants of Berry's (1994) method to estimate consumer demand using individual-level choice data. These approaches are useful when the main interest lies in obtaining precise demand-side estimates, because they provide a way to account for price endogeneity without making assumptions about supply-side behavior. If conducting policy experiments is our goal, however, then estimating an equilibrium model is preferable, since it enables us to take advantage of the cross-equation dependencies of the structural parameters.

An equilibrium model provides a mapping from unobserved product attributes and cost shocks to market outcomes, i.e., prices and choice probabilities. Extending traditional MLE methods to include a supply side in addition to a consumer choice model is not straightforward, because it requires that the researcher be able to write down the joint distribution of these equilibrium outcomes. Assuming that this distribution is known runs counter to the notion of an equilibrium (Berry 1994). Hence, the joint distribution of these equilibrium outcomes needs to be derived from the distribution of the unobservables. Performing this transformation of variables proves to be very difficult due to the highly nonlinear nature of the model. Villas-Boas and Winer (1999) circumvent this problem by estimating a reduced-form pricing rule that relates current prices to lagged prices. In a subsequent article, Villas-Boas and Zhao (2001) specify a structural supply-side model derived from manufacturers' and retailers' optimization problems and estimate the equilibrium model directly using maximum likelihood. This direct approach to estimating the Jacobian, however, prevents them from incorporating consumer heterogeneity. Recently, Yang, Chen and Allenby (2003) have proposed a Bayesian approach to resolve the issue.

In this article, we propose a likelihood-based approach to the estimation of a structural demand-and-supply model using individual-level choice data. The basic idea behind the proposed estimation procedure is to simulate prices and probabilities by randomly drawing the shocks from an assumed joint distribution and then solving for the equilibrium. By repeating this many times, we obtain an empirical distribution of equilibrium prices and probabilities. The empirical distribution is then smoothed and used in a maximum-likelihood procedure to estimate the parameters of the model. The advantage of this method is that it avoids the need to perform a transformation of variables and thus enables us to estimate the model when the evaluation of the Jacobian seems infeasible.

In computing the likelihood of the data, we treat market periods as independent from each other. This implicitly assumes that there is no persistence in the preferences of consumers across market periods.2 If markets are geographical regions rather than time periods, then this assumption is warranted. Furthermore, the psychology literature suggests that consumers' preferences change over time, often depending on contextual effects that are unobserved by the econometrician (Petty and Cacioppo 1986, Burnkrant and Unnava 1995). To the extent that this leads to independence over time, our procedure yields maximum-likelihood estimates of the model parameters. If, on the other hand, there is correlation in consumers' preferences over time, then our procedure yields a so-called partial likelihood (Wooldridge 2002), and the resulting estimates are consistent but not fully efficient.

The remainder of the paper is organized as follows. Section 2 develops the equilibrium model. In Section 3, the estimation procedure is described along with the details of the implementation. In Section 4, we apply the estimation method to two frequently purchased consumer products, yogurt and laundry detergent. We demonstrate the accuracy of the proposed procedure in a Monte Carlo study presented in Section 5. In Section 6 we conclude with a summary and directions for future research.

2 Model Formulation

2.1 Demand Specification

Brands are indexed by j = 0, . . . , J, and market periods by t = 1, . . . , T. Let household types be indexed by n = 1, . . . , N, where a type denotes a set of households with identical demographic characteristics. There are mn individuals of type n. To capture unobserved consumer heterogeneity, we use a latent class approach and specify random coefficients with an L-point distribution (Kamakura and Russell 1989).3 This specification is appealing in terms of interpretability for marketing purposes and has been applied in both the economics and marketing literature (Berry, Carnall and Spiller 1997, Besanko et al. 2003). Let the latent market segments be indexed by l = 1, . . . , L. The share of segment l in the population is $\lambda_l \geq 0$, where $\sum_{l=1}^{L} \lambda_l = 1$.

Consumer behavior is governed by the following utility function:

$$u_{n0t} = \varepsilon_{n0t}, \qquad u_{njt} = x_{njt}\beta_l - \alpha_l p_{jt} + \xi_{jt} + \varepsilon_{njt},$$

where {εn0t, . . . , εnJt} are iid extreme value distributed, xnjt are observed characteristics of an alternative or decision-maker, and pjt denotes the price of alternative j in period t. βl and αl are the respective response parameters. We allow for household-specific variation in these response parameters to capture consumer heterogeneity. The demand shocks {ξ1t, . . . , ξJt} are common across consumers and represent product characteristics that are unobserved by the researcher but are taken into account by the firms in their pricing decisions. While some unobserved product characteristics, such as quality and brand image, can be captured through the inclusion of brand-specific constants, ξjt reflects time-varying factors like coupon availability, shelf space, and national advertising.

Brand 0 is the outside good (i.e., the no-purchase alternative). Including an outside good allows for category expansion effects of marketing actions. We assume that the outside good is non-strategic, i.e., its price is not set as a best response to the inside goods.

Utility maximization and the assumptions on the error term imply that the probability of household n purchasing brand j in market period t, Dnjt, is given by

$$D_{njt} = \sum_{l=1}^{L} \lambda_l D_{nljt} = \sum_{l=1}^{L} \lambda_l \frac{\exp(x_{njt}\beta_l - \alpha_l p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l p_{kt} + \xi_{kt})} \qquad (1)$$

and the probability of the outside good being chosen is

$$D_{n0t} = \sum_{l=1}^{L} \lambda_l D_{nl0t} = \sum_{l=1}^{L} \lambda_l \frac{1}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l p_{kt} + \xi_{kt})}. \qquad (2)$$
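To fix ideas, here is a minimal sketch of the demand side (our Python illustration, not the authors' implementation; all names are ours), evaluating equations (1) and (2) for one household type:

```python
import numpy as np

def choice_probabilities(x, p, xi, beta, alpha, lam):
    """Latent-class logit probabilities for one household type in one period.

    x     : (J, K) observed attributes of the J inside goods
    p     : (J,)   prices
    xi    : (J,)   demand shocks (unobserved attributes)
    beta  : (L, K) segment-specific attribute coefficients
    alpha : (L,)   segment-specific price coefficients
    lam   : (L,)   segment shares, nonnegative and summing to one

    Returns (D, D0): the (J,) purchase probabilities of eq. (1) and the
    scalar no-purchase probability of eq. (2).
    """
    # Segment-level deterministic utilities: v[l, j] = x_j beta_l - alpha_l p_j + xi_j
    v = beta @ x.T - np.outer(alpha, p) + xi
    ev = np.exp(v)
    denom = 1.0 + ev.sum(axis=1, keepdims=True)   # outside good utility normalized to 0
    D_lj = ev / denom                             # within-segment logit shares
    D = lam @ D_lj                                # mix over segments, eq. (1)
    D0 = lam @ (1.0 / denom).ravel()              # eq. (2)
    return D, D0

# Example with two brands and two segments (all parameter values hypothetical)
x = np.array([[1.0, 0.0], [0.0, 1.0]])           # brand dummies as attributes
D, D0 = choice_probabilities(x, p=np.array([2.0, 1.8]), xi=np.zeros(2),
                             beta=np.array([[0.5, 0.2], [1.0, 0.8]]),
                             alpha=np.array([0.6, 1.6]), lam=np.array([0.5, 0.5]))
print(D, D0, D.sum() + D0)                       # probabilities sum to one
```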

2.2 Supply Specification

The supply side is characterized by Bertrand-Nash behavior on the part of oligopolistic firms (Berry et al. 1995, Besanko et al. 1998). We assume that retailers pass through the manufacturers' decisions, which is likely to hold for categories that neither have a strategic impact on store traffic nor are a primary driver of retailers' profits. Under this assumption we do not need to explicitly include a retailer in the supply-side model. The production function has constant returns to scale. Marginal costs for firm j in period t are denoted by cjt. In market period t firm j maximizes profits,

$$\max_{p_{jt}} \; \Pi_{jt} = (p_{jt} - c_{jt}) \sum_{n=1}^{N} m_n D_{njt}, \qquad (3)$$

where $\sum_{n=1}^{N} m_n D_{njt}$ is the expected demand for product j in period t. Expected demand is thus given by the weighted sum of the choice probabilities for all consumer types in the market. The first-order condition for this problem is given by

$$\sum_{n=1}^{N} m_n \frac{\partial D_{njt}}{\partial p_{jt}} (p_{jt} - c_{jt}) + \sum_{n=1}^{N} m_n D_{njt} = 0. \qquad (4)$$

Given our demand model, the above equation can be rewritten as

$$p_{jt} = c_{jt} + \frac{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l D_{nljt}}{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l \alpha_l D_{nljt} (1 - D_{nljt})}. \qquad (5)$$

We infer marginal cost from the data using the relationship

$$c_{jt} = w_t \gamma_j + \eta_{jt}, \qquad (6)$$

where wt are observable variables, e.g., input prices, and ηjt denotes cost characteristics that are unobserved by the researcher. Substituting (6) in (5) yields

$$p_{jt} = w_t \gamma_j + \frac{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l D_{nljt}}{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l \alpha_l D_{nljt} (1 - D_{nljt})} + \eta_{jt}, \qquad j = 1, \dots, J. \qquad (7)$$

2.3 Market Equilibrium

Considering the demand equations (1) and supply equations (7) jointly, the market equilibrium is defined by

$$D_{njt} = \sum_{l=1}^{L} \lambda_l \frac{\exp(x_{njt}\beta_l - \alpha_l p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l p_{kt} + \xi_{kt})}, \qquad j = 1, \dots, J, \; n = 1, \dots, N, \qquad (8)$$

$$p_{jt} = w_t \gamma_j + \frac{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l D_{nljt}}{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l \alpha_l D_{nljt} (1 - D_{nljt})} + \eta_{jt}, \qquad j = 1, \dots, J. \qquad (9)$$

In equilibrium, prices and probabilities depend on both {ξjt}j and {ηjt}j. Hence, estimating the equations separately leads to a simultaneity bias: in both equations (8) and (9), the explanatory variables are correlated with the unobserved errors. Consider equation (9) and suppose that firm j faces a high cost shock ηjt in period t. This will lead the firm to charge a higher price pjt, which in turn decreases the probability that it will be chosen by consumers of type n, that is, Dnjt decreases. Consequently, the regressor

$$\frac{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l D_{nljt}}{\sum_{n=1}^{N} m_n \sum_{l=1}^{L} \lambda_l \alpha_l D_{nljt} (1 - D_{nljt})}$$

changes with the error ηjt, and this correlation leads to biased estimates for α. A joint estimation of (8) and (9) accounts for this possible correlation and thereby leads to a valid estimate of the price coefficient αl.

3 Estimation Procedure

In this section we develop a maximum likelihood-based procedure to obtain estimates of the structural parameters. The model can be written in the general form of a response function, where the endogenous variables are expressed as a function of the exogenous variables. That is, for each market period t, if the equilibrium is unique, we have

$$[\{D_{njt}\}_{n,j}, \{p_{jt}\}_j] = f[\{x_{njt}\}_{n,j}, \{w_{jt}\}_j, \{\xi_{jt}\}_j, \{\eta_{jt}\}_j, \theta], \qquad (10)$$

where $\theta = (\{\alpha_l\}_l, \{\beta_l\}_l, \{\lambda_l\}_l, \{\gamma_j\}_j, \{\sigma_{\xi_j}\}_j, \{\sigma_{\eta_j}\}_j)$ is the vector of parameters to be estimated.

As noted previously, the likelihood function of such an equilibrium model is in general intractable. Consider the equilibrium model as defined by equations (8) and (9), where we set N = 1 to simplify exposition. For given values of the exogenous variables, the joint distribution of the demand and supply shocks {ξjt}j and {ηjt}j induces a distribution of the equilibrium prices {pjt}j and probabilities {Djt}j. The difficulty in writing down the likelihood function stems from the fact that this induced distribution of prices and probabilities is hard to obtain directly through a transformation-of-variables approach. To compute the transformation of the demand and supply shocks, one would need to solve the system of equilibrium equations for {ξjt}j and {ηjt}j and then derive the Jacobian of this inverse transformation. That is, there needs to exist a set of J functions {uj(·)}j that map prices and probabilities into the ξj's and another set of J functions {vj(·)}j that map prices and probabilities into the ηj's. Let h be the pdf of ({ξjt}j, {ηjt}j). To obtain the pdf g of ({pjt}j, {Djt}j), the transformation of variables is

$$g(\{p_{jt}\}_j, \{D_{jt}\}_j) = h\bigl(u_1(\{p_{jt}\}_j, \{D_{jt}\}_j), \dots, u_J(\{p_{jt}\}_j, \{D_{jt}\}_j), v_1(\{p_{jt}\}_j, \{D_{jt}\}_j), \dots, v_J(\{p_{jt}\}_j, \{D_{jt}\}_j)\bigr) \cdot |\mathrm{Jac}|,$$

where Jac is the (2J × 2J) Jacobian of the inverse transformation

$$\mathrm{Jac} = \begin{pmatrix} \frac{\partial u_1}{\partial p_{1t}} & \cdots & \frac{\partial u_1}{\partial p_{Jt}} & \frac{\partial u_1}{\partial D_{1t}} & \cdots & \frac{\partial u_1}{\partial D_{Jt}} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial u_J}{\partial p_{1t}} & \cdots & \frac{\partial u_J}{\partial p_{Jt}} & \frac{\partial u_J}{\partial D_{1t}} & \cdots & \frac{\partial u_J}{\partial D_{Jt}} \\ \frac{\partial v_1}{\partial p_{1t}} & \cdots & \frac{\partial v_1}{\partial p_{Jt}} & \frac{\partial v_1}{\partial D_{1t}} & \cdots & \frac{\partial v_1}{\partial D_{Jt}} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial v_J}{\partial p_{1t}} & \cdots & \frac{\partial v_J}{\partial p_{Jt}} & \frac{\partial v_J}{\partial D_{1t}} & \cdots & \frac{\partial v_J}{\partial D_{Jt}} \end{pmatrix}.$$

The problem is that the equilibrium equations generally cannot be solved to obtain the inverse transformations {uj(·)}j and {vj(·)}j. Moreover, even if we could obtain {uj(·)}j and {vj(·)}j, e.g., using numerical methods, we would still have to compute the Jacobian of this (unknown!) inverse transformation. Due to the highly nonlinear model specification, this is a daunting task. In a recent article, Yang et al. (2003) propose a Bayesian approach to estimating market equilibrium models. While the authors simplify the transformation of variables considerably by transforming the supply shocks into prices conditional on demand shocks, the computation of the Jacobian still needs to be done using numerical methods.4

Our approach is different. We avoid performing the transformation of variables altogether and instead obtain equilibrium prices and probabilities using simulation. Recall that, for given values of the exogenous variables, the joint distribution of demand and supply shocks induces a distribution of prices and probabilities. We exploit this by numerically solving the model repeatedly for simulated demand and supply shocks. For each draw of the demand and supply shocks, we obtain the corresponding equilibrium prices and probabilities. Then we compute the joint distribution of prices and probabilities.

There is, however, no guarantee that the empirical distribution of prices and probabilities obtained through the simulation will be smooth, which is a property we need for the optimization. We therefore employ nonparametric techniques to estimate the joint density of prices and probabilities and then evaluate it at the actual data to obtain a smooth, well-behaved likelihood function. The parameter estimates are obtained by maximizing this likelihood function using an iterative optimization procedure.

Estimation Algorithm

Based on the previous discussion, there are three main components to the proposed estimation procedure: (i) simulation of equilibrium prices and probabilities,

(ii) estimation of the joint density of prices and probabilities in order to smooth the likelihood function, and (iii) maximization of the loglikelihood to obtain the parameters of the model. The estimation algorithm proceeds as follows. Let s = 1, . . . , S index the simulations per time period t.

Step 1: Draw {ξjt}j,t and {ηjt}j,t S times.

Step 2: Choose a starting value for θ.

Step 3: Set s = 1.

Step 4: Set t = 1.

Step 5: Using the sth draw, solve

$$\hat p_{jt} = w_t \gamma_j + \frac{\displaystyle \sum_{n=1}^{N} m_n \sum_{l} \lambda_l \frac{\exp(x_{njt}\beta_l - \alpha_l \hat p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l \hat p_{kt} + \xi_{kt})}}{\displaystyle \sum_{n=1}^{N} m_n \sum_{l} \lambda_l \alpha_l \frac{\exp(x_{njt}\beta_l - \alpha_l \hat p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l \hat p_{kt} + \xi_{kt})} \left\{ 1 - \frac{\exp(x_{njt}\beta_l - \alpha_l \hat p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l \hat p_{kt} + \xi_{kt})} \right\}} + \eta_{jt} \qquad (11)$$

to obtain $\{\hat p_{jt}\}_j$. Using $\{\hat p_{jt}\}_j$, calculate

$$\hat D_{njt} = \sum_{l} \lambda_l \frac{\exp(x_{njt}\beta_l - \alpha_l \hat p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{nkt}\beta_l - \alpha_l \hat p_{kt} + \xi_{kt})}.$$

Step 6: Increase s by 1. If s ≤ S, go back to step 5.

Step 7: Estimate the joint density of the calculated prices and probabilities, $\hat\varphi(\{\hat p_{jt}\}_j, \{\hat D_{njt}\}_{n,j})$, at the actual prices and probabilities to get the period t contribution to the loglikelihood.

Step 8: Increase t by 1. If t ≤ T, reset s = 1 and go back to step 5.

Step 9: Update θ to maximize the loglikelihood. If convergence is reached, terminate. Else go back to step 3.
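Before turning to the details, the loop in steps 3 through 8 can be rendered schematically as follows (an illustrative sketch in Python, not the authors' C++ implementation; the data layout and the helper names solve_equilibrium and kde_at_data are ours, standing in for the routines discussed in Sections 3.1 and 3.2):

```python
import numpy as np

def log_likelihood(theta, data, draws, solve_equilibrium, kde_at_data):
    """Schematic rendering of steps 3-8 for a single evaluation of theta.

    data  : per-period records with (hypothetical) fields exog, prices, probs
    draws : draws[t] holds the S shock vectors for period t, drawn once in
            step 1 and held fixed across theta evaluations (Section 3.1)
    solve_equilibrium : returns the stacked equilibrium prices and
            probabilities for one shock draw (Section 3.1)
    kde_at_data : smoothed joint density of the simulations, evaluated at
            the observed outcome (Section 3.2)
    """
    ll = 0.0
    for t, obs in enumerate(data):
        # Step 5, repeated S times for period t
        sims = np.array([solve_equilibrium(theta, obs.exog, e) for e in draws[t]])
        # Step 7: period-t contribution to the loglikelihood
        ll += np.log(kde_at_data(sims, np.concatenate([obs.prices, obs.probs])))
    return ll

# Step 9 wraps this function in a derivative-free simplex search over theta
# (Nelder-Mead; see Section 3.3).
```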

We now discuss the details of the estimation algorithm step by step.

3.1 Simulation of Equilibrium Prices and Probabilities

The first component of the estimation procedure is the simulation of equilibrium prices and probabilities. We assume that demand and supply shocks are normally distributed,

$$(\xi_{1t}, \dots, \xi_{Jt}, \eta_{1t}, \dots, \eta_{Jt}) \sim N\!\left(0, \operatorname{diag}(\sigma_{\xi_1}^2, \dots, \sigma_{\xi_J}^2, \sigma_{\eta_1}^2, \dots, \sigma_{\eta_J}^2)\right).$$

Further, we assume that {ξjt}j and {ηjt}j are independent across time and independent of {εnjt}n,j. Since all error terms are independent across time and all maximization problems are static, we can treat each period separately.

In step 1, we draw the errors only once and use them for all values of θ. McFadden (1989) shows in the context of method-of-moments estimation that using the same set of random draws to simulate the model at different trial parameter values helps to avoid "chattering" of the simulator, i.e., it ensures that the criterion function does not become discontinuous. Pakes and Pollard (1989) also note that the properties of simulation estimators, and the performance of the algorithms used to determine them, require the use of simulation draws that do not change as the optimization algorithm varies θ. We implement this step by drawing from a standard normal distribution and then multiplying the draws by the standard deviations σξ1, . . . , σξJ and ση1, . . . , σηJ.

Given the exogenous variables, a set of parameter values, and random shocks, we solve for the equilibrium prices and probabilities in step 5. Note that equation (11) is differentiable in {pjt}j, so we use a Newton-Raphson gradient procedure. In accordance with theory, we obtain a unique solution along with reasonable values for the equilibrium prices. For each period t, this step is repeated S times to generate an empirical distribution of equilibrium prices and probabilities for that period. The number of random draws (simulations) for the computation of the equilibrium points is to be determined by the user. The minimum number of simulations, however, is set by the requirements of the density estimation procedure as discussed below.
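To illustrate this step, the sketch below solves the pricing fixed point (11) for one period under the N = 1 simplification used in Section 3. It is our own rendering: where the paper applies a Newton-Raphson procedure, we hand the differentiable system to scipy.optimize.root as a stand-in, and the (M, J) layout of the cost coefficients γ is an assumption of the sketch:

```python
import numpy as np
from scipy.optimize import root

def solve_equilibrium(x, w, xi, eta, beta, alpha, lam, gamma):
    """Equilibrium prices and probabilities for one period, one household type.

    x: (J, K) attributes; w: (M,) cost shifters; xi, eta: (J,) shocks;
    beta: (L, K); alpha, lam: (L,); gamma: (M, J) cost coefficients (assumed layout).
    """
    c = w @ gamma + eta                               # marginal costs, eq. (6)

    def seg_probs(p):
        v = beta @ x.T - np.outer(alpha, p) + xi      # segment utilities
        ev = np.exp(v)
        return ev / (1.0 + ev.sum(axis=1, keepdims=True))   # D[l, j]

    def excess(p):
        D = seg_probs(p)
        markup = (lam @ D) / ((lam * alpha) @ (D * (1.0 - D)))
        return p - (c + markup)                       # zero at Bertrand-Nash prices

    p_hat = root(excess, x0=c + 1.0).x                # start at cost plus a unit markup
    return p_hat, lam @ seg_probs(p_hat)              # equilibrium prices and probabilities
```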

3.2 Estimation of the Joint Density of Prices and Probabilities

In principle, we could use the empirical distribution of the simulated equilibrium prices and probabilities to directly compute the likelihood of the data. However, since the empirical distribution is not smooth, this will in general not lead to a well-behaved likelihood function that is readily optimized. One possibility is to use kernel density estimation to smooth the simulated data points. In a nutshell, kernel smoothing involves weighted local averaging with the kernels as weights. Kernel estimators have two desirable properties: First, they are consistent5; second, by averaging over a neighborhood that shrinks at an appropriate rate, they achieve the optimal rate of convergence for nonparametric estimators (Stone 1980).

There are other nonparametric techniques that could be considered for smoothing purposes, such as splines and orthogonal (Fourier) series (Härdle 1990). While these estimators are similar in terms of computational intensity and are asymptotically equivalent, they may differ in their small-sample properties. However, to the best of our knowledge, there is no comprehensive Monte Carlo study or analytical result that shows better small-sample properties for any of these estimators. Therefore, the choice among them is largely a matter of taste.6

We employ a multivariate kernel density estimator with a multiplicative Gaussian kernel (Härdle 1990) to evaluate the joint density of the simulated equilibrium prices and probabilities at the actual data to obtain the contribution to the likelihood (step 7). Unlike prices, the probabilities are not directly observed, but they are nonparametrically estimable from the data. Formally, the estimated joint density of the calculated prices and probabilities at the actual data is given by

$$\hat\varphi(\{p_{jt}\}_j, \{D_{njt}\}_{n,j}) = \frac{1}{S} \sum_{s=1}^{S} \prod_{j=1}^{J} \frac{1}{h_{p_j}} K\!\left(\frac{p_{jt}^{s} - p_{jt}}{h_{p_j}}\right) \prod_{j=1}^{J} \prod_{n=1}^{N} \frac{1}{h_{D_{nj}}} K\!\left(\frac{D_{njt}^{s} - D_{njt}}{h_{D_{nj}}}\right), \qquad (12)$$

where s indexes simulations, $K(\cdot)$ is the Gaussian kernel defined as $K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right)$, and $h_{p_j}$ and $h_{D_{nj}}$ are smoothing parameters defined below.

One well-known problem of nonparametric density estimation is the so-called 'curse of dimensionality,' i.e., the explosion of the number of data points needed for the estimation of higher-dimensional densities. The minimum number of data points required for the estimation has been tabulated in Silverman (1986, Table 4.2). Because we simulate the equilibrium prices and probabilities, this is only a computational issue in the present case: we can simulate as many data points as needed by drawing more errors. For example, if there are two competing brands in the market, that is, if we want to estimate a four-dimensional density (two prices and two choice probabilities), we need 223 data points. For three brands, the required sample size is 2790, and for four brands 43700 observations are needed.

Nonparametric density estimation requires the choice of a smoothing parameter, the bandwidth, which governs the degree of smoothness of the density estimate. In essence, the bandwidth determines how much averaging we want to do around a given point. Naturally, the larger the bandwidth, the smoother the function. However, with increasing bandwidth, the estimated density may be far from the true underlying density. If the bandwidth is chosen too small, however, the obtained density would have a 'rough' surface and would not be as easy to use for optimization. Therefore, the choice of bandwidth is critical.7

We are only interested in the density at the point of the actual data, so we have to choose the bandwidth to be locally optimal. Intuitively, if we choose the bandwidth too large, the estimated density is essentially constant throughout the parameter space. Hence, every point is a global maximum (up to computer precision). On the other hand, if we choose the bandwidth too small, the estimated density is roughly zero outside a small neighborhood of the observed values, and every point is a local maximum. Searching for the global maximum is tantamount to searching for the small neighborhood of the parameter space that is associated with positive density.

Recall that we estimate a J(N+1)-dimensional density. We select the bandwidth for the ith dimension according to the normal reference rule (Scott 1992),

$$h_i = \left(\frac{4}{J(N+1)+2}\right)^{1/(J(N+1)+4)} \sigma_i \, S^{-1/(J(N+1)+4)}, \qquad (13)$$

where σi is the standard deviation of the equilibrium prices (probabilities) in the ith dimension. This rule is known to oversmooth when the underlying density is multimodal. In our case this is welcome, because we want to ensure a well-behaved likelihood function that is easy to maximize. The drawback is, of course, that the estimated density may be far from the underlying density. However, this problem can be alleviated by successively decreasing the bandwidth once we are in the neighborhood of the global maximum.
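A compact sketch of this smoothing step follows (our illustration; the names are ours). It evaluates the product-Gaussian kernel estimate (12) at one observation, with bandwidths from the normal reference rule (13):

```python
import numpy as np

def kde_at_data(sims, actual):
    """Evaluate the product-Gaussian kernel estimate (12) at one observation.

    sims   : (S, d) simulated equilibrium prices and choice probabilities
             for one period, stacked into d = J(N+1) dimensions
    actual : (d,) the observed prices and (estimated) choice probabilities

    Bandwidths follow the normal reference rule (13),
    h_i = (4/(d+2))**(1/(d+4)) * sigma_i * S**(-1/(d+4)).
    """
    S, d = sims.shape
    sigma = sims.std(axis=0, ddof=1)
    h = (4.0 / (d + 2.0)) ** (1.0 / (d + 4.0)) * sigma * S ** (-1.0 / (d + 4.0))
    u = (sims - actual) / h                           # (S, d) scaled deviations
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)  # Gaussian kernel per dimension
    return float(np.mean(np.prod(k / h, axis=1)))     # average of the kernel products

# Illustration on synthetic draws: a four-dimensional density at one point
rng = np.random.default_rng(0)
print(kde_at_data(rng.normal(size=(1000, 4)), np.zeros(4)))
```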

3.3 Maximization of the Loglikelihood

The maximum likelihood procedure builds an outer loop around the simulation of the equilibrium prices and probabilities and the estimation of their joint density. Thus, for each set of parameter values θ we perform S simulations to obtain the loglikelihood function, which is in turn maximized to obtain an updated set of parameter values. Starting values for the parameters in step 2 may come from a preliminary estimation using 3SLS with aggregate data as in Besanko et al. (1998):

$$\ln S_{jt} - \ln S_{0t} = x_{jt}\beta - \alpha p_{jt} + \xi_{jt}, \qquad j = 1, \dots, J, \qquad (14)$$

$$p_{jt} = w_t \gamma_j + \frac{1}{\alpha (1 - S_{jt})} + \eta_{jt}, \qquad j = 1, \dots, J, \qquad (15)$$

where Sjt is the share of alternative j for week t in the aggregate data. For the standard deviations (σξ1, . . . , σξJ, ση1, . . . , σηJ), we set the starting values equal to the root mean squared error (RMSE) from the 3SLS procedure. Note that in equations (14) and (15) the ξjt and ηjt enter as linear disturbances. Therefore, the RMSE provides an estimate of the standard deviation of the errors.

The maximization of the loglikelihood is accomplished by means of a simplex search (Nelder and Mead 1965). It is clearly infeasible to compute analytic gradients. While it is in principle possible to use numerical gradients as part of a gradient-based optimization procedure, we found that a simplex search that uses only the values of the loglikelihood function performs best.8
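As a sketch of how this outer loop can be driven, the call below uses SciPy's Nelder-Mead simplex on a stand-in objective; in the actual procedure the objective would be the negative smoothed loglikelihood assembled from steps 5 and 7:

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in objective: a smooth dummy function in place of the negative
# smoothed loglikelihood; the target vector here is purely illustrative.
def neg_loglik(theta):
    return float(np.sum((theta - np.array([0.5, -0.8, 1.2])) ** 2))

theta0 = np.zeros(3)   # in practice: 3SLS estimates from (14)-(15) plus the RMSEs
res = minimize(neg_loglik, theta0, method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})
print(res.x)           # the simplex uses only function values, no gradients
```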


4 Empirical Analyses

The algorithm described in Section 3 is applied to estimate an equilibrium model of demand and supply in two product categories, yogurt and laundry detergent. While we account for unobserved consumer heterogeneity, we abstract from observed heterogeneity, that is, we assume N = 1.

Data. We use data on individual purchase histories for a panel of households in Sioux Falls, South Dakota, collected by A.C. Nielsen. The data set spans a period of 114 weeks in 1986–1988.9 For the 615 households that purchased in the category more than twice, we observe the dates of their shopping trips, the price paid, and the item purchased (UPC). We aggregate over all UPCs that belong to a brand. That is, in the yogurt category we aggregate across different flavors, and in the laundry detergent category we aggregate across different sizes.

To obtain weekly no-purchase probabilities, we tried two alternative approaches. The first was to condition on store visits, i.e., to compute the probability that a household goes to the store but does not purchase in the category of interest (for details see Besanko et al. (1998) and Draganska and Jain (2003)). The second was to assume that each household makes a weekly decision whether to purchase in the category or not, i.e., to not condition on store visits when computing the no-purchase probability. The empirical results did not differ qualitatively between the two approaches, so we decided not to condition on store visits.

For the cost shifters in the supply equation we obtained monthly data on labor and materials prices from the Bureau of Labor Statistics (BLS). Labor costs are represented by average hourly earnings of production workers for the respective industry (SIC 202, dairy products, for yogurt and SIC 2841, soap and other detergents, for laundry detergent). We also use data on the prices of the main ingredient for each of the product categories. Specifically, we obtained data on the producer price indices for fluid milk (yogurt) and basic inorganic chemicals (laundry detergent). The monthly data series were then smoothed to obtain weekly cost data following the approach suggested by Slade (1995).

Table 1 presents descriptive statistics for both product categories along with summary information on the cost data we use for the analysis. In the yogurt category, we focus our attention on the two major competitors in the single-serving yogurt market, Dannon and Yoplait (General Mills).10 Yoplait is the market leader, with a market share almost double that of Dannon and a somewhat higher price. In the laundry detergent category, we study the competition between the two leading brands, Wisk (Unilever's flagship brand) and Tide (Procter & Gamble's premier brand). Tide seems to dominate Wisk: it both has a higher market share and commands a price premium.

Table 1: Descriptive Statistics of Data.

Items          Avg. Choice Prob.   Avg. Price              Material cost   Labor cost
Yogurt                                                     103.3066        9.5512
  Yoplait      0.0654              9.9425 (cents per oz.)
  Dannon       0.0377              8.0693 (cents per oz.)
  No purchase  0.8969
Detergent                                                  96.0997         14.1633
  Wisk         0.0192              0.0481 ($ per oz.)
  Tide         0.0484              0.0512 ($ per oz.)
  No purchase  0.9325

Estimation. We implemented the estimation algorithm in C++ using optimization routines and routines for solving systems of nonlinear equations from Press, Teukolsky, Vetterling and Flannery (1993).11 The parameter estimates are obtained as follows. We estimate an aggregate model as in Besanko et al. (1998) to get starting values. These initial values are perturbed and used in the optimization program. The output parameters are compared based on the values of the loglikelihood function. The parameters with the largest loglikelihood are then perturbed again and taken as input to the optimization procedure. The output parameters are again compared based on the loglikelihood value. The final parameters are chosen to be those with the largest loglikelihood value. By using different starting values, we make sure that the optimization algorithm achieves the global maximum.

To compute the standard errors we employ the bootstrap. To this end, we simulate 30 data sets by randomly drawing with replacement from the original data. Note that unobserved heterogeneity introduces dependencies across the purchases of a given household. To account for this in computing the standard errors, it is important to sample entire household histories.12 For each data set, we then run the optimization program. Finally, we compute the standard errors as the standard deviation of the 30 sets of parameter estimates.

We use 1000 simulation draws to ensure greater precision of the estimates (recall that for a four-dimensional density we only need 223 data points (Silverman 1986)). Convergence is reached within up to 300 function calls for the model without heterogeneity and about 600 function calls for the model with heterogeneity. An evaluation of the likelihood function takes between 2 and 10 seconds, depending on the model specification, on a Pentium 4 PC with a 1GHz clock speed and 512MB RAM.
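A minimal sketch of this resampling scheme follows (our illustration; estimate stands in for a complete optimization run on the resampled data):

```python
import numpy as np

def household_bootstrap(households, estimate, B=30, seed=0):
    """Bootstrap standard errors by resampling entire household histories.

    households : list of per-household purchase histories
    estimate   : callable mapping a list of households to a parameter vector
                 (here, a full run of the optimization program)
    Resampling whole households preserves the within-household dependence
    induced by unobserved heterogeneity.
    """
    rng = np.random.default_rng(seed)
    H = len(households)
    draws = []
    for _ in range(B):
        idx = rng.integers(0, H, size=H)              # households, with replacement
        draws.append(estimate([households[i] for i in idx]))
    return np.std(np.asarray(draws), axis=0, ddof=1)  # std. errors across replicates
```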

Yogurt category. Table 2 presents the results of the empirical analysis. In addition to the homogeneous logit, which is our baseline model, we estimated two heterogeneity specifications: one where only the price response of the two segments differs (heterogeneity 1), and one where we also allow for heterogeneity in the brand constants (heterogeneity 2).

All estimated coefficients have face validity. The price coefficients are negative, and both the wage rate and the price of milk have a positive impact on price, as expected. There does not appear to be much qualitative difference between the estimates for the standard logit and the heterogeneous logit specification where only the price response is allowed to vary by segment. However, the estimated parameter for the proportion of segment 1, λ = 12%, is significant. Once heterogeneity in the brand constants is introduced (heterogeneity 2), however, the difference in the estimated parameters relative to the homogeneous logit specification becomes much more pronounced. There is now a sizeable difference in the price sensitivity between the two segments, with the slightly larger segment (57%) being the less price sensitive one. AIC and BIC both show considerable improvement as we go from heterogeneity 1 to heterogeneity 2. The estimated marginal costs are positive and of reasonable magnitude: 8.91 cents per ounce for Yoplait and 7.08 cents per ounce for Dannon.13

Table 2: Parameter estimates and standard errors for yogurt data.

                           No Heterogeneity          Heterogeneity 1           Heterogeneity 2
Variable                   Coefficient (Std. dev.)   Coefficient (Std. dev.)   Coefficient (Std. dev.)
Demand Side:
Dannon const. (segm. 1)    5.7064 (0.0428)           5.8374 (0.0651)           0.9759 (0.0471)
Yoplait const. (segm. 1)   8.2857 (0.0278)           8.3503 (0.0694)           4.1928 (0.0552)
Dannon const. (segm. 2)                                                        10.3163 (0.0376)
Yoplait const. (segm. 2)                                                       6.8511 (0.0765)
σξ1                        0.3198 (0.0344)           0.3584 (0.0337)           0.3047 (0.0336)
σξ2                        0.3093 (0.0411)           0.4131 (0.0602)           0.352 (0.03)
price (segment 1)          -1.112 (0.0047)           -0.8999 (0.0382)          -0.6319 (0.0138)
price (segment 2)                                    -1.2052 (0.0214)          -1.6232 (0.018)
proportion of segment 1                              0.1239 (0.0253)           0.5672 (0.053)
Supply Side:
Dannon constant            -9.8327 (0.0302)          -10.2128 (0.081)          -10.3732 (0.0542)
Yoplait constant           -7.9526 (0.045)           -8.3507 (0.0907)          -9.4799 (0.0679)
ση1                        0.2827 (0.0311)           0.2765 (0.0311)           0.2667 (0.0268)
ση2                        0.5649 (0.0434)           0.5347 (0.0424)           0.5178 (0.0479)
labor cost                 1.1134 (0.0134)           0.9554 (0.0287)           0.9725 (0.0133)
material cost              0.615 (0.011)             0.7859 (0.0228)           0.8151 (0.0131)
Loglikelihood              481.34                    491.45                    516.38
AIC                        -940.68                   -956.91                   -1002.76
BIC                        -940.05                   -956.17                   -1001.91

Laundry Detergent. Table 3 presents the estimation results. With the exception of the material cost parameter in the standard logit specification (negative but insignificant), all coefficient estimates have the expected signs. The price coefficients are negative in both specifications. In terms of price sensitivity, there appear to be two equally sized segments (we estimate a proportion of 53.5% for segment 1). Labor cost has the expected positive impact on the price of the product in the standard logit model but is not significantly different from zero in the heterogeneous logit specification. The brand-specific constant for Wisk is negative, while the one for Tide is positive, reflecting the strong inherent preference for Tide. Wisk has a lower marginal cost (3.46 cents per ounce) than Tide (3.75 cents per ounce).

Table 3: Parameter estimates and standard errors for laundry detergent data.

                          No Heterogeneity          With Heterogeneity
Variable                  Coefficient (Std. dev.)   Coefficient (Std. dev.)
Demand Side:
Wisk brand constant       -0.3217 (0.0760)          -0.3442 (0.0366)
Tide brand constant       0.9444 (0.0620)           0.9222 (0.0398)
σξ1                       0.4272 (0.0521)           0.4074 (0.0494)
σξ2                       0.4334 (0.1401)           0.4627 (0.1142)
price (segment 1)         -0.7710 (0.0176)          -0.8740 (0.0373)
price (segment 2)                                   -0.6791 (0.0289)
proportion of segment 1                             0.5358 (0.0749)
Supply Side:
Wisk cost constant        2.6628 (0.0543)           2.6647 (0.0522)
Tide cost constant        2.9299 (0.0589)           2.9036 (0.0646)
ση1                       0.3734 (0.0413)           0.4102 (0.0289)
ση2                       0.3418 (0.0301)           0.3203 (0.0152)
labor cost                0.1169 (0.0506)           0.0493 (0.0396)
material cost             -0.0889 (0.0734)          0.0065 (0.0591)
Loglikelihood             -562.9528                 -560.4829

We have now seen that the estimation procedure yields reasonable parameter estimates for both the yogurt and the laundry detergent category. It still remains unclear, however, whether our estimator performs well in general. To study its properties, in the next section we turn to a more thorough investigation through a Monte Carlo experiment.

5 Simulation Analysis

Given the complexity of the proposed algorithm, it is very difficult to determine its properties analytically. We therefore conducted a Monte Carlo experiment: we generated 50 artificial data sets and applied the estimation procedure to each of them. Since the true underlying parameters are known, we can compare our estimates to them and draw conclusions about the performance of our procedure.

Data. We simulated choice data for 114 weeks and 473 households. The assumed ‘true’ parameter values roughly correspond to the ones obtained from a preliminary estimation using scanner panel data and are listed in Table 4. There are two competing brands and an outside good in the market with average shares of 2%, 4%, and 94%, respectively. The way the model is set up, choosing the outside good at time t means not buying at all in week t. That is, for each household, we have 114 observations. The total number of observations is thus 53,922. For the supply side, we use factor price data for labor (average hourly wages of production workers for SIC 209, miscellaneous food and kindred products) and for the key ingredient in the production process, peanuts. We draw the demand and supply shocks from a normal distribution. For the standard deviations, (σξ1 , . . . , σξJ , ση1 , . . . , σηJ ), we set the true values equal to the RMSE from a preliminary 3SLS estimation. The choice and price generation process is as specified in equations (8) and (9).

Monte Carlo results. We obtained the parameter estimates for each of the 50 Monte Carlo samples using the algorithm described in Section 3. Table 4 presents the resulting mean, bias, variance, and mean square error (MSE). The MSE is given by the sum of the squared bias and the variance. In general, the proposed estimation procedure seems to work quite well. Specifically, the variances of the parameter estimates are very small, as expected for a maximum-likelihood based procedure. The magnitude of the biases is large compared to the variances. It is, however, reassuring that the coefficient of interest, namely the price coefficient, is estimated with a very high degree of reliability. The bias is only 0.00354, which is tiny relative to the value of the price coefficient (-0.21) and suggests that our way of dealing with the endogeneity problem is indeed effective. The supply-side parameters (labor and ingredients cost) also show only a small bias.

Table 4: Monte Carlo results for proposed algorithm.

Variable          True Value   Mean        Bias       Variance   MSE
demand const. 1   -2.62        -2.49396    -0.12604   0.00809    0.02398
demand const. 2   -1.27        -1.14716    -0.12284   0.00554    0.02063
price             -0.21        -0.21354    0.00354    0.00004    0.00005
supply const. 1   -13.48       -13.18638   -0.28362   0.01696    0.09740
supply const. 2   -12.12       -11.86734   -0.25266   0.01645    0.08028
labor cost        2.03         2.03272     -0.00272   0.00013    0.00014
material cost     0.27         0.29996     -0.02996   0.00803    0.00893
σξ1               0.61         0.49563     0.11437    0.00260    0.01568
σξ2               0.41         0.34294     0.06706    0.00075    0.00525
ση1               0.75         0.60453     0.14547    0.00388    0.02504
ση2               0.65         0.48901     0.16099    0.05074    0.07666

The performance of the estimator is also excellent when unobserved heterogeneity is considered. We simulated a data set with two equally sized segments differing in their price sensitivity. Table 5 presents the results of the Monte Carlo experiment for this specification. As can be seen from the table, the price coefficients are estimated reliably. Overall, the proposed estimation procedure handles unobserved heterogeneity very well.

Robustness checks. One key assumption we make is that the demand and supply errors are jointly normally distributed. This need not be true in reality, so we test the robustness of our procedure to different distributional assumptions. Specifically, we assume that ξ and η follow a mixture of normal and logistic distributions. We give increasingly higher weight to the logistic distribution to study the effects on the performance of our procedure. As Table 6 reveals, the results are fairly robust. As expected, when the normality assumption is satisfied, the MSE is the smallest, but even when we draw the errors entirely from a logistic distribution, the accuracy remains very high.

Table 5: Monte Carlo results for model with heterogeneity.

Variable                  True Value   Mean       Bias      Variance   MSE
demand const. 1           -2.62        -2.5948    0.0252    0.0064     0.0070
demand const. 2           -1.27        -1.2540    0.0160    0.0077     0.0080
price1                    -0.15        -0.1517    -0.0017   0.0001     0.0001
price2                    -0.25        -0.2511    -0.0011   0.0014     0.0014
σξ1                       0.61         0.5737     -0.0363   0.0033     0.0046
σξ2                       0.41         0.3880     -0.0220   0.0010     0.0015
supply const. 1           -13.47       -13.4798   -0.0098   0.0042     0.0043
supply const. 2           -12.12       -12.1033   0.0167    0.0093     0.0096
ση1                       0.75         0.7402     -0.0098   0.0025     0.0026
ση2                       0.65         0.6591     0.0091    0.0025     0.0026
labor cost                2.03         2.0222     -0.0078   0.0015     0.0016
material cost             0.27         0.2760     0.0060    0.0215     0.0216
proportion of segment 1   0.50         0.5095     0.0095    0.0106     0.0107

Table 6: Monte Carlo results for different mixtures of normal and logistic distribution (price coefficient only, true value -0.21).

             Bias          Variance      MSE
normal       0.00085       0.00010       0.00010
0.8*normal   -0.00931      0.00005       0.00014
0.5*normal   -0.01532      0.00013       0.00036
0.2*normal   -0.00717      0.00011       0.00016
logistic     -0.00016371   0.000199533   0.000199559

Another important factor affecting the performance of the estimation procedure is the choice of bandwidth (see Section 3). The bandwidth determines the smoothness of the joint density of equilibrium prices and probabilities, i.e., of the likelihood function. Too small a bandwidth leads to a likelihood that is not well-behaved and hence makes finding a global maximum very difficult. Too large a bandwidth, however, may cause the likelihood function to differ greatly from the true underlying density of equilibrium prices and probabilities. We examined the sensitivity of the price estimate to the choice of this parameter by looking at bandwidths that are 1/4, 1/2, 2, and 4 times the normal reference rule bandwidth. Table 7 summarizes the results for the price coefficient for two different sets of parameters. One set of parameters was generated as above, based on preliminary estimates in the peanut butter category (true value for price -0.21); the other set corresponds to the parameter estimates in the laundry detergent category (true value for price -0.77). It appears that the bandwidth obtained by the normal reference rule (equation (13)) performs well. Moreover, the precision of the estimates is not overly sensitive to the choice of the smoothing parameter.

Table 7: Comparison of Monte Carlo results for different bandwidths (price coefficient only). NR is the bandwidth computed from the normal reference rule.

                    0.25 × NR   0.5 × NR   NR        2 × NR     4 × NR
True value -0.21
  Bias              -0.01017    -0.00203   0.00354   0.00354    -0.00209
  Variance          0.00005     0.00002    0.00004   0.00004    0.00004
  MSE               0.00015     0.00002    0.00005   0.00005    0.00004
True value -0.77
  Bias              0.03411     0.01356    0.00094   -0.01344   -0.01977
  Variance          0.00105     0.00026    0.00010   0.00023    0.00086
  MSE               0.00222     0.00044    0.00011   0.00041    0.00125

To summarize, our Monte Carlo simulations demonstrate the ability of the proposed estimation procedure to reliably recover the true parameters of an equilibrium model. In particular, the parameter of interest, namely the price coefficient, is estimated with a very high degree of precision. The conducted robustness checks indicate that our methodology is fairly robust to modifications of the distributional assumptions as well as bandwidth selection.

6 Concluding Remarks

In this article we develop a new likelihood-based methodology for the estimation of structural demand-and-supply models using disaggregate data. Marketing researchers have established a long tradition of estimating random utility models of consumer demand using maximum likelihood methods. Tying a traditional individual-level choice model such as a logit or probit to a supply-side specification is a non-trivial task. Simply assuming a joint distribution of prices and probabilities is inconsistent with the equilibrium notion. Furthermore, the nonlinearity of brand choice models makes writing down the joint distribution of equilibrium prices and probabilities implied by the unobserved demand and supply shocks very challenging. We solve these problems by simulating equilibrium prices and probabilities and then using the empirical likelihood of these prices and probabilities to obtain the parameters of the model. Estimating the demand and supply equations jointly deals with the problem of price endogeneity and ensures that we obtain reliable estimates of the price response parameter. Moreover, the estimated structural equilibrium model can be used to perform "what-if" type analyses (Draganska and Jain 2003).

We apply the proposed algorithm to both real-world scanner data and simulated data in order to assess the properties of the estimation method and highlight its merits and limitations. Overall, the new procedure performs very well. It yields estimates of plausible magnitude when applied to individual-level choice data in several product categories. The conducted Monte Carlo experiments demonstrate both the accuracy of our method and its robustness.

One of the attractive features of our approach relative to previous research considering endogeneity in individual-level models (Villas-Boas and Winer 1999, Villas-Boas and Zhao 2001) is the ability to model explicitly the heterogeneity structure of the population. We specify and estimate a latent class model to incorporate unobserved heterogeneity across households. In its current form, however, our method cannot readily take into account the panel structure of the household-level data. That is, if there is a correlation in the tastes of individual households, our procedure yields a partial likelihood and the estimated standard errors need to be corrected. Extending the proposed methodology to explicitly incorporate the dependencies in households' choices over time is an important area for future research.14

On the supply side, one might think about the reasonableness of the assumed Nash behavior in prices. Our method does not require any particular assumption about the strategic interactions between firms. A conjectural variation approach or a menu approach to test for different behavioral assumptions could be employed to reveal the nature of competition in the market. This is critical because misspecification of the supply side translates into a misspecified system, thus leading to inconsistent parameter estimates. Future research could also focus on enriching the supply side by explicitly incorporating the channel structure (Villas-Boas 2001, Sudhir 2001).

In the current analysis we only consider the endogeneity of prices to illustrate the proposed methodology. Recent studies have suggested, however, that other strategic instruments such as advertising (Vilcassim, Kadiyali and Chintagunta 1999) and product line length (Draganska and Jain 2003) should also be considered endogenous. One fruitful avenue for future study would therefore be to apply the estimation procedure developed in this paper to the analysis of other marketing mix instruments.

In sum, the present research is a first step towards the estimation of a market equilibrium model with a disaggregate discrete choice model on the demand side and an oligopoly model on the supply side. The proposed estimation procedure explicitly accounts for the price endogeneity problem. It further bears the potential of combining the advantages of simultaneous estimation of market models with recent developments in incorporating richer heterogeneity structures and more flexible error specifications in disaggregate models.

References

Ackerberg, D. and Gowrisankaran, G. (2001). Quantifying equilibrium network externalities in the ACH banking industry, working paper, UCLA.

Anderson, S., de Palma, A. and Thisse, J. (1992). Discrete Choice Theory of Product Differentiation, MIT Press, Cambridge, MA.

Berry, S. (1994). Estimating discrete-choice models of product differentiation, RAND Journal of Economics 25: 242–262.

Berry, S., Carnall, M. and Spiller, P. (1997). Airline hubs: Costs, markups and the implications for consumer heterogeneity, working paper, Yale University.

Berry, S., Levinsohn, J. and Pakes, A. (1995). Automobile prices in market equilibrium, Econometrica 63: 841–890.

Besanko, D., Dubé, J.-P. and Gupta, S. (2003). Competitive price discrimination strategies in a vertical channel with aggregate data, Management Science 49(9): 1121–1138.

Besanko, D., Gupta, S. and Jain, D. (1998). Logit demand estimation under competitive pricing behavior: An equilibrium framework, Management Science 44: 1533–1547.

Burnkrant, R. and Unnava, H. R. (1995). Effects of self-referencing on persuasion, Journal of Consumer Research 22: 17–26.

Chintagunta, P., Dubé, J.-P. and Goh, K.-Y. (2003). Beyond the endogeneity bias: The effect of unmeasured brand characteristics on household-level brand choice models, Technical report, University of Chicago GSB.

Chintagunta, P., Jain, D. and Vilcassim, N. (1991). Investigating heterogeneity in brand preferences in logit models for panel data, Journal of Marketing Research 28: 417–428.

Draganska, M. and Jain, D. (2003). Product-line length as a competitive tool, working paper, Stanford University.

Dubé, J.-P. (2003). Discussion of 'Bayesian analysis of simultaneous demand and supply', Quantitative Marketing and Economics 1(3), forthcoming.

Fader, P. and Hardie, B. (1996). Modeling consumer choice among SKUs, Journal of Marketing Research 33: 442–452.

Gönül, F. and Srinivasan, K. (1993). Modeling multiple sources of heterogeneity in multinomial logit models: Methodological and managerial issues, Marketing Science 12(3): 213–229.

Goolsbee, A. and Petrin, A. (2003). The consumer gains from direct broadcast satellites and the competition with cable television, Econometrica, forthcoming.

Guadagni, P. and Little, J. D. C. (1983). A logit model of brand choice calibrated on scanner data, Marketing Science 2(3): 203–238.

Härdle, W. (1990). Applied Nonparametric Regression, Cambridge University Press.

Kamakura, W. and Russell, G. (1989). A probabilistic choice model for market segmentation and elasticity structure, Journal of Marketing Research 26: 379–390.

Kennan, J. (1989). Simultaneous equation bias in disaggregated econometric models, Review of Economic Studies 56: 151–156.

McFadden, D. (1989). A method of simulated moments for estimation of discrete response models without numerical integration, Econometrica 57(5): 995–1026.

Nelder, J. and Mead, R. (1965). A simplex method for function minimization, Computer Journal 7: 308–313.

Nevo, A. (2001). Measuring market power in the ready-to-eat cereal industry, Econometrica 69(2): 307–342.

Pakes, A. and Pollard, D. (1989). Simulation and the asymptotics of optimization estimators, Econometrica 57: 1027–1057.

Petty, R. and Cacioppo, J. (1986). Communication and Persuasion: Central and Peripheral Routes to Attitude Change, Springer Verlag.

Press, W., Teukolsky, S., Vetterling, W. and Flannery, B. (1993). Numerical Recipes in C: The Art of Scientific Computing, 2nd edn, Cambridge University Press.

Scott, D. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization, Wiley Series in Probability and Statistics, New York.

Silverman, B. (1986). Density Estimation for Statistics and Data Analysis, Chapman & Hall, London.

Slade, M. (1995). Product rivalry and multiple strategic weapons: An analysis of price and advertising competition, Journal of Economics and Management Strategy 4: 445–476.

Stone, C. (1980). Optimal rates of convergence for nonparametric estimators, Annals of Statistics 8: 1348–1360.

Sudhir, K. (2001). Structural analysis of competitive pricing in the presence of a strategic retailer, Marketing Science 20(3): 244–264.

Viard, B., Polson, N. and Gron, A. (2002). Likelihood based estimation of nonlinear equilibrium models with random coefficients, working paper, Stanford University.

Vilcassim, N., Kadiyali, V. and Chintagunta, P. (1999). Investigating dynamic multifirm market interactions in price and advertising, Management Science 45: 499–518.

Villas-Boas, M. and Winer, R. (1999). Endogeneity in brand choice models, Management Science 45: 1324–1338.

Villas-Boas, M. and Zhao, Y. (2001). The ketchup marketplace: Retailers, manufacturers and individual consumers, working paper, UC Berkeley.

Villas-Boas, S. (2001). Vertical contracts between manufacturers and retailers: An empirical analysis, working paper, UC Berkeley.

Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data, MIT Press.

Yang, S. and Allenby, G. (2000). A model for observation, structural, and household heterogeneity in panel data, Marketing Letters 11: 137–149.

Yang, S., Chen, Y. and Allenby, G. (2003). Bayesian analysis of simultaneous demand and supply, Quantitative Marketing and Economics 1(3): 1–25, forthcoming.

Yatchew, A. (1998). Nonparametric regression techniques in economics, Journal of Economic Literature 36(2): 669–721.

Notes

1. The authors wish to thank Arie Beresteanu, Ulrich Doraszelski, Jean-Pierre Dubé, Gautam Gowrisankaran, Charles Manski, Mike Mazzeo, Brian Viard and participants at the 1999 Marketing Science conference in Syracuse for their helpful comments and suggestions. Mariusz Rabus provided expert research assistance for this project.

2. An anonymous referee drew our attention to the fact that our assumption is somewhat similar to what Yang and Allenby (2000) call 'observation' heterogeneity. Yang and Allenby (2000) define this term in the context of a latent class model as a specification in which the latent class probabilities depend on observable covariates. This contrasts with 'household' or 'structural' heterogeneity, which entails dependence over time.

3. The main drawback of a continuous distribution of consumer heterogeneity is its computational complexity, since we need to numerically evaluate multidimensional integrals. While this is also true in standard models (e.g., Berry et al. (1995)), our estimation algorithm is already computationally intensive, so we prefer to work with a discrete distribution.

4. For a lucid discussion of this approach, see Dubé (2003).

5. In small samples, most kernel density estimators are biased. Our Monte Carlo results indicate that this does not impair the ability of our procedure to recover the structural parameters of the equilibrium model. If unbiasedness is desired, one can use so-called higher-order kernels, which are computationally more demanding.

6. Another possibility to obtain a smooth likelihood function has been explored by Ackerberg and Gowrisankaran (2001). The authors make the auxiliary assumption of normal measurement error, which allows them to express the likelihood function in terms of the normal density. A similar assumption has also been employed by Viard, Polson and Gron (2002), who estimate an equilibrium model using Bayesian methods (Markov chain Monte Carlo techniques). These approaches may be problematic if the underlying density of the endogenous variables differs significantly from a normal density.

7. For a thorough treatment the interested reader is referred to Yatchew (1998).

8. Details on the estimation procedure are available from the authors upon request.

9. In the laundry detergent category, we use data for 107 weeks.

10. Yoplait only offers single-serving size yogurt. Dannon also carries 16oz and 32oz sizes of plain and vanilla yogurt in addition to the single-serving size. It is often argued that these two particular flavors are used for cooking purposes and constitute a different market.

11. Details are available from the authors upon request.

12. We are grateful to an anonymous referee for bringing this point to our attention.

13. These numbers are computed from the standard logit specification with no heterogeneity.

14. In a recent article, Yang et al. (2003) propose a Bayesian approach to this problem.
