Welfare Gains from Optimal Pollution Regulation

Jose Miguel Abito
November 9, 2012

Abstract

Successful implementation of pollution regulation often requires redistributing a portion of the benefits back to firms who incur abatement costs. When firms have private information on their costs, they have an incentive to overstate these costs and demand higher compensation. Optimal pollution regulation in this environment sacrifices allocative efficiency to reduce information rents. I measure the gains from optimal pollution regulation by empirically examining the effect of sulfur dioxide emissions regulation on electric utilities. These electric utilities also face economic regulation, and I exploit this institutional detail. I derive estimates of marginal abatement costs from the cost of jointly producing electricity and emissions, allowing for time-varying unobserved heterogeneity to capture cost efficiency. Cost efficiency consists of exogenous (intrinsic type) and endogenous (managerial effort) components which are private information of the firm. To separately identify these components, I model economic regulation as a signaling game of auditing. I show that a particular equilibrium exists where the firm does not exert effort during the “rate case”, but it exerts a positive level of effort afterwards. I provide empirical evidence for the plausibility of this equilibrium using cost and rate case data. This equilibrium generates exclusion restrictions that are used to estimate parameters of the cost function and disutility of effort. I show that the type distribution can be nonparametrically identified using deconvolution methods, and estimate this distribution via a smoothed discrete approximation. Finally, I conduct counterfactual welfare simulations. I find that annual welfare gains from optimal pollution regulation relative to a uniform emission standard range from $32 million to $155 million per electric utility, or about 10% to 47% of combined electricity generation and abatement costs.
Implementing the optimal form of regulation is difficult, if not impossible, so I examine simpler regulatory regimes. A class of regimes with uniform emission taxes captures 52% to 80% of these gains.

Job Market Paper. I would like to thank my advisors Aviv Nevo, David Besanko and Robert Porter for all of their help and guidance. I also thank Mark Chicu, Daniel Diermeier, Jose Espin Sanchez, Igal Hendel, Matt Masten, Tiago Pires, Mike Powell, Min Ren, William Rogerson, Kosuke Uetake, Michael Whinston and seminar participants at Northwestern. Data acquired from SNL Financial was partly funded by the TGS Graduate Research Grant and by the Center for the Study of Industrial Organization (CSIO) at Northwestern University.

Author affiliation: Department of Economics, Northwestern University. Email: [email protected].

1 Introduction

Successful implementation of pollution regulation often requires redistributing a portion of the benefits back to firms who incur abatement costs. For example, in the US Acid Rain Program about $600 million to $1.8 billion worth of emission permits were given to electric utilities for free, instead of being auctioned. This type of redistribution is not without welfare costs. By giving away permits for free, the policy-maker forgoes revenues that can be used to reduce distortionary taxes or fund other productive activities (Goulder et al, 1997). A further policy constraint is that firms may have private information about their abatement cost. Firms can exploit this informational advantage and extract information rents by overstating their costs and demanding higher compensation. Laffont (1994) uses the framework of incentive regulation (e.g. Laffont and Tirole, 1993) to characterize the optimal form of pollution regulation in this environment.1 The key insight is that when information rents are costly, it may be optimal to distort allocative efficiency to decrease information rents. Thus, the optimal form of regulation may involve abatement levels that do not equate the marginal damages from emissions with marginal abatement costs. Despite the simplicity of this insight, policies inspired by incentive regulation have rarely been implemented in pollution regulation, and in economic regulation in general. The design and implementation of such mechanisms require a lot on the part of the regulator in terms of information gathering, rigorous auditing and sophisticated analyses (Joskow, 2008; Kahn, 1988). Moreover, uncertainty over the actual benefits and costs of these policies, and the subsequent negative political and economic consequences in cases where such attempts are unsuccessful, make it difficult to convince policy-makers to adopt untested mechanisms. My paper addresses the following questions.
How much do we gain by implementing optimal pollution regulation relative to a uniform emission standard? Can more practical alternatives approximate these gains? The paper focuses on sulfur dioxide (SO2) emissions regulation of electric utilities in the US to empirically answer these questions. An interesting institutional feature of my setting is that polluting sources were facing both pollution and economic regulation. This feature offers an excellent setting to study issues of redistribution and asymmetric information. Pollution regulation comes in the form of the Acid Rain Program which is administered by the Environmental Protection Agency at the federal level. Economic regulation on the other hand is implemented by state-level public utility commissions in charge of regulating the price of electricity. Since state utility commissions are directly responsible for providing adequate compensation to electric utilities, commissions care about the impact of pollution regulation on the cost of producing electricity. Commissions can then let this concern be heard by state legislators and

1 See also Lewis (1996). Spulber (1998) shows that when information rents are too large, the policy that maximizes allocative efficiency may not even be implementable.

influence the design of pollution regulation. Although some of the windfall gains from pollution regulation may be passed on to consumers through lower electricity prices (Schmalensee and Stavins, 2012), the part of excess payments due to information rents does not get passed on if the economic regulator does not have the same information as the firm. In computing welfare under optimal pollution regulation, I consider a social planner who is in charge of both pollution and economic regulation. Economic regulation makes explicit the need to design a pollution regulatory regime that adequately compensates the firm. I use the static regulatory framework of Laffont (1994) to characterize optimal pollution regulation. The size of distortions from allocative efficiency depends on the distribution of marginal abatement costs across the possible unobserved types of the firm. If, given the same level of abatement, differences in marginal abatement costs are large, the incentives for low cost firms to claim to be of high cost rise much faster as abatement is increased. In this case, large first order gains in welfare are achieved by inducing high cost types to abate less compared to the allocatively efficient level. These first order gains are achieved at the expense of second order losses. Thus, the gains from optimal pollution regulation relative to other regulatory regimes depend on the distribution of marginal abatement costs. My main task is to estimate the distribution of marginal abatement costs from the data. I estimate marginal abatement costs of electric utilities using data from 1988-1999. My focus is on the cost of fuel-switching, which was the popular mode of abatement during the time period. Fuel-switching directly impacts the cost of producing electricity and marginal abatement costs can be measured as the increase in the cost of producing electricity from an incremental decrease in emission rates.
Thus, I can study and use data on the cost of producing electricity to infer what marginal abatement costs are. Formally, the main object of analysis is a multiproduct cost function which captures the cost of jointly producing electricity and emissions (or abatement). In estimating firms’ multiproduct cost functions, I allow for time-varying unobserved heterogeneity to capture unobserved cost efficiencies. I model the firm’s cost efficiency as having a component that is exogenous (intrinsic type) and a component that is endogenous (managerial effort), and these are private information of the firm. While it is possible to estimate the firm’s cost efficiency solely using cost and operations data, this is not enough to decompose cost efficiency into its type and effort components. A firm with high realized cost can either be a firm that is intrinsically inefficient or a firm that did not exert effort. We need additional information that explicitly links a firm’s observed cost with its unobserved intrinsic type and chosen effort. I use a model of economic regulation (rate regulation) to provide this link. Although my paper’s focus is pollution regulation, I exploit rate regulation to link firms’ observed behavior with primitives. Rate regulation affects firms’ incentives to manage their electricity generation costs, which directly ties with abatement costs through the multiproduct cost function.

I model rate regulation as a signaling game of auditing, where the firm provides information about its costs in a rate case, and the regulator decides on the firm’s allowed revenues based on this information. I show that there exists an equilibrium where the firm has no incentive to exert effort during the rate case, and a positive optimal level of effort once the case concludes.2 Therefore under this equilibrium, the effort component does not appear in the firm’s cost efficiency during the rate case. The wedge between cost efficiencies during and after the rate case reveals the firm’s chosen effort. I can then infer the firm’s “disutility” from exerting effort from the chosen level of effort after the rate case. Since equilibria where the firm exerts positive effort during the rate case also exist, I provide empirical evidence to support the plausibility of the “no-effort” equilibrium using cost and rate case data. First, I find that costs and heat rates (i.e. amount of fuel burned per unit of electricity produced) are higher during the rate case. Second, I provide evidence that the regulator’s auditing strategy under the no-effort equilibrium obtains in the data. I use the properties of the no-effort equilibrium to identify and estimate the empirical model from which I estimate firms’ primitives. I impose parametric assumptions on firms’ cost function and disutility of effort in the empirical model. In computing welfare under different regulatory regimes, I need to know what the underlying costs and disutilities are for arbitrary values of emission rates and effort levels. However, I do not impose distributional assumptions on firms’ unobserved intrinsic types. The distribution of unobserved types determines the distribution of marginal abatement cost and therefore is an important ingredient in the welfare analysis.
Because effort is chosen by the firm and cost efficiency is unobserved by the econometrician, there is an endogeneity problem when estimating the parameters of the empirical model.3 The no-effort equilibrium provides information on what cost efficiency is during different time periods and events (i.e. rate case and non-rate case years). I can then use similar techniques from the dynamic panel literature to identify and estimate the parameters. I pose the problem of identifying the unobserved type distribution as a measurement error problem with repeated measurements and apply the result of Kotlarski (1967) to establish nonparametric identification. Finally, I estimate the type distribution using the smoothed discrete approximation developed by Hausdorff (1923) and applied by Beran and Hall (1992). I examine welfare under different regulatory regimes given the estimated primitives. Welfare gains from optimal pollution regulation are computed relative to the uniform emission standard that maximizes allocative efficiency. Optimal pollution regulation can be theoretically implemented using type-dependent

2 Incentives to exert effort after the rate case are a common feature in models with a regulatory lag, e.g. Baumol and Klevorick (1970), Bailey and Coleman (1971), and Pint (1992). Regulatory lag here refers to the time between rate cases rather than the duration of the case.
3 Firms’ intrinsic type may also be correlated with electricity output and prices of procured fuel. This potential correlation is another source of endogeneity.

transfers and type-dependent emission tax rates. Because this is difficult to implement in practice especially when firms are sufficiently heterogeneous, I estimate welfare from a uniform emission tax regime and a hybrid regime to see how much of the welfare gains from optimal pollution regulation can be captured by these simpler alternatives. The hybrid regime is an emission tax regime that allows firms to opt-out and join a uniform emission standard. While the hybrid regime sacrifices allocative efficiency, it allows the social planner to decrease information rents. If the increase in welfare due to lower information rents out-weighs the loss due to distortions in allocative efficiency, then opt-out improves welfare. When damages from SO2 emissions are valued at $100 per ton, the welfare gains from optimal pollution regulation relative to an efficient uniform emission standard are about $32 million per firm, or 10% of the combined variable cost of electricity generation and abatement. Welfare gains rise when abatement is valued more. When damages are $1000 per ton, annual welfare gains rise to $155 million per firm. Finally, I find that simpler alternatives capture a large part of these gains. The uniform emission tax and hybrid regimes capture from 52% to 80% of the welfare gains from optimal pollution regulation. The hybrid regime out-performs the uniform emission tax regime when the cost of public funds is high, i.e. when reducing information rents is relatively more valuable. The paper is organized as follows. The next section provides a background of the institutions. In section 3, I start with the definition of welfare to lay out the things we need to perform the welfare comparisons. I then discuss the model of rate regulation and characterize its equilibria. Section 4 describes the data. I also present evidence to support the particular equilibrium that will be useful for identification and estimation.
Section 5 is the main empirical section of the paper and it starts with the empirical model set-up. Identification is tackled in subsection 5.1, followed by estimation and a discussion of the results. Section 6 contains the counterfactual welfare exercise. The final section concludes.

Related literature

My paper is most related to the line of empirical regulation literature pioneered by Wolak (1994). Wolak (1994) and Brocas et al (2006) use the normative models of Baron and Myerson (1982) and Besanko (1985) to provide a link between observed behavior and the firm’s private information. This approach assumes that the actual regulatory institutions can be modeled “as if” the optimal form of regulation was being implemented by the regulator. The optimal mechanism characterizes a mapping between the firm’s private information and observed regulatory variables (e.g. price and rate of return) which can then be inverted to identify and estimate the firm’s primitives. Perrigne and Vuong (2011) formalize this identification strategy for the normative model of Laffont and Tirole (1986). One issue with using a normative model is that it assumes a highly sophisticated regulator that can design and

commit to the optimal mechanism.4 For example, in order to derive the optimal mechanism in the Laffont and Tirole (1986) model, the regulator needs to know the exact functional form for the effort disutility function. The regulator then designs and offers a set of contracts, and it is assumed the regulator can commit to these.5 My approach is to directly model the rate case regulatory institution to provide the link between observed behavior and the firm’s primitives. I build a signaling model of regulation where the regulator takes an action after the firm provides information. Thus, I do not require the regulator to design and commit to a particular mechanism before the firm moves. Gagnepain and Ivaldi (2002) do not rely on a normative model and instead exploit variation in actual regulatory regimes to estimate welfare in the French urban transport industry. My paper differs from their identification strategy in two ways. First, the firms in their setting either face a fixed-price or a cost-plus contract. Under the assumption that the assignment to a regulatory regime is exogenous, the variation in regimes in the data allows identification of firms’ type and disutility of effort. In my setting, firms face the same regulatory regime. I exploit the induced equilibrium behavior of firms across time to get the variation I need. Second, I do not impose distributional assumptions on the type distribution. The type distribution is nonparametrically identified and flexibly estimated. The paper contributes to the empirical literature on pollution regulation. The closest paper is Carlson et al (2000). They estimate the cost-savings from Phase I of the Acid Rain Program (ARP) relative to command-and-control regimes (e.g. uniform emission standard). The sample of electric utilities I study own the set of plants that were under Phase I. Similar to their paper, I estimate marginal abatement costs from fuel-switching by estimating a multiproduct cost function.
However, Carlson et al (2000) ignore economic regulation in estimating marginal abatement costs which may lead to biased estimates (Wolak, 1994).6 Moreover, my focus is on welfare and optimal regulation rather than cost-savings alone.

4 Although Perrigne and Vuong (2011) allow observed regulatory variables to deviate from those specified by the optimal mechanism, this deviation should be unsystematic, i.e. unrelated to the firm’s primitives.
5 Baron and Besanko (1984) introduce auditing in the Baron and Myerson (1982) model which brings the model closer to what happens in a rate case. The commitment assumption is crucial in this model; otherwise the regulator does not have an incentive to audit the firm and the optimal auditing policy breaks down.
6 Fowlie (2010) provides evidence that rate regulation induces firms to choose more capital-intensive abatement options in the context of NOx emissions regulation. I look at the effect of rate regulation on abatement costs rather than the choice of abatement method.

My identification and estimation strategy for the empirical model’s parameters has its roots in the dynamic panel literature (see Arellano and Honoré (2001) and Arellano (2003)). The key idea is to model how unobserved heterogeneity evolves and to use transformations of the data so that the unobserved heterogeneity does not appear in the estimating equations. The equilibrium I characterize generates restrictions on the evolution of unobserved heterogeneity. I use deconvolution techniques to nonparametrically identify the distribution of intrinsic types. Deconvolution methods have been applied in measurement error models (e.g. Li and Vuong (1998) and Schennach (2004)), in panel data and error components models (e.g. Horowitz and Markatou (1996); Evdokimov (2008, 2010); Bonhomme and Robin (2010); and Arellano and Bonhomme (2012)), and in the auctions literature (e.g. Li et al (2000); Asker (2010); and Krasnokutskaya (2011)). In contrast to this literature, I do not use an inverse Fourier transform to estimate the type distribution. Instead, I rely on the smoothed discrete approximation developed by Hausdorff (1923) to solve the classical problem of moments (Shohat and Tamarkin, 1943). The idea is to approximate the underlying distribution by a discrete distribution whose probability mass is a linear combination of the moments of the underlying distribution. Beran and Hall (1992) apply Hausdorff’s (1923) approximation to estimate the distribution of random coefficients without imposing distributional assumptions on the error term.

The welfare exercise I perform is similar to the exercise in the empirical price discrimination literature, e.g. Leslie (2004), Miravete (2007), Villas-Boas (2009), Hendel and Nevo (2012), and Lazarev (2011), where the fully optimal pricing strategy is compared to simpler ones. Finally, the hybrid regime I construct can be seen as a binary menu in the spirit of Rogerson (2003) and Chu and Sappington (2007). These two papers use numerical examples to examine the performance of the simpler binary menu relative to the fully optimal menu. My paper does this exercise empirically.
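As a rough illustration of the moment-based discrete approximation mentioned in the related literature discussion, the sketch below builds, for a distribution supported on [0, 1], the classical Hausdorff-type discrete distribution that places mass p(n, k) = C(n, k) * sum_j (-1)^j C(n-k, j) mu(k+j) on the grid point k/n, where mu(m) is the m-th moment. This is a minimal, unsmoothed version under stated assumptions: the function name `hausdorff_masses`, the unit-interval support and the uniform test case are illustrative choices, not the paper's estimator, which additionally smooths the approximation and works with estimated moments.

```python
from fractions import Fraction
from math import comb


def hausdorff_masses(moments, n):
    """Return masses p[k] placed on the grid points k/n, k = 0..n.

    `moments[m]` must hold the m-th moment of a distribution on [0, 1]
    for m = 0..n. Each mass equals C(n, k) * E[X^k (1 - X)^(n - k)],
    expanded so that it is a linear combination of the moments.
    """
    if len(moments) < n + 1:
        raise ValueError("need moments of order 0 through n")
    masses = []
    for k in range(n + 1):
        total = sum(
            (-1) ** j * comb(n - k, j) * moments[k + j]
            for j in range(n - k + 1)
        )
        masses.append(comb(n, k) * total)
    return masses


# Hypothetical check: the uniform distribution on [0, 1] has moments 1/(m+1).
n = 6
uniform_moments = [Fraction(1, m + 1) for m in range(n + 1)]
masses = hausdorff_masses(uniform_moments, n)
grid = [Fraction(k, n) for k in range(n + 1)]
approx_mean = sum(p * x for p, x in zip(masses, grid))
```

Exact rational arithmetic is used here because the alternating sum is numerically delicate in floating point when n is large; with estimated moments some form of regularization or smoothing would likely be needed.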

2 Institutional background

I first provide an overview of the investor-owned electric utility and how the utility simultaneously produces electricity and emissions. I then briefly discuss the history of SO2 emissions regulation. Finally, I describe rate regulation and what goes on in a rate case. Although the paper is about pollution regulation, accounting for the existing form of economic regulation is an integral part of my research strategy.

Electric utilities are vertically-integrated monopolists regulated by the State Public Utility Commission (PUC). They own and operate the generation, transmission and distribution of electricity within a given service area (typically within a state but can sometimes cross state boundaries). The generation sector is composed of multiple plants that transform energy sources such as fossil-fuels and nuclear energy into electricity. The transmission sector is responsible for moving electricity from plants to local distribution centers using high-voltage power lines. The distribution sector is then responsible for delivering electricity to end-users. My paper focuses on the operating expenses related to generating electricity from fossil-fuels, which are about 40% of total operating expenses. An electric utility owns multiple plants and these plants differ depending on the type of fuel they burn. The electric utilities I consider all own coal, oil and natural gas plants.7 Coal plants are typically

baseload plants since these plants run continuously and cost-effectively meet some minimum level of electricity demand that the utility expects. In contrast, natural gas plants are peaking plants which are only turned on and utilized during times when demand is high and baseload plants are inadequate to meet demand. The electric utilities in my sample primarily rely on coal to produce electricity. The average ratio of coal consumption to total fuel burned is 92%. Figure 1 illustrates the electricity generation process in a coal-fired power plant. First, coal is fed into mills and pulverized into fine powder. This fine powder is mixed with air and then blown into the boiler’s furnace and burned. At the same time, water flows through tubes inside the boiler. The burned coal releases heat which then turns water inside the boiler into high pressure steam. The high pressure steam rotates the turbine blades and the attached generator converts mechanical energy into electrical energy. The coal-burning process also produces by-products such as ash and emissions. Ash is collected while emissions flow through the plant’s stacks and into the atmosphere.

Figure 1: Coal-fired power plant

Source: http://edu.glogster.com/glog.php?glog_id=15469719

Coal-fired plants account for 65% of SO2 emissions (Environmental Protection Agency, 2001). Coal contains sulfur and SO2 is released to the atmosphere as a by-product when the coal is burned. Sulfur content ranges from about 0.2 pounds per million Btu of heat input (lbs/MMBtu)8 to about 7 lbs/MMBtu (Perry et al, 1997) and coal used for fuel is generally categorized either as bituminous or sub-bituminous. Bituminous coal tends to have a higher heat content but also high sulfur content compared to sub-bituminous coal. There is typically a tradeoff between heat and sulfur content so absent pollution regulation, plants tend to burn coal with higher sulfur content. Distance of the plant from coal mines is another factor that determines coal choice since transportation costs are a significant component of delivered prices. The dirtiest plants in terms of SO2 are those that are located far from sources of lower sulfur coal.9

7 Electricity output of the utility can also come from nuclear plants and from other plants not owned by the utility (purchased power). I exclude these sources from my cost measure and only focus on output from coal, oil and natural gas plants.
8 Roughly, a British thermal unit (Btu) is the amount of energy required to increase the temperature of 1 lb of water by 1 degree Fahrenheit. One MMBtu is equal to one million Btus.

Two primary forms of SO2 emissions abatement are fuel-switching (or blending), and installation of a flue-gas desulfurization (FGD) unit or scrubber. Fuel-switching involves using coal with lower sulfur content or blending different types of coal with varying sulfur contents. This form of compliance has a direct impact on electricity production costs. Lower sulfur coal produces less heat, hence more coal has to be burned to produce the same quantity of electricity. As a second form of compliance, a plant can install an FGD which is an end-of-pipe control technology installed near the plant’s emission stacks. The plant can still burn high sulfur coal, and the FGD will scrub SO2 from the emissions stream. Although installing a scrubber can also affect the cost of producing electricity by lowering fuel efficiency of the plant (Fabrizio et al, 2007), capital and installation costs are the main components of abatement cost and are less captured by the cost of producing electricity. I focus on fuel-switching as an abatement strategy and measure marginal abatement cost as the increase in the cost of producing electricity for an incremental reduction in emission rates. If $\tilde{C}(q, s)$ is the cost of producing electricity q given an emission rate of s, then marginal abatement cost in dollars per lbs/MMBtu is10

\[
MAC = -\frac{\partial \tilde{C}(q, s)}{\partial s} > 0.
\]
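To make the marginal abatement cost definition concrete, here is a minimal numerical sketch. The cost function `cost`, its coefficients, and the output and heat-input figures are purely hypothetical placeholders (they are not estimates from the paper); the sketch only illustrates taking the negative of the derivative of cost with respect to the emission rate, and converting the result to per-ton terms under the assumption that heat input is held fixed and a ton is 2,000 lbs.

```python
def cost(q, s):
    """Hypothetical cost (dollars) of generating q MWh at emission rate s (lbs/MMBtu).

    Cost falls as s rises, mimicking cheaper high-sulfur fuel.
    """
    return q * (20.0 + 4.0 / s)


def marginal_abatement_cost(cost_fn, q, s, h=1e-5):
    """MAC = -dC/ds, approximated by a central finite difference."""
    return -(cost_fn(q, s + h) - cost_fn(q, s - h)) / (2.0 * h)


def mac_per_ton(mac, heat_input_mmbtu, lbs_per_ton=2000.0):
    """Convert MAC from dollars per (lb/MMBtu) to dollars per ton of SO2.

    Lowering the emission rate by 1 lb/MMBtu while burning
    `heat_input_mmbtu` of fuel removes heat_input_mmbtu pounds of SO2.
    """
    return mac * lbs_per_ton / heat_input_mmbtu


q = 1_000_000.0             # MWh, illustrative
s = 2.0                     # lbs/MMBtu, illustrative
heat_input = 10_000_000.0   # MMBtu, illustrative (10 MMBtu per MWh)

mac = marginal_abatement_cost(cost, q, s)
mac_ton = mac_per_ton(mac, heat_input)
```

With this placeholder cost function the derivative is available in closed form (4q/s^2), which makes it easy to check the finite-difference approximation.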

Fuel-switching was the popular abatement method during my sample period (1988-1999). In my sample, there are only 15 plants out of about 150 that newly installed an FGD. Plants with FGDs represent only 20% of all the plants. This number includes plants that installed FGDs to satisfy SO2 regulations that were in place before Title IV of the Clean Air Act Amendments of 1990. The share of abatement from fuel-switching during this period ranged from 54% to 60% (Ellerman and Montero, 2007, Table 5). Utility-level differences in productivity and cost efficiency depend on the portfolio of plants the utility owns and the manpower involved to run these plants. While I focus on overall utility-level cost efficiency, an

9 Rail deregulation and falling delivered prices of sub-bituminous coal from the Powder River Basin (PRB) made this type of coal more competitive. However Ellerman et al (1990, p. 89) note that although the competitiveness of PRB coal led to an overall decrease in contracted prices of coal, long-term contracts continued delivering high sulfur coal.
10 This measure of marginal abatement cost can be converted to per-ton terms by using information on the amount of fuel burned (in MMBtu).

important driver of differences of cost efficiencies across firms is the individual efficiencies of the plants they own. Because fuel expenses make up 75% of operating expenses (excluding capital), an important aspect of plant-level efficiency is fuel efficiency. More importantly, fuel efficiency directly impacts abatement costs when a significant part of emission reductions come from fuel-switching. Differences in fuel efficiency can be driven by factors related to manpower. At the plant-level, Bushnell and Wolfram (2007) document differences in plant operator skill and effort levels that lead to significant differences in plant efficiency. While some processes are automated, activities such as controlling the rate at which coal mills feed pulverized fuel to burners, adjusting the mix of air and fuel in the mills, and operating soot blowers in boilers crucially depend on the plant operator’s skill and effort levels, especially at coal-fired plants. Despite the impact on plant efficiency of the “operator effect”, salaries of plant operators are not commensurate to the cost differences induced by plant efficiency, and managers have rarely instituted personnel policies directly aimed to improve operator efficiency. The authors remark that one reason for such a lack of policies is that existing economic regulation does not provide adequate incentives to the firm and its managers to improve efficiency. Another dimension where “effort” can affect operating costs is via fuel procurement. H. S. Chan et al (2012) find evidence that restructuring lowered fuel procurement costs by about 6%. The idea is that rate regulation may not be providing enough incentives for the firm’s managers to find the best price or to renegotiate long-term contracts.

2.1 SO2 emissions regulation

SO2 produces sulfates when emitted in the atmosphere and these particles can lead to heart and lung disease (EPA, 2009). SO2 is also a precursor of acid rain which has adverse effects on the eco-system.11 Ellerman et al (1990, Ch 2) provide a detailed summary of the political history of SO2 emissions regulation. I highlight some interesting points in what follows. The traditional form of pollution regulation is command-and-control where the regulator either sets a fixed uniform upper bound on the emission rate of firms (uniform emission standard) or requires firms to install specific control technologies (technology mandate). The Clean Air Act Amendments (CAAA) of 1970 established the New Source Performance Standards (NSPS) as a direct form of SO2 emissions regulation. NSPS required new coal-fired plants to have an emission rate below 1.2 lbs/MMBtu which can be met by burning lower sulfur fuel. Older plants were not subjected to this requirement but it was expected that these plants would be retired

11 Acid rain is formed when SO2 is emitted in the atmosphere and mixed with water, oxygen and oxidants to form acidic compounds that eventually fall back to the earth (National Acid Precipitation Assessment Program, 2005). Acid rain increases the acidity of lakes and other bodies of water, leads to the degradation of forests and soil quality, and damages structures (EPA, 2007).

in the near future. The CAAA was further amended in 1977 and essentially required new plants to install scrubbers despite already meeting the NSPS emission rate. Old plants were again shielded from this requirement. However the expected retirements never materialized. By 1985, 83% of emissions from power plants came from these exempted old plants. Concerns about the adverse effects of acid rain on the eco-system served as impetus to enlarge the scope of SO2 emissions regulations to coal-fired plants that were not subject to NSPS. Recognizing that plants have heterogeneous abatement capabilities and that firms have better information on what these capabilities are, policy-makers moved from the one-size-fits-all regime to a decentralized, market-based regime. This led to the creation of the Acid Rain Program (ARP) under Title IV of the Clean Air Act Amendments of 1990. Firms were required to hold emission permits for each ton of emission and these permits can be traded in a market. While generally lauded as a success (G. Chan et al, 2012), the legislative history of ARP illustrates that implementation of the program largely hinged on the ability to redistribute the benefits of abatement and compensate affected polluting sources via freely allocated initial permits (Joskow and Schmalensee, 1998; Ellerman et al, 1990 Ch 3; G. Chan et al, 2012). Around 6 million permits were grandfathered (Joskow and Schmalensee, 1998), which had a value of about $600M to $1.8B. This type of redistribution has its own costs since forgone revenues from grandfathered permits could have been “recycled” and used to reduce distortionary taxes elsewhere in the economy (Goulder et al, 1997). This issue leads to debates on whether the government should grandfather emission permits or sell them in an auction (see for example, Cramton and Kerr (2002)).
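As a quick back-of-the-envelope check on the permit figures quoted above, the sketch below divides the aggregate valuation range by the number of grandfathered permits. The helper name `implied_value_per_permit` is hypothetical, and the permit count is the rounded "around 6 million" from the text, so the result is only an approximate implied average value per permit (each permit covering one ton of emissions).

```python
def implied_value_per_permit(total_value_dollars, permits):
    """Average dollar value per permit implied by an aggregate valuation."""
    return total_value_dollars / permits


permits_grandfathered = 6_000_000   # "around 6 million" in the text
low = implied_value_per_permit(600e6, permits_grandfathered)
high = implied_value_per_permit(1.8e9, permits_grandfathered)
```

Under these rounded inputs the implied range is roughly $100 to $300 per permit.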

2.2 Rate case

The traditional form of economic regulation is rate regulation.12 Rate regulation is primarily conducted within a rate case. The rate case is a quasi-judicial proceeding whose main goal is to set the revenue requirement, which forms the basis for the regulated prices to charge consumers. The revenue requirement is the total amount that needs to be collected from consumers to compensate the firm for providing services. It is the sum of operating expenses and the return on the rate base (RRB), which is the monetary value assigned to the firm's invested capital (rate base) multiplied by an allowed rate of return. RRB can be thought of as the utility's profit over and above its operating costs. The rate case serves as a platform for the firm to provide information about its operating cost and environment to the regulator (public utility commission or PUC), who then decides on what revenue requirement to authorize. The case is typically initiated by the firm although the regulator, urged by consumer groups, can also initiate a case. A hearing takes place where the firm and concerned parties (e.g. consumer interest groups) participate and provide testimony on the rationale of the proposed changes and the potential impacts these may have on consumer welfare. The firm (and its experts), consumer groups, and commission staff testify to support their positions and to refute opposing arguments. A discovery phase also occurs where bodies of facts and data are presented. If a settlement between concerned parties is not reached, the PUC commissioners decide on the case. The decision consists of the approved revenue requirement, which often differs from the initial proposal of the firm.

12 The traditional form of regulation is also sometimes called rate-of-return regulation or cost-of-service regulation.

In theory, the debate and disagreement in rate cases revolve around three elements: operating expenses, the rate base, and the rate of return. In practice, major rate cases focus on the determination of the rate base and especially the rate of return. Reported operating expenses are typically passed through as long as these abide by certain accounting rules.

To give a flavor of what goes on in a rate case, I summarize a few rate cases in the appendix. These cases come from written reports prepared by the Regulatory Research Associates (RRA). Consistent with Alt's (2006, p. 27) guide to major rate cases, most of the disallowances in expenses are actually accounting-related adjustments. A typical expense that is disallowed concerns depreciation of the firm's fixed assets. Presumably, it is harder to find strong, admissible evidence that the firm operated inefficiently, while deviations from accounting adjustment rules are more tangible. In terms of the rate base, the PUC may disallow certain assets if they do not satisfy the "used and useful" criterion. For example, in the case involving Gulf Power and the Florida PUC in the appendix, the firm's stake in a plant was disallowed because the PUC concluded that the firm already has enough capacity. The sample cases in the appendix provide examples of how the authorized rates of return are reached.
The firm starts with a proposed rate of return, predicated on a proposed capital structure, cost of debt, and return on equity. The firm presents witnesses to support its proposal. The PUC staff performs its own research and presents what the rates should be based on its findings. Typically the commission staff reports a range of rates of return. The PUC commissioners examine the firm's and staff's arguments and finally vote on what rate to authorize. The PUC can punish the firm for "unethical or illegal" activities by imposing a deduction on the firm's rate of return (see Gulf Power case in the appendix). Thus the PUC can potentially use the rate of return as an incentive for the firm to operate efficiently. The model I present in the next section allows the regulator to use the authorized rate of return as such an incentive device. Whether the regulator actually uses this device is an empirical question (see section 4).
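The accounting identity behind the revenue requirement can be written out in a few lines. The sketch below is only an illustration: the function names and the dollar figures are hypothetical and are not taken from any rate case in the paper.

```python
def return_on_rate_base(rate_base, rate_of_return):
    """RRB: the rate base multiplied by the allowed rate of return (a fraction)."""
    return rate_base * rate_of_return


def revenue_requirement(operating_expenses, rate_base, rate_of_return):
    """Revenue requirement: operating expenses plus the return on the rate base."""
    return operating_expenses + return_on_rate_base(rate_base, rate_of_return)


# Hypothetical utility, all amounts in $M (illustrative numbers only).
proposed = revenue_requirement(330.0, 2500.0, 0.102)
authorized = revenue_requirement(330.0, 2400.0, 0.098)
disallowed = proposed - authorized

print(f"proposed revenue requirement:   {proposed:.1f}")
print(f"authorized revenue requirement: {authorized:.1f}")
print(f"disallowed amount:              {disallowed:.1f}")
```

In this example the operating expense is passed through unchanged, so the whole disallowance comes from the rate base and the rate of return, which mirrors the description of where major rate cases tend to focus.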


3 Model

This section presents the model of rate regulation. Specifically, I model the rate case as a signaling game of auditing. Before discussing the model of rate regulation, I take a step back and talk about welfare in the social planner's problem. Ultimately I want to compute welfare under the optimal form of pollution regulation, which is a counterfactual. The definition of welfare tells us what elements are necessary for this computation. The purpose of the rate regulation model is to rationalize the observed data, which allows me to back out these elements.

Consider a social planner whose responsibility encompasses both pollution and economic regulation. The social planner is the combination of the pollution regulator (Environmental Protection Agency) and the economic regulator (Public Utility Commission). The planner faces a (unit mass) population of electric utilities, each endowed with a type $(\theta, R)$. The variable $\theta$ affects the firm's operating cost of producing electricity and emissions (abatement), while $R$ is the firm's capital costs. I assume $\theta \in [0, \theta_U]$ and $R \in [0, R_U]$, where $\theta_U$ and $R_U$ are finite upper bounds. I also assume electric utilities' types are drawn independently from a joint probability distribution $F$.

The goal of the social planner is to maximize expected welfare and I assume regulation is static. Following Laffont (1994), I define (expected) social welfare as

$$W = \int \left\{ V(q(\theta,R)) - D(s(\theta,R)) - (1+\lambda)\, t(\theta,R) + \pi(\theta,R) \right\} dF \tag{1}$$

where $V(q(\theta,R))$ is the gross consumer surplus from electricity produced by firm $(\theta,R)$, i.e. $q(\theta,R)$; $D(s(\theta,R))$ is the pollution damage given emission rate $s(\theta,R)$; $t(\theta,R)$ is a lump-sum transfer paid to the firm; $\lambda$ is the social cost of public funds; and $\pi(\theta,R)$ is the firm's profit. Thus welfare is the sum of consumer and producer surplus, taking into account that transfers to the firm are funded by distortionary taxes to consumers. The planner decides on quantities and transfers to maximize welfare. The profit of firm $(\theta,R)$ is given by

$$\pi(\theta,R) = t(\theta,R) - \left[ \exp(\theta - e(\theta,R))\, C(q(\theta,R), s(\theta,R)) + \psi(e(\theta,R)) + R \right]. \tag{2}$$

The term in square brackets is the firm's total economic cost, which is composed of three elements. First, the operating cost of producing electricity $q$ and emission rate $s$ is given by $\exp(\theta - e)\, C(q, s)$ where $e \ge 0$ is managerial effort. Second, $\psi(e)$ captures the disutility from managerial effort. I assume the firm and its managers are one entity so $\psi(e)$ appears in the firm's total cost. Finally $R$ is the firm's capital cost.

In order to evaluate welfare under different counterfactual regulatory regimes (including the optimal one), I need to figure out how firms behave when facing regulation. The elements required are the distribution of types $F$, the disutility function $\psi(\cdot)$ and the (baseline) cost function $C(q, s)$. To identify these elements from the data, I exploit the fact that electric utilities were subject to economic regulation. Although the focus of the paper is pollution regulation, the model of economic regulation allows me to back out the required primitives from firms' observed behavior, and estimate the distribution of marginal abatement costs. It is therefore important to use a model that realistically captures the actual form of regulation.
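To make the roles of the primitives concrete, the profit and welfare definitions can be sketched for a discrete set of firm types. Everything specific below is an assumption for illustration: the quadratic disutility, the numerical values, the symbol names for the type and the cost of public funds, and the discrete probability weights standing in for the type distribution.

```python
import math


def operating_cost(theta, effort, base_cost):
    """Operating cost exp(theta - e) * C(q, s), with base_cost standing in for C(q, s)."""
    return math.exp(theta - effort) * base_cost


def profit(transfer, theta, effort, base_cost, disutility, capital_cost):
    """Transfer minus total economic cost (operating cost + effort disutility + capital cost)."""
    total_cost = operating_cost(theta, effort, base_cost) + disutility(effort) + capital_cost
    return transfer - total_cost


def expected_welfare(firms, weights, lam):
    """Probability-weighted sum of V - D - (1 + lam) * t + profit over a discrete type grid."""
    return sum(
        w * (f["V"] - f["D"] - (1.0 + lam) * f["t"] + f["profit"])
        for f, w in zip(firms, weights)
    )


# Hypothetical inputs: quadratic disutility and a single firm type.
psi = lambda e: 0.5 * e * e
pi = profit(transfer=10.0, theta=0.0, effort=0.0, base_cost=4.0,
            disutility=psi, capital_cost=3.0)
firms = [{"V": 20.0, "D": 2.0, "t": 10.0, "profit": pi}]
print(pi, expected_welfare(firms, weights=[1.0], lam=0.3))
```

The sketch only evaluates the two expressions at given allocations; it does not solve the planner's problem.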

3.1 Rate regulation

As discussed in section 2.2, rate regulation is carried out through the rate case where the electric utility and the regulator (i.e. public utility commission or PUC) set the revenue requirement. The revenue requirement is the amount of revenues the firm is allowed to collect from consumers and regulated prices are based on this amount. The revenue requirement is the sum of firm expenses and the return on the rate base (RRB). RRB is equal to a rate of return multiplied by the rate base, which is the value of invested capital that the firm is allowed to earn profits on. In the model, the firm's RRB is represented by $R$.

Ideally the regulator would like to set the revenue requirement equal to the firm's total economic costs so that the firm earns zero economic profits. As seen from equation (2), economic profits depend on $(\theta, R)$ and $e$, which I assume the regulator does not observe. The hidden type $\theta$ and hidden effort $e$ are standard components of private information in regulatory models in the spirit of Laffont and Tirole (1986). As in Laffont and Tirole (1986), I assume the regulator observes $\exp(\theta - e)\, C$ but not $\theta$ and $e$ separately. Thus a firm with high operating cost may either be a high cost type or a firm that did not exert effort.

The second dimension of unobserved type is the required RRB. Much of the debate in rate cases is on what the fair rate of return should be and what should be included in the rate base (i.e. prudently incurred investments that are used and useful). Instead of separately modeling these two components, I assume that $R$ is private information of the firm.13 Owners of the firm are likely to have better information about what investments need to be pursued and what outside investment opportunities exist.

13 I plan to explore in the future the case where the firm can use capital as a signal.

The rate case acts as a platform for the regulated firm to propose a revenue requirement and share relevant information, and for the regulator to use this information and decide on what revenue requirement to authorize. I assume that there is a fixed and known level of output that the firm has to meet, and the rate case is about how to compensate the firm for producing this output. I model the rate case as a signaling game. The purpose of the model is to provide predictions on how the firm behaves during a rate case and immediately after it. Thus the full model is a three period (year) model: an initial period where the firm draws its type; a second period where the rate case actually occurs; and a third period which is the year after the rate case. I first present the timing of the full game and then discuss the firm and regulator's payoffs and optimization problems after. The timing of the game is as follows:

t = 0 (Initial)
- Firm draws $(\theta, R)$ from a joint distribution $F$.

t = 1 (Rate case)
- Firm produces the required output by exerting effort $e_1$. Firm proposes the return on the rate base (RRB) denoted by $\hat{R}$ and reports operating cost $\hat{C} = \exp(\theta - e_1)\, C$.
- The regulator observes $(\hat{C}, \hat{R})$ and fully passes through reported operating cost, i.e. authorized expense is equal to $\hat{C}$. To determine authorized RRB, the regulator decides on auditing intensity $\alpha \in [0, 1]$ and incurs auditing cost $A(\alpha)$.
- An auditing technology leads to an authorized RRB denoted by $\bar{R}$. The authorized revenue requirement is thus $\hat{C} + \bar{R}$, and the firm is allowed to collect this amount at the end of the period.

t = 2 (Post-rate case)
- Given the authorized revenue requirement, the firm produces the required output by exerting effort $e_2$.

The formal signaling part of the game occurs when the firm proposes a return on the rate base (RRB) $\hat{R}$. The regulator observes $(\hat{C}, \hat{R})$ and determines the revenue requirement as follows. First, the regulator fully passes through observed operating cost by setting authorized expense equal to $\hat{C}$.

Second, to determine the authorized RRB, the regulator audits the firm. In particular, the regulator chooses auditing intensity $\alpha \in [0, 1]$ and this determines how close the authorized RRB $\bar{R}$ is to the true $R$. Larger values of $\alpha$ reflect tougher auditing but this entails a nonlinear cost $A(\alpha)$. I describe the auditing technology in the discussion of the regulator's problem.

I make the following assumptions. First I assume that message spaces are $[0, \hat{R}_U]$ for $\hat{R}$, and $(0, \exp(\theta)\, C]$ for the reported cost $\hat{C}$. I assume $\hat{R}_U \ge R_U$, which means I allow firms to propose an RRB that is larger than the highest possible $R$.14 As for the message space of $\hat{C}$, I allow the firm to report almost zero costs, which it can do if it exerts an extremely high level of effort. Since a firm with type $\theta$ generates the report $\hat{C}$ through production, the largest possible $\hat{C}$ is when it exerts zero effort.

14 This assumption is not necessary although it helps provide a clean characterization of equilibrium. The assumption basically allows a fully-separating equilibrium to exist. Without the assumption, the fully-separating equilibrium becomes a partially-pooling equilibrium with $R$-types in the upper edge of its type space pooling on the signal $R_U$.

Second, I assume auditing cost is strictly increasing and strictly convex in the auditing intensity: $A' > 0$ and $A'' > 0$. If auditing is viewed as a kind of information-gathering process, then this assumption means that it is cheap to gather information at the start but as the regulator exhausts the pool of useful information, new useful information is harder to come by.

3.2 Regulator's problem

Define $V$ as the sum of gross consumer surplus from output during t = 1 and 2. The revenue requirement is the payment to the firm collected from consumers and so this reduces consumer surplus. Authorized expense is equal to the observed operating cost during the rate case, i.e. $\hat{C}$. The authorized RRB is determined via auditing and is equal to $\bar{R}$. Thus authorized revenue requirement is $\hat{C} + \bar{R}$. I assume that auditing costs $A(\alpha)$ are shouldered by consumers so this reduces consumer surplus. Finally I assume that the regulator only cares about consumer surplus and thus welfare is15

$$W_{PUC} = V - 2\left\{ \hat{C} + \bar{R} \right\} - A(\alpha).$$

The regulator is required to authorize a rate of return that is fair from the point of view of the firm, given prudently incurred investment. The law does not provide specific guidance as to how the fair rate of return is determined except that the regulator, in determining the rate, "has made a reasonable attempt to ensure that the results of its actions are not confiscatory or unfairly burden any of the parties to the proceeding" (Joskow, 1974). To eliminate potential expropriation by the auditing technology, I assume that auditing is biased in the sense that it always produces an authorized RRB above $R$, i.e. $\bar{R} \ge R$.

15 This assumption is not critical for the results. It suffices to have the regulator put a higher weight on consumer surplus relative to producer surplus. Notation becomes more complicated when welfare is defined as a weighted sum of consumer and producer surplus since the regulator has to apply its beliefs on the firm's profit.

There is no obvious way to model the auditing technology. Banks (1992) and Baron and Besanko (1984) have modeled this as a perfect technology, i.e. the regulator chooses the probability of audit and auditing perfectly reveals $R$. The problem with such an interpretation is that there is no direct link between the authorized RRB in the data and the model. I instead model auditing as a result of a technology with the property that greater auditing intensity leads to an authorized RRB that is "nearer" the true $R$. Since auditing becomes more expensive as intensity increases, the regulator faces a tradeoff between authorizing a rate of return that is closer to $R$ and paying a higher auditing cost, or authorizing a rate that is closer to $\hat{R}$ and paying a lower auditing cost. I assume

$$\bar{R} = \alpha R + (1 - \alpha)\hat{R}. \tag{3}$$

Thus an increase in auditing intensity puts more weight on the true RRB. A formal example of how to interpret the technology is as follows. Imagine that auditing intensity $\alpha$ generates a distribution with support $[R, \hat{R}]$ and this distribution is decreasing in $\alpha$, in the first order stochastic sense. Assume that the regulator authorizes an RRB equal to the mean of this distribution. Thus higher values of $\alpha$ would lead to authorized RRBs that are closer to $R$. A distribution that generates (3) as its mean is a four-parameter beta distribution with shape parameters $(1 - \alpha)$ and $\alpha$, and bounds $R$ and $\hat{R}$.

The regulator knows how the auditing technology works. However it does not know true $R$ and hence the regulator has to form a conjecture when choosing $\alpha$. I denote this "belief" as $\varrho$.16 Finally note that the only observable that explicitly enters equation (3) is the firm's proposed RRB, $\hat{R}$.17 Nonetheless the belief $\varrho$ is allowed to be a function of the reported operating costs so it implicitly enters equation (3) through $\alpha$.
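The beta-distribution interpretation can be checked numerically. The sketch below is an illustration under assumed values (an auditing intensity of 0.3, a true RRB of 200 and a proposal of 300, all hypothetical): it compares the weighted average in equation (3) with the simulated mean of a beta distribution that has shape parameters equal to one minus the intensity and the intensity, rescaled to lie between the true and proposed RRB.

```python
import random


def authorized_rrb(alpha, true_rrb, proposed_rrb):
    """Equation (3): weight alpha on the true RRB, (1 - alpha) on the proposal."""
    return alpha * true_rrb + (1.0 - alpha) * proposed_rrb


def simulated_mean(alpha, true_rrb, proposed_rrb, draws=100_000, seed=0):
    """Monte Carlo mean of a Beta(1 - alpha, alpha) draw rescaled to [true_rrb, proposed_rrb].

    Requires 0 < alpha < 1 so that both shape parameters are positive.
    """
    rng = random.Random(seed)
    width = proposed_rrb - true_rrb
    total = 0.0
    for _ in range(draws):
        total += true_rrb + width * rng.betavariate(1.0 - alpha, alpha)
    return total / draws


# Hypothetical values, in $M.
print(authorized_rrb(0.3, 200.0, 300.0))
print(simulated_mean(0.3, 200.0, 300.0))
```

Because the two shape parameters sum to one, the mean of the unscaled beta draw is one minus the intensity, which is exactly the weight that equation (3) places on the proposal.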

Putting all these together, the regulator's problem is to choose auditing intensity $\alpha \in [0, 1]$ to maximize

$$W_{PUC} = V - 2\left\{ \hat{C} + \alpha\varrho + (1 - \alpha)\hat{R} \right\} - A(\alpha)$$

after observing $(\hat{C}, \hat{R})$. An interior optimal auditing strategy satisfies

$$A'(\alpha) = 2\left( \hat{R} - \varrho \right). \tag{4}$$
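As an illustration of condition (4), suppose the auditing cost is quadratic in the intensity. This functional form is an assumption made only for the sketch (the paper requires only a strictly increasing, strictly convex cost), and the numbers are hypothetical. With a quadratic cost the interior solution is available in closed form and can be compared with a brute-force search over the regulator's objective.

```python
def regulator_objective(alpha, proposed_rrb, belief, cost_scale):
    """Part of regulator welfare that depends on alpha, with A(alpha) = cost_scale * alpha**2 / 2.

    Consumer surplus V and the pass-through expense are constants and are dropped.
    """
    expected_authorized = alpha * belief + (1.0 - alpha) * proposed_rrb
    return -2.0 * expected_authorized - 0.5 * cost_scale * alpha ** 2


def optimal_audit_intensity(proposed_rrb, belief, cost_scale):
    """Solve A'(alpha) = 2 * (proposal - belief) and clip the result to [0, 1]."""
    interior = 2.0 * (proposed_rrb - belief) / cost_scale
    return min(1.0, max(0.0, interior))


# Hypothetical values: proposal 300, belief about the true RRB 280, cost scale 100.
alpha_star = optimal_audit_intensity(300.0, 280.0, 100.0)
grid_best = max((i / 1000.0 for i in range(1001)),
                key=lambda a: regulator_objective(a, 300.0, 280.0, 100.0))
print(alpha_star, grid_best)
```

The larger the gap between the proposal and the regulator's belief, the more the regulator audits, up to the point where the intensity hits its upper bound.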

Given $\hat{R}$ and the belief $\varrho(\hat{R}, \hat{C})$, the regulator chooses auditing intensity such that the marginal cost of auditing is just equal to the marginal benefit. The marginal benefit of auditing is the amount that the firm gets from overstating its RRB when the regulator forgoes auditing.

16 Formally, a belief is a probability distribution $\mu((\theta, R) \mid \hat{R}, \hat{C})$. What I call "belief" in the body of the paper is actually
$$\varrho(\hat{R}, \hat{C}) = E_{\mu}(R) = \int R \, d\mu.$$

17 I have analyzed the more general model $\bar{R} = \alpha R + (1 - \alpha)\hat{R} - X(\hat{C}; \cdot)$, where $X(\hat{C}; \cdot)$ is a punishment for exerting less effort during the rate case compared to the "first best". I still find that there is an equilibrium where $e_1 = 0$ (similar to Proposition 1 except $\alpha_{\hat{C}} > 0$) and the data is consistent with this equilibrium.

3.3 Firm's problem

I define $U$ as the sum of the firm's profit during and after the rate case, i.e. $U = \pi_1 + \pi_2$. The firm incurs operating cost $\hat{C} = \exp(\theta - e_1)\, C$ during the rate case but receives the authorized revenue requirement $\hat{C} + \bar{R}$ at the end of the period. After the rate case the firm incurs operating cost $\exp(\theta - e_2)\, C$ and receives the authorized revenue requirement again. Thus the firm's profit is given by

$$U = 2\left[ \exp(\theta - e_1)\, C + \bar{R} \right] - \left[ \exp(\theta - e_1)\, C + \psi(e_1) + \exp(\theta - e_2)\, C + \psi(e_2) \right] - 2R$$

and the firm chooses $e_1$, $e_2$ and $\hat{R}$ to maximize $U$ given the firm's conjecture about the regulator's auditing strategy.

An interior optimal $\hat{R}$ equates the marginal benefit from increasing the proposal with its marginal cost:

$$(1 - \alpha) = \alpha_{\hat{R}}\, (\hat{R} - R)$$

where $\alpha_{\hat{R}} = \partial\alpha / \partial\hat{R}$. For a dollar increase in the proposed RRB, the authorized RRB increases by $(1 - \alpha)$. However the increase in $\hat{R}$ affects the regulator's auditing intensity. If the dollar increase makes auditing more intense, then $\alpha_{\hat{R}}\, (\hat{R} - R)$ reflects the loss of the firm from a tougher audit. Thus any interior solution requires $\alpha_{\hat{R}} > 0$, otherwise there is no cost to proposing larger values of $\hat{R}$. Finally the marginal cost of increasing the proposal $\hat{R}$ is decreasing in $R$. This feature allows sorting of types based on the proposed RRB.

The optimal $\hat{R}$ can be seen as a markup over $R$ where the markup reflects the elasticity of auditing intensity with respect to a change in the proposal:

$$\hat{R} = R + \frac{1 - \alpha}{\alpha_{\hat{R}}}.$$

The markup is larger the less sensitive auditing intensity is to increases in the proposed RRB.

The firm operates for two periods: during the rate case and after it. The optimal choice of effort after the rate case satisfies

$$\psi'(e_2) = \exp(\theta - e_2)\, C.$$

It equates the marginal disutility of effort with the cost reduction due to effort. Because the authorized revenue requirement is already fixed, the firm is the residual claimant to all cost savings due to effort and so the marginal benefit of effort is equal to $\exp(\theta - e_2)\, C$.

Optimal effort during the rate case satisfies the inequality

$$\psi'(e_1) \ge \left[ 2\alpha_{\hat{C}}\, (\hat{R} - R) - 1 \right] \exp(\theta - e_1)\, C$$

where $\alpha_{\hat{C}} = \partial\alpha / \partial\hat{C}$. If this inequality is strict, then the firm does not exert effort. The term on the right-hand side is the marginal benefit from exerting effort. The firm does not benefit from cost reductions during the rate case because these are fully passed through to consumers. Moreover, exerting effort reduces next period's revenues, hence creating further disincentives to exert effort. However if auditing is sufficiently increasing in the firm's operating cost, then the firm may have incentives to exert a positive level of effort. Thus a necessary condition for a positive optimal level of effort is that auditing becomes tougher when the regulator observes larger operating costs.
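The two effort conditions can be solved numerically once a disutility function is chosen. The sketch below assumes a quadratic disutility, so that marginal disutility is linear in effort; this functional form, the parameter names and the numbers are illustrative assumptions, not the specification estimated in the paper. The same bisection routine handles the post-rate-case condition (marginal benefit scaled by one) and the rate-case condition (marginal benefit scaled by the bracketed term, with zero effort as the corner solution when that term is not positive).

```python
import math


def solve_effort(theta, base_cost, k, benefit_scale=1.0):
    """Solve k * e = benefit_scale * exp(theta - e) * base_cost for e >= 0 by bisection.

    Assumes quadratic disutility psi(e) = k * e**2 / 2. Returns the corner e = 0
    when the marginal benefit of effort is not positive.
    """
    if benefit_scale <= 0.0:
        return 0.0
    lo = 0.0
    hi = benefit_scale * math.exp(theta) * base_cost / k  # marginal disutility exceeds benefit here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if k * mid < benefit_scale * math.exp(theta - mid) * base_cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_case_effort(theta, base_cost, k, audit_slope_cost, proposed_rrb, true_rrb):
    """Effort during the rate case; audit_slope_cost is the derivative of audit intensity
    with respect to reported operating cost."""
    scale = 2.0 * audit_slope_cost * (proposed_rrb - true_rrb) - 1.0
    return solve_effort(theta, base_cost, k, scale)


# Hypothetical parameters.
e2 = solve_effort(theta=0.0, base_cost=1.0, k=1.0)
e1_flat = rate_case_effort(0.0, 1.0, 1.0, audit_slope_cost=0.0,
                           proposed_rrb=300.0, true_rrb=280.0)
e1_steep = rate_case_effort(0.0, 1.0, 1.0, audit_slope_cost=0.05,
                            proposed_rrb=300.0, true_rrb=280.0)
print(e2, e1_flat, e1_steep)
```

When auditing does not respond to reported cost, the bracketed term equals minus one and the firm exerts no effort during the rate case, which is the corner case highlighted in the text.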

3.4 Equilibrium

I use Weak Perfect Bayesian Equilibrium (PBE) as my equilibrium concept. In my context, a PBE is defined as follows. Note that instead of specifying a probability distribution $\mu((\theta, R) \mid \hat{R}, \hat{C})$ for the beliefs, I directly use

$$\varrho(\hat{R}, \hat{C}) = \int R \, d\mu$$

and refer to this object as the regulator's "beliefs". Finally, I restrict to differentiable auditing strategies in the equilibrium definition. This allows me to characterize equilibria using partial derivatives of $\alpha$, which I denote as $\alpha_{\hat{R}}$ and $\alpha_{\hat{C}}$.

Definition 1 A Weak Perfect Bayesian Equilibrium of the game is characterized by a set of strategies $\hat{R}(\theta, R)$, $e_1(\theta, R)$ and $e_2(\theta, R)$ for the firm; a differentiable strategy $\alpha(\hat{R}, \hat{C})$ and "beliefs" $\varrho(\hat{R}, \hat{C})$ for the regulator, such that

1. Given $\alpha(\hat{R}, \hat{C})$ and $\varrho(\hat{R}, \hat{C})$, the functions $\hat{R}(\theta, R)$, $e_1(\theta, R)$ and $e_2(\theta, R)$ maximize the firm's profit $U$ for each $(\theta, R)$;

2. Given any $\hat{R}$ and $\hat{C}$, $\alpha(\hat{R}, \hat{C})$ maximizes welfare $W_{PUC}$ under the belief $\varrho(\hat{R}, \hat{C})$;

3. Beliefs $\varrho(\hat{R}, \hat{C})$ are updated via Bayes' rule, whenever possible.

As in most signaling games, the rate case game has multiple equilibria. One approach to reducing the set of equilibria is to adopt an equilibrium refinement.18 Instead of applying an equilibrium refinement, I focus on a particular separating equilibrium. I then check whether the data is consistent with predictions of this equilibrium in the next section.

18 For example, Banks' (1992) auditing model uses the Universal Divinity equilibrium refinement of Banks and Sobel (1987) to reduce the set of equilibria to a singleton. Besanko and Spulber (1992) adopt the same equilibrium refinement in their model of investment of regulated firms.

The equilibrium I will be focusing on has the following interesting feature. The regulator ignores the firm's operating cost during the rate case when deciding on its auditing intensity. This then eliminates any incentive for the firm to exert effort during the rate case. The firm effectively uses its proposed RRB to signal its true RRB and in equilibrium, the regulator's beliefs are correct, i.e.

$$\varrho\left( \hat{R}(\theta, R), \hat{C}(\theta, R) \right) = R$$

and the regulator successfully sorts types $R$ based on $\hat{R}$ (the regulator does not care about $\theta$ when deciding on auditing as far as $R$ is already "known"). The regulator then chooses auditing intensity based on this correct belief. Note that although the regulator correctly infers $R$, it still needs to produce admissible evidence to support $R$, which is done via auditing. This is the same assumption made in Banks' (1992) auditing model. I characterize the equilibrium in the following proposition.

Proposition 1 Suppose $\hat{R}_U$ is sufficiently high.19 The following fully-separating equilibrium exists: The firm exerts zero effort during the rate case,

$$e_1 = 0.$$

After the rate case, the firm chooses the "first-best" effort, i.e. $e_2$ solves

$$\psi'(e_2) = \exp(\theta - e_2)\, C.$$

In this equilibrium, the firm proposes RRB such that

$$\hat{R} = R + \frac{1 - \alpha}{\alpha_{\hat{R}}}.$$

The regulator ignores the operating cost signal $\hat{C}$ in its auditing strategy $\alpha(\hat{R}, \hat{C})$, and in particular,

$$\alpha_{\hat{C}} = 0.$$

Finally, the regulator's auditing strategy is increasing in the proposed RRB and is the solution to

$$\int \frac{A'(\alpha)}{1 - \alpha}\, d\alpha = 2\hat{R}.$$

19 What I need is for $\hat{R}_U$ to be larger than a threshold $\tilde{R}$, where $\tilde{R}$ solves
$$\left( 1 - \alpha(\tilde{R}, \hat{C}) \right) = \alpha_{\hat{R}}(\tilde{R}, \hat{C})\, (\tilde{R} - R_U).$$
That is, the optimal proposal of the largest type $R_U$ is still an interior proposal. Note that in this equilibrium $\alpha$ is not a function of $\hat{C}$ so $\tilde{R}$ does not depend on $\hat{C}$ as well.

Proof. See appendix.

The main equilibrium predictions that I will check in the data are the following. First, operating costs will tend to be higher during the rate case compared to the year after. I also check whether heat rates are higher during rate cases since short-run variation in heat rates is more likely due to changes in effort than changes in the firm's capital or the skill level of manpower. Second, the regulator's auditing strategy will be flat with respect to operating costs during the rate case. Third, the regulator's auditing strategy is increasing in the proposed RRB. After providing empirical support for the plausibility of this equilibrium, I use the equilibrium's prediction about optimal effort during and after the rate case to identify the distribution of types and the disutility function $\psi(\cdot)$.
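A numerical illustration of the separating equilibrium in Proposition 1 is possible under additional assumptions. The sketch below assumes a quadratic auditing cost, so that the integral condition has a closed form up to a constant of integration, and it pins down that constant with a hypothetical boundary point (a proposal of 100 audited at intensity 0.5). Neither the functional form nor the boundary point comes from the paper; they are stand-ins chosen so that the implied true RRB at the boundary is zero. The code then backs out the true RRB implied by each proposal from the regulator's first-order condition with correct beliefs.

```python
import math


def _g(alpha):
    """Antiderivative of alpha / (1 - alpha): -alpha - ln(1 - alpha)."""
    return -alpha - math.log(1.0 - alpha)


def equilibrium_audit_intensity(proposed_rrb, cost_scale, boundary_proposal, boundary_alpha):
    """Audit intensity solving cost_scale * [g(alpha) - g(alpha0)] = 2 * (proposal - proposal0).

    Assumes A(alpha) = cost_scale * alpha**2 / 2 and a user-supplied boundary point.
    """
    target = _g(boundary_alpha) + 2.0 * (proposed_rrb - boundary_proposal) / cost_scale
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if _g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def implied_true_rrb(proposed_rrb, cost_scale, boundary_proposal, boundary_alpha):
    """With correct beliefs, A'(alpha) = 2 * (proposal - R), so R = proposal - cost_scale * alpha / 2."""
    alpha = equilibrium_audit_intensity(proposed_rrb, cost_scale, boundary_proposal, boundary_alpha)
    return proposed_rrb - 0.5 * cost_scale * alpha


# Hypothetical parameters: cost scale 400, boundary proposal 100 audited at intensity 0.5.
params = dict(cost_scale=400.0, boundary_proposal=100.0, boundary_alpha=0.5)
for proposal in (100.0, 150.0, 200.0):
    a = equilibrium_audit_intensity(proposal, **params)
    r = implied_true_rrb(proposal, **params)
    print(f"proposal={proposal:6.1f}  audit intensity={a:.3f}  implied true RRB={r:7.2f}")
```

In this parametrization larger proposals are audited more intensely and map to larger implied true RRBs, so the proposal sorts types as described in the proposition.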

An interesting question is whether there are equilibria that provide incentives to the firm to exert effort during the rate case. The following proposition characterizes a particular one. The complete characterization and the proof are in the appendix.

Proposition 2 The following equilibrium exists: The firm's rate case effort $e_1$ is positive and is the solution to

$$\left[ 2\alpha_{\hat{C}}\, (\hat{R}_U - R) - 1 \right] \exp(\theta - e_1)\, C = \psi'(e_1).$$

After the rate case, the firm chooses the "first-best" effort, i.e. $e_2$ solves

$$\psi'(e_2) = \exp(\theta - e_2)\, C.$$

In this equilibrium, every type $(\theta, R)$ proposes the largest possible RRB, i.e. $\hat{R} = \hat{R}_U$. However the regulator ignores the proposed RRB in its auditing strategy, and in particular,

$$\alpha_{\hat{R}} = 0.$$

The regulator's strategy is strictly increasing in $\hat{C}$, i.e.

$$\alpha_{\hat{C}} > 0.$$

Thus the regulator uses the firm's operating cost $\hat{C}$ as an informative signal about what $R$ is, albeit imperfectly since different groups of types pool at different values of $\hat{C}$.

4 Data and evidence

I construct a list of generating units affected by Phase I of the Acid Rain Program using compliance data from EPA's Air Markets Program. The compliance data includes all generating units that were part of Phase I. For each unit in this list, I get unit-level data on net electricity generation and nameplate capacity from the Energy Information Administration's (EIA) Form 767 for the period 1988-1999. I aggregate the data to the plant level and get data on emissions, fuel consumption (coal, oil and natural gas), and on whether the plant has a flue gas desulfurization (FGD) unit installed. I then aggregate these measures at the utility level so that I can match these to the regulatory and rate case data. Utility-level fuel prices are constructed from the Federal Energy Regulatory Commission's (FERC) Form 423 by averaging delivered prices across a utility's plants for each fuel type. Finally I match these utilities to the regulatory database of SNL Financial and extract data on fuel expense and non-fuel operations and maintenance expense related to electricity generation, excluding expenses from nuclear plants. I also get average monthly salaries of full-time employees involved in electricity generation. These comprise the operations and cost data.

The rate case data comes from Regulatory Research Associates (RRA), a research and consulting company owned by SNL Financial. These contain SNL utility codes that I use to match the rate case data with operations and cost data. I get data on the year the rate case was proposed, the year it was authorized, the test year, proposed and authorized rate base, and the proposed and authorized rate of return (ROR). From these data I can construct the proposed and authorized return on the rate base (RRB).

I am able to identify 84 utilities that own at least one Phase I plant by matching the EPA data with the EIA data. Of these I can match 69 utility codes to SNL's regulatory data. My primary variables are net generation from coal, oil and natural gas plants; emission rate; a dummy for whether the utility has at least one plant with an FGD; total nameplate capacity; and average prices for coal, oil and gas. The number of utilities with nonmissing data and with at least two rate cases during 1988-1999 goes down to 38.
Table 1 contains summary statistics for these utilities. The number of firm-year observations is 363. The O&M variable cost measure is the sum of fuel expense and non-fuel O&M expense related to electricity generation. Fuel expense accounts for about 75% of this measure on average. Moreover, on average coal accounts for about 92% of total fuel consumption (in MMBtu), while oil and natural gas account for about 5% and 3% respectively.

Table 2 contains rate case summary statistics for these utilities. On average, a rate case lasts just over a year and can extend for 3 years. The number of years from the time a rate case is authorized to the time a new rate case is proposed is 3 on average but can be as long as 6 years. A utility in my sample experienced between 2 and 3 rate cases during 1988-1999. The average RRB disallowance (proposed minus authorized) when measured as a percentage of proposed RRB is 7%. The percent disallowance in RRB ranges from 0%, i.e. no disallowance, to as high as 31%.
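The disallowance measure described above is a simple ratio. The sketch below computes it for a single case; the function name is hypothetical, and the inputs are the sample means of proposed and authorized RRB from Table 2, used only as convenient numbers.

```python
def percent_disallowance(proposed_rrb, authorized_rrb):
    """Disallowance (proposed minus authorized) as a percentage of the proposed RRB."""
    return 100.0 * (proposed_rrb - authorized_rrb) / proposed_rrb


# Sample means from Table 2, in $M, treated here as if they were a single case.
print(round(percent_disallowance(312.0, 287.0), 2))
```

The ratio of the two sample means is about 8%, which need not coincide with the 7% reported in the text, since the latter is an average of case-level percentages rather than a ratio of averages.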


Table 1: Summary statistics of operations and costs data

Variable         Unit           Mean         Std dev      Min       Max
O&M var cost     $M             330          257          23        1198
Net generation   MWh            2.4 x 10^7   2.4 x 10^7   558739    9.9 x 10^7
Emission rate    lbs/MMBtu      1.77         1.04         0.23      7.22
Nameplate        MW             4999         480          232       23227
FGD dummy        FGD dummy      0.33         .            0         1
Salary           $000/emp/mo    16.1         8.3          4.3       52.3
Price coal       $/ton          32.76        10.06        12.48     53.80
Price oil        $/barrel       23.29        59.17        10.06     51.39
Price gas        $/MMBtu        2.96         1.03         1.34      15.48

Table 2: Summary statistics of rate case data

Variable               Unit            Mean    Std dev   Min    Max
Rate case duration     Years           1.2     0.7       1      3
Time between cases     Years           3.0     2.1       1      6
Proposed RRB           $M              312     398       7      1868
Authorized RRB         $M              287     365       5      1644
Percent disallow RRB   % of Prop RRB   7       4         0      31
Proposed rate base     $M              2536    3251      73     15963
Authorized rate base   $M              2376    3054      66     14485
Proposed ROR           %               10.2    0.9       7.9    12.2
Authorized ROR         %               9.8     1.0       7.4    11.8

4.1 Preliminary analysis

Examining O&M variable costs in and outside of the rate case, I …nd that O&M variable costs are about 5% higher during a rate case. The basic regression involves regressing the log of variable O&M costs on log of output20 (electricity and emissions), input prices (labor, coal, oil and gas) and capital (nameplate and indicator if the …rm has a scrubber), together with indicator variables for whether the observation comes from years when the rate case is ongoing. I construct three indicator variables. The …rst dummy is equal to one if the observation falls during the rate case, i.e. from proposed to authorized year, inclusive. The second dummy is equal to one if the observation falls on the year immediately after the authorization year. Finally the third dummy is equal to one if neither of the two dummies are one. In the regression, the omitted dummy category is the second dummy so dummy coe¢ cients measure the % di¤erence relative to the year after the rate case concludes. Table 3 contains results from di¤erent speci…cations of the basic regression. I suppress the estimates for the other explanatory variables in this table. Focusing on the estimates for the rate case dummy, we see that average O&M variable costs are 5% higher during a rate case compared to the year after. Moving to the “neither” dummy coe¢ cient estimate, we …nd no statistically signi…cant di¤erences in O&M variable costs among non-rate case years. These results hold even when controlling for output, input prices, capital, and year e¤ects, and also by looking at within …rm and within …rm-rate case variation. This pattern of O&M variable cost implied by the regressions can be rationalized by the equilibrium characterized in Proposition 1. In that equilibrium, the …rm has no incentive to exert e¤ort during the rate case. However, once the revenue requirement is …xed, the …rm becomes the residual claimaint of cost-savings induced by e¤ort. 
This provides incentives to exert effort once the rate case has concluded.

This pattern is only suggestive and can be rationalized by other stories. For example, since rate cases are initiated by the firm, firms might strategically initiate rate cases when they know costs will be high in order to lock in the rates. In this story, exogenous differences in costs that the firm is aware of can explain the pattern. To further investigate whether the pattern is induced by the firm's effort, I look at whether the same pattern arises for heat rates, defined as the amount of fuel burned per unit of electricity produced. Short-run variations in heat rates are more likely due to effort (of either the plant manager or the operator) than to differences in equipment or skills. A higher heat rate means less efficient production since the firm burns more fuel to produce the same amount of electricity. I regress the log of heat rate on the log of electricity generated, log of capital, an indicator for FGD, and the rate case dummies. State electricity demand is

20 I include specifications where I use state-level electricity demand as an instrument for electricity output and regional prices for low and high sulfur coal as instruments for emission rates. Low sulfur coal is defined as coal with sulfur content below 1.2 lbs/MMBtu. First stage F-statistics are 164 and 27 for electricity output and emission rates respectively.


Table 3: Regression results: O&M variable cost and rate case dummies.

Dependent variable: log O&M variable cost

                                     (1)      (2)      (3)      (4)      (5)      (6)
Rate case                           0.052    0.053    0.057    0.047    0.046    0.044
                                   (0.028)  (0.027)  (0.022)  (0.017)  (0.019)  (0.021)
Neither rate case nor year after    0.028    0.016    0.006    0.011    0.012    0.015
                                   (0.046)  (0.044)  (0.020)  (0.025)  (0.026)  (0.028)
Year                                Trend    Yes      Yes      Yes      Yes      Yes
Firm                                No       No       Yes      No       No       No
Firm-Rate Case                      No       No       No       Yes      Yes      Yes
IV for electricity                  No       No       No       No       Yes      Yes
IV for emission rate                No       No       No       No       No       Yes
Num. Obs.                           363      363      363      363      314      314

Notes: Standard errors are clustered at either the firm or the firm-rate case level. Regressions are estimated by OLS except where indicated. Additional regressors are a dummy for FGD and the logs of electricity output, emission rate, input prices (labor, coal, oil and gas), and nameplate rating. I use the log of state electricity demand as an IV for electricity output and regional prices for low (

$\hat{C}'$. for any $\hat{C}$

.

b C b R R;

i

.

b C b = R;

b C b0 R;

2. Let $\hat{R}_U$ be the largest possible proposed RRB. For any $\hat{R}$ and $\hat{R}'$ in the data such that $\hat{R}_U > \hat{R} > \hat{R}'$, we have

Proof. See appendix.

b C b > R;

b0 ; C b , R

b C b > R;

The equilibrium characterized by Proposition 1 involves $\partial(\cdot)/\partial\hat{C} = 0$ and $\partial(\cdot)/\partial\hat{R} > 0$. Proposition 3 allows us to check these predictions using data on disallowances $\Delta$, O&M variable cost $\hat{C}$ and proposed RRB $\hat{R}$. Figures 2 and 3 contain partial (local polynomial) regression plots of $\Delta$ on $\hat{C}$ and $\hat{R}$ respectively. These partial regression plots are constructed as follows. Consider the partial regression plot with respect to $\hat{C}$. I first regress $\Delta$ on $\hat{R}$, output, capacity, and firm-, year- and state-fixed effects. Then I get the residual and normalize the location by adding back the mean of $\Delta$. Next, I regress $\hat{C}$ on the same set of explanatory variables, get the residual, and normalize the location. Finally, I do a local polynomial regression of the normalized $\Delta$ residual on the normalized $\hat{C}$ residual. I do the same for $\hat{R}$ but using $\hat{C}$ in place of $\hat{R}$ in the set of explanatory variables. These partial regression plots provide support for $\partial(\cdot)/\partial\hat{C} = 0$ and $\partial(\cdot)/\partial\hat{R} > 0$, and hence the plausibility of the equilibrium characterized by Proposition 1. The next section discusses how Proposition 1 is used for identification.

Figure 2: Partial (local polynomial) regression of disallowances $\Delta$ on operating costs $\hat{C}$

[Plot not reproduced: local polynomial smooth with 95% CI and scatter of residuals; vertical axis Delta, horizontal axis C_hat; kernel = epanechnikov, degree = 2, bandwidth = 27.78, pwidth = 41.67]

Notes: The curve is a local polynomial regression of the normalized residuals of disallowances $\Delta$ on normalized residuals of operating costs $\hat{C}$, and serves as an estimate of $\partial(\cdot)/\partial\hat{C}$. The figure also shows the scatter plot of these residuals and a 95% confidence interval. The residuals for $\Delta$ come from a linear regression of $\Delta$ on return on the rate base $\hat{R}$, electricity output, capacity, and firm-, year authorized- and state-fixed effects, while residuals for $\hat{C}$ come from a linear regression on the same set of regressors. I normalize the residuals by adding the respective means. Dropping either firm- or state-fixed effects does not qualitatively change my estimate.


Figure 3: Partial (local polynomial) regression of disallowances $\Delta$ on return on the rate base $\hat{R}$

[Plot not reproduced: local polynomial smooth with 95% CI and scatter of residuals; vertical axis Delta, horizontal axis R_hat; kernel = epanechnikov, degree = 2, bandwidth = 20.24, pwidth = 30.36]

Notes: The curve is a local polynomial regression of the normalized residuals of disallowances $\Delta$ on normalized residuals of return on the rate base $\hat{R}$, and serves as an estimate of $\partial(\cdot)/\partial\hat{R}$. The figure also shows the scatter plot of these residuals and a 95% confidence interval. The residuals for $\Delta$ come from a linear regression of $\Delta$ on operating costs $\hat{C}$, electricity output, capacity, and firm-, year authorized- and state-fixed effects, while residuals for $\hat{R}$ come from a linear regression on the same set of regressors. I normalize the residuals by adding the respective means. Dropping either firm- or state-fixed effects does not qualitatively change my estimate.

5 Empirical model

I estimate a multiproduct cost function to provide a measure of (marginal) abatement costs for electric utilities.22 The cost of reducing emissions is reflected by the increase in the cost of producing electricity due to changes in production methods, i.e. fuel-switching. I restrict attention to costs, output, emissions and input choices related to coal, oil and gas plants. The analysis is done at the utility level since rate regulation and rate cases involve the firm as a whole. Moreover, Ellerman et al (2000, p. 301) remark that compliance decisions are often made at the utility level even if pollution regulation per se is at the unit level.

22 Carlson et al (2000) estimate a similar multiproduct cost function to get abatement costs for fuel-switching plants.

I assume a stochastic specification for realized O&M variable costs of producing electricity and emissions. For firm $i$ at time $t$, realized O&M variable cost is given by

$$\tilde{C}_{it} = \exp(\omega_{it})\, C(q_{it}, s_{it}, p_{lit}, p_{fit}, N_{it}, d_{FGDit}; \beta)\, \exp(\varepsilon_{it}) \qquad (5)$$

where

$$\omega_{it} = \theta_{it} - e_{it}$$

$$p_{fit} = (p_{cit}, p_{oit}, p_{git})$$

$$C(q, s, p_l, p_f, N, d_{FGD}; \beta) = N^{\beta_N} \exp(\beta_{FGD}\, d_{FGD})\, q^{\beta_q}\, s^{\beta_s + \beta_{sd} d_{FGD}}\, p_l^{\beta_l}\, p_c^{\beta_c}\, p_o^{\beta_o}\, p_g^{\beta_g}.$$

The term $\exp(\omega) = \exp(\theta - e)$ is the unobserved cost efficiency of the utility, where $\theta$ is the firm's intrinsic type and $e$ is unobserved managerial effort. The utility knows $\theta$ and chooses $e$. The function $C(q, s, p_l, p_f, N, d_{FGD}; \beta)$ is the baseline cost function of the utility, where $q$ is net generated electricity, $s$ is the SO2 emission rate, $p_l$ is the average salary for full-time employees related to electricity generation, $p_f$ is a vector composed of fuel prices23 for coal, oil and gas, averaged across the utility's plants, $N$ is the sum of nameplate ratings of the utility's plants, and $d_{FGD}$ is a dummy equal to one if the utility has at least one plant with a flue-gas desulfurization (scrubber) unit installed. This baseline cost captures differences in O&M costs that can be explained by differences in input prices, outputs and capital. The vector $\beta$ contains the parameters of the baseline cost function that need to be estimated. Finally, $\varepsilon$ is a mean zero stochastic error term that summarizes factors that affect realized costs. I assume $\varepsilon$ is unanticipated by the firm when making its input choices and uncorrelated with the regressors.

I assume the firm's intrinsic type $\theta_{it}$ is a draw from the distribution $F$. Ideally $F$ would be conditioned on variables such as the firm's capacity or portfolio of plants, but as a first step I assume $F$ is a function of the rate case year. Next, the reason why the firm's unobserved type is indexed by $t$ is that I allow $\theta$ to change across different rate cases. However, I assume that $\theta$ remains constant between rate cases. Let $t_\tau$ be the time index (year) for a specific firm's rate case $\tau$. For example, if firm $i$ has three rate cases during the sample period, then $\tau \in \{1, 2, 3\}$, which occur in years $t_1$, $t_2$ and $t_3$ respectively. Formally, for all $i$, $t$ and $\tau$,

$$\theta_{it} = \begin{cases} \theta_{it_\tau} & \text{if } t \in [t_\tau, t_{\tau+1}) \\ \theta_{it_{\tau+1}} & \text{if } t = t_{\tau+1}. \end{cases}$$

I assume for each firm $i$, $\theta_{it_1}$ is a draw from $F$, $\theta_{it_2}$ is a draw from $F(\cdot \mid \theta_{it_1})$, $\theta_{it_3}$ is a draw from $F(\cdot \mid \theta_{it_2})$, etc.

In the next subsection, I discuss identification of the distribution of types $F$, the disutility function $\psi(\cdot)$ and the baseline cost function parameters $\beta$.

23 Fuel prices are either spot or contracted prices. Managerial effort can affect the actual price the firm faces and hence introduces an endogeneity problem.
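As a rough illustration of the cost specification in equation (5), the sketch below evaluates the baseline cost function and the realized cost using the point estimates reported in the "Model" column of Table 5. The function names, the dictionary keys, the symbol name beta for the cost parameters, and the input values are assumptions made for this example; the inputs are placeholders rather than data from the paper.

```python
import math

# Point estimates from the "Model" column of Table 5. The dictionary keys
# (and the name BETA itself) are labels chosen for this sketch.
BETA = {
    "N": -0.174,    # log nameplate
    "FGD": -1.371,  # FGD dummy
    "q": 0.694,     # log electricity output
    "s": -0.356,    # log emission rate
    "sd": 0.177,    # log emission rate x FGD
    "l": 0.297,     # log price of labor
    "c": 0.595,     # log price of coal
    "o": 0.065,     # log price of oil
    "g": 0.043,     # log price of gas
}


def baseline_cost(q, s, p_l, p_c, p_o, p_g, nameplate, d_fgd, beta=BETA):
    """Baseline cost C(.) from equation (5): a log-linear function of its inputs."""
    s_exponent = beta["s"] + beta["sd"] * d_fgd
    return (
        nameplate ** beta["N"]
        * math.exp(beta["FGD"] * d_fgd)
        * q ** beta["q"]
        * s ** s_exponent
        * p_l ** beta["l"]
        * p_c ** beta["c"]
        * p_o ** beta["o"]
        * p_g ** beta["g"]
    )


def realized_cost(theta, effort, eps, **inputs):
    """Realized cost: exp(theta - effort) * C(.) * exp(eps)."""
    return math.exp(theta - effort) * baseline_cost(**inputs) * math.exp(eps)


# Hypothetical inputs: units and magnitudes are placeholders only.
inputs = dict(q=1000.0, s=2.5, p_l=40.0, p_c=1.5, p_o=3.0, p_g=2.5,
              nameplate=2000.0, d_fgd=0)

# Cost ratio from cutting the emission rate by 10%, holding everything else fixed.
lower_s = dict(inputs, s=0.9 * inputs["s"])
ratio = baseline_cost(**lower_s) / baseline_cost(**inputs)
```

Because the specification is log-linear, the ratio depends only on the emission-rate exponent and not on the placeholder inputs.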


5.1 Identification

There are two interrelated challenges for identification.24 The first challenge is an endogeneity problem in identifying the cost parameters $\beta$. The second challenge involves extracting the distribution of the unobserved type $\theta_{it}$ from the variation in realized costs that is unobserved by the econometrician, i.e. $\exp(\theta_{it} - e_{it}) \exp(\varepsilon_{it})$.

The first challenge arises precisely because $e_{it}$ is an endogenous variable chosen by the firm. Moreover, cost efficiency ($\omega_{it} = \theta_{it} - e_{it}$) affects electricity output and potentially input prices. The firm's baseline cost is the main variable that determines what level of effort to exert since this captures the cost reductions from effort. The firm's cost efficiency affects electricity output since regulated electricity prices are based on reported expenses. Finally, plant managers in charge of fuel procurement may affect the actual price the firm pays for its fuel. If $\omega_{it}$ were observed by the econometrician, we could directly identify $\beta$. However, $\omega_{it}$ is not observed and therefore we need to find a way to control for it. Furthermore, I need to extract the distribution of $\theta_{it}$ from the unobserved variation $\exp(\theta_{it} - e_{it}) \exp(\varepsilon_{it})$.

My identification strategy involves two parts. First, to identify the parameters of the empirical model, I use Proposition 1 to pin down $\omega_{it}$ for different time periods. This allows me to take different transformations of the data to eliminate $\omega_{it}$ from the estimating equations. Second, to identify the distribution of intrinsic types $\theta$, I recast the problem under the framework of measurement error with repeated measurements (e.g. Li and Vuong (1998)) and use the deconvolution result of Kotlarski (1967). I briefly mention an alternative identification strategy at the end.

5.1.1 Identification of parameters

In the equilibrium characterized by Proposition 1, the firm does not exert effort during the rate case, hence $\omega_{it_\tau} = \theta_{it_\tau}$. After the rate case, i.e. at time $t = t_\tau + 1$, the firm exerts effort such that

$$\psi'(e_{it_\tau+1}) = \exp(\omega_{it_\tau+1})\, C(q_{it_\tau+1}, s_{it_\tau+1}, p_{lit_\tau+1}, p_{fit_\tau+1}, N_{it_\tau+1}, d_{FGDit_\tau+1}; \beta).$$

To determine what $\omega_{it}$ is after the rate case, I impose the following functional form25 for $\psi(\cdot)$:

Assumption 2 The disutility of effort is given by

$$\psi(e_{it}; \mu_{it}) = \frac{1}{\gamma} \exp(\gamma e_{it} + \mu_{it})$$

where $\gamma$ is a parameter and the $\mu_{it}$'s are mean zero shocks that are uncorrelated with $(q_{it}, s_{it}, p_{lit}, p_{fit}, N_{it}, d_{FGDit})$ and iid across $i$ and $t$.

24 I focus on identification of the distribution of $\theta$ and leave identification and estimation of the distribution of $R$ in the appendix. Although the firm's type is two-dimensional, the screening problem that I solve to derive the optimal mechanism is one-dimensional. The reason is that the regulator does not have any instrument to screen $R$ and so every $R$-type reports the highest possible $R$. I plan to explore a richer model where $R$ is an explicit function of installed capital and some unobserved type (e.g. rate of return). Thus capital can be a screening variable. The screening problem then becomes a nonseparable two-dimensional problem. Non-separability arises because capital enters operating costs.

25 Gagnepain and Ivaldi (2002) use a similar exponential form for the disutility function.

Remark 1 I do not include a constant in the specification for the baseline cost function and also for $\psi(\cdot)$. The reason is that these are not identified when I include a constant $\rho_0$ later in the evolution of $\theta_{it}$ across rate cases. That is, the means of $\varepsilon$ and $\mu$ are subsumed in the mean of the error term in the evolution of $\theta$.

Assumption 2 allows me to express $\omega_{it}$ as a linear function of the log of the baseline cost function $C_{it}(\beta) = C(q_{it}, s_{it}, p_{lit}, p_{fit}, N_{it}, d_{FGDit}; \beta)$, $\theta_{it}$, and the shock $\mu_{it}$. Taking logs of the first-order condition gives $\gamma e_{it} + \mu_{it} = \theta_{it} - e_{it} + \ln C_{it}(\beta)$; solving for $e_{it}$ and substituting into $\omega_{it} = \theta_{it} - e_{it}$ yields

$$\omega_{it} = \frac{1}{1+\gamma}\left(\gamma\theta_{it} - \ln C_{it}(\beta) + \mu_{it}\right) \qquad (6)$$

for $t = t_\tau + 1$. Proposition 1 and the assumption that $\theta_{it}$ is constant within rate cases give expressions for realized costs during different "events":

$$\ln \tilde{C}_{it_\tau} = \theta_{it_\tau} + \ln C_{it_\tau}(\beta) + \varepsilon_{it_\tau} \qquad (7)$$

$$\ln \tilde{C}_{it_\tau+1} = \frac{\gamma}{1+\gamma}\left(\theta_{it_\tau} + \ln C_{it_\tau+1}(\beta)\right) + \frac{1}{1+\gamma}\mu_{it_\tau+1} + \varepsilon_{it_\tau+1} \qquad (8)$$

for all rate cases $\tau$. The first line is the realized cost during rate cases, while the second line is for the year after the case.

Although $\theta_{it}$ is constant within rate cases for each firm $i$, I allow $\theta_{it}$ to vary across rate cases. I assume that $\theta_{it}$ follows a linear process across two rate cases:

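Assumption 3 For each $i$ and $\tau$, intrinsic types across two rate cases $\tau$ and $\tau - 1$ evolve according to

$$\theta_{it_\tau} = \rho_0 + \rho_1 \theta_{it_{\tau-1}} + \eta_{it_\tau}$$

where $(\rho_0, \rho_1)$ are parameters and the $\eta_{it_\tau}$'s are iid across $i$ and $t_\tau$.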

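Assumption 3 provides a way to difference out cost efficiency $\omega_{it}$. Using Assumption 3, I can quasi-difference equation (7) for two consecutive rate cases. This yields

$$\ln \tilde{C}_{it_\tau} - \rho_1 \ln \tilde{C}_{it_{\tau-1}} = \rho_0 + \ln C_{it_\tau}(\beta) - \rho_1 \ln C_{it_{\tau-1}}(\beta) + \xi_{1it_\tau}$$

where

$$\xi_{1it_\tau} = \eta_{it_\tau} + \varepsilon_{it_\tau} - \rho_1 \varepsilon_{it_{\tau-1}}.$$

I can then construct moment conditions

$$E\left[\xi_{1it_\tau}(\beta, \rho_0, \rho_1)\, z_{it_{\tau-1}}\right] = 0 \qquad (9)$$

where $z_{it_{\tau-1}} = (q_{it_{\tau-1}}, s_{it_{\tau-1}}, p_{lit_{\tau-1}}, p_{fit_{\tau-1}}, N_{it_{\tau-1}}, d_{FGDit_{\tau-1}})'$. These moment conditions hold because $\eta_{it_\tau}$ and $\varepsilon_{it_\tau}$ are iid across $t$, and $\varepsilon_{it_{\tau-1}}$ is an unanticipated shock during $t_{\tau-1}$.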
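The quasi-differencing step behind moment condition (9) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names `xi1` and `sample_moment`, the parameter names `rho0` and `rho1` for the type-evolution parameters, and all numbers are hypothetical. The example builds log costs for one firm from equation (7) and the linear type evolution, then checks that the quasi-differenced residual contains only shocks and no intrinsic type.

```python
def xi1(ln_cost_now, ln_cost_prev, ln_base_now, ln_base_prev, rho0, rho1):
    """Residual of equation (7) quasi-differenced across two consecutive rate cases."""
    lhs = ln_cost_now - rho1 * ln_cost_prev
    rhs = rho0 + ln_base_now - rho1 * ln_base_prev
    return lhs - rhs


def sample_moment(residuals, instrument):
    """Sample analog of E[xi * z] for a single instrument z."""
    return sum(r * z for r, z in zip(residuals, instrument)) / len(residuals)


# Hypothetical values for one firm and two consecutive rate cases.
rho0, rho1 = -0.1, 0.9
theta_prev, eta = 0.4, 0.05          # previous type and type-evolution shock
eps_prev, eps_now = 0.02, -0.03      # unanticipated cost shocks
ln_base_prev, ln_base_now = 5.0, 5.2  # log baseline costs ln C(beta)

theta_now = rho0 + rho1 * theta_prev + eta           # linear type evolution
ln_cost_prev = theta_prev + ln_base_prev + eps_prev  # equation (7), previous rate case
ln_cost_now = theta_now + ln_base_now + eps_now      # equation (7), current rate case

residual = xi1(ln_cost_now, ln_cost_prev, ln_base_now, ln_base_prev, rho0, rho1)
shocks_only = eta + eps_now - rho1 * eps_prev
```

At the true parameters `residual` equals `shocks_only`, which is why lagged observables can serve as instruments in the sample moment.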

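Another way to difference out $\omega_{it}$ is by looking at observations during and after the rate case. Specifically, consider the following quasi-difference across $t_\tau + 1$ and $t_\tau$:

$$\ln \tilde{C}_{it_\tau+1} - \frac{\gamma}{1+\gamma} \ln \tilde{C}_{it_\tau} = \frac{\gamma}{1+\gamma}\left(\ln C_{it_\tau+1}(\beta) - \ln C_{it_\tau}(\beta)\right) + \xi_{2it_\tau}$$

where

$$\xi_{2it_\tau} = \frac{1}{1+\gamma}\mu_{it_\tau+1} + \varepsilon_{it_\tau+1} - \frac{\gamma}{1+\gamma}\varepsilon_{it_\tau}.$$

I can rewrite this as

$$\ln \tilde{C}_{it_\tau+1} = \frac{\gamma}{1+\gamma}\left(\ln C_{it_\tau+1}(\beta) - \ln C_{it_\tau}(\beta)\right) + \frac{\gamma}{1+\gamma} \ln \tilde{C}_{it_\tau} + \xi_{2it_\tau}$$

from which I can construct the moment condition

$$E\left[\xi_{2it_\tau}(\beta, \gamma)\, \ln \tilde{C}_{it_{\tau-1}}\right] = 0. \qquad (10)$$

Since $\ln \tilde{C}_{it_\tau}$ is correlated with $\xi_{2it_\tau}$ through $\varepsilon_{it_\tau}$, I use $\ln \tilde{C}_{it_{\tau-1}}$ as an instrument for $\ln \tilde{C}_{it_\tau}$. Realized cost during the previous rate case is uncorrelated with the shock in the current rate case. Moreover, $\tilde{C}_{it_{\tau-1}}$ will be correlated with $\tilde{C}_{it_\tau}$ as long as $\rho_1 \neq 0$.

Finally, consider the following quasi-difference across $t_\tau + 1$ and $t_{\tau-1} + 1$: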

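$$\ln \tilde{C}_{it_\tau+1} - \rho_1 \ln \tilde{C}_{it_{\tau-1}+1} = \frac{\gamma}{1+\gamma}\rho_0 + \frac{\gamma}{1+\gamma}\left[\ln C_{it_\tau+1}(\beta) - \rho_1 \ln C_{it_{\tau-1}+1}(\beta)\right] + \xi_{3it_\tau}$$

where

$$\xi_{3it_\tau} = \frac{\gamma}{1+\gamma}\eta_{it_\tau} + \frac{1}{1+\gamma}\left(\mu_{it_\tau+1} - \rho_1 \mu_{it_{\tau-1}+1}\right) + \varepsilon_{it_\tau+1} - \rho_1 \varepsilon_{it_{\tau-1}+1}.$$

I rewrite this as

$$\ln \tilde{C}_{it_\tau+1} = \frac{\gamma}{1+\gamma}\rho_0 + \frac{\gamma}{1+\gamma}\left[\ln C_{it_\tau+1}(\beta) - \rho_1 \ln C_{it_{\tau-1}+1}(\beta)\right] + \rho_1 \ln \tilde{C}_{it_{\tau-1}+1} + \xi_{3it_\tau}$$

and construct moment conditions

$$E\left[\xi_{3it_\tau}(\beta, \gamma, \rho_0, \rho_1) \begin{pmatrix} 1 \\ \ln \tilde{C}_{it_{\tau-1}+1} \end{pmatrix}\right] = 0. \qquad (11)$$

Notice that I have used $\ln \tilde{C}_{it_{\tau-1}+1}$ as an instrument for $\ln \tilde{C}_{it_\tau+1}$. Realized cost in $t = t_\tau + 1$ is uncorrelated with past shocks but is correlated with $\tilde{C}_{it_{\tau-1}+1}$ through the evolution of $\theta$.

The parameters $\beta$, $\gamma$ and $(\rho_0, \rho_1)$ are identified as the solution to the moment conditions (9), (10) and (11). Uniqueness of the solution can be seen by taking each equation one at a time. For example, given $(\rho_0, \rho_1)$, equation (9) is linear in $\beta$; given $\beta$, equation (10) is linear in $\gamma/(1+\gamma)$, which uniquely pins down $\gamma$; and given $\beta$ and $\gamma$, equation (11) is linear in $(\rho_0, \rho_1)$.

5.1.2 Identification of type distribution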

Given the parameters and using Assumption 3, I can rewrite realized cost during two consecutive rate cases as

$$\frac{\ln \tilde{C}_{it_\tau} - \ln C_{it_\tau}(\beta) - \rho_0}{\rho_1} = \theta_{it_{\tau-1}} + \frac{\eta_{it_\tau} + \varepsilon_{it_\tau}}{\rho_1}$$

$$\ln \tilde{C}_{it_{\tau-1}} - \ln C_{it_{\tau-1}}(\beta) = \theta_{it_{\tau-1}} + \varepsilon_{it_{\tau-1}}.$$

The problem of finding the distribution of $\theta_{it_{\tau-1}}$ can be recast in the framework of measurement error with repeated measurements. Let $(\eta_{it_\tau} + \varepsilon_{it_\tau})/\rho_1$ and $\varepsilon_{it_{\tau-1}}$ be the "measurement errors" while $\theta_{it_{\tau-1}}$ is the latent variable. The two measurement errors and the latent variable are all mutually independent, and this follows from the assumptions on $\eta_{it_\tau}$ and the unanticipated cost shocks. Let $\phi_\theta$, $\phi_{U_1}$ and $\phi_{U_2}$ be the characteristic functions of $\theta_{it_{\tau-1}}$, $(\eta_{it_\tau} + \varepsilon_{it_\tau})/\rho_1$ and $\varepsilon_{it_{\tau-1}}$ respectively. Assuming $\phi_\theta$, $\phi_{U_1}$ and $\phi_{U_2}$ have no real zeros26, Kotlarski's (1967, Lemma 1) identification result implies27

$$\phi_\theta(t) = \exp \int_0^t \frac{\partial \phi_Y(0, t_2)/\partial t_1}{\phi_Y(0, t_2)}\, dt_2$$

$$\phi_{U_1}(t) = \frac{\phi_Y(t, 0)}{\phi_\theta(t)}$$

$$\phi_{U_2}(t) = \frac{\phi_Y(0, t)}{\phi_\theta(t)}$$

where $\phi_Y(\cdot, \cdot)$ is the characteristic function of

$$\left(\frac{\ln \tilde{C}_{it_\tau} - \ln C_{it_\tau}(\beta) - \rho_0}{\rho_1},\; \ln \tilde{C}_{it_{\tau-1}} - \ln C_{it_{\tau-1}}(\beta)\right).$$

Since characteristic functions uniquely determine the distribution of random variables, we can therefore identify the distribution of $\theta_{it_{\tau-1}}$ from the distribution and characteristic function of

$$\left(\frac{\ln \tilde{C}_{it_\tau} - \ln C_{it_\tau}(\beta) - \rho_0}{\rho_1},\; \ln \tilde{C}_{it_{\tau-1}} - \ln C_{it_{\tau-1}}(\beta)\right).$$

26 Arellano and Bonhomme (2012) provide intuition for this technical requirement. When the characteristic functions of the measurement errors are zero at certain points or intervals, the characteristic function of the observed measurements is not informative about the latent variable. Evdokimov and White (2012) replace this assumption with weaker conditions.

27 See Rao (1992) and Li and Vuong (1998).
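The repeated-measurement structure can be illustrated with a small simulation. The sketch below does not implement Kotlarski's formula. It only checks a second-moment implication of the same assumptions: when two measurements share a latent variable and have mutually independent errors, their covariance equals the variance of the latent variable. The distributions, sample size and function names are assumptions chosen for illustration.

```python
import random


def covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_measurements(n, seed=0):
    """Draw a latent type and two noisy measurements with independent errors."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n)]  # latent variable
    y1 = [t + rng.gauss(0.0, 0.5) for t in theta]    # measurement with error U1
    y2 = [t + rng.gauss(0.0, 0.8) for t in theta]    # measurement with error U2
    return theta, y1, y2


theta, y1, y2 = simulate_measurements(20000)
cov_measurements = covariance(y1, y2)  # close to Var(theta) in large samples
var_latent = covariance(theta, theta)
```

The normal distributions used here have characteristic functions with no real zeros, so the example also satisfies the technical condition stated in the text.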


5.1.3 An alternative identification strategy

The functional form assumptions on the evolution of types across rate cases and the effort disutility function can be relaxed if one is willing to (1) make a timing assumption on the input choice decision of the firm; (2) assume that the firm chooses its inputs to minimize cost conditional on cost efficiency $\omega_{it}$; and (3) assume that this cost minimization problem leads to a cost function where cost efficiency enters multiplicatively. Assume that natural gas is the only flexible input (i.e. other inputs are decided before observing electricity demand). Using Shephard's lemma and the assumption that cost efficiency and the unanticipated cost shock enter multiplicatively in realized cost, we can construct the following estimating equation based on the expenditure share of natural gas:

$$\log\left(\frac{p_{git}\, x_{git}}{\tilde{C}_{it}}\right) = \log\left[\frac{p_{git}\, \partial C_{it}/\partial p_g}{C_{it}}\right] - \varepsilon_{it}$$

where $x_{git}$ is the level of natural gas consumption. One can then identify the distribution of unanticipated cost shocks from this equation. This strategy is similar to the strategy recently developed by Gandhi et al (2011), which uses the revenue share of the flexible input and profit maximization behavior to identify the unanticipated output shock. I use the dual problem of cost minimization instead.

Given this distribution, the only unobservable left in the stochastic specification of realized cost (equation (5)) is cost efficiency $\omega_{it}$. During rate cases, $\omega_{it} = \theta_{it}$. I can relax Assumption 3 and instead have the more general assumption that $\theta_{it}$ evolves as a Markov process across rate cases, following the production function literature (e.g. Olley and Pakes (1996); Levinsohn and Petrin (2003); Ackerberg et al (2006); and Gandhi et al (2011)). One can then construct moment conditions to identify and estimate parameters by exploiting the orthogonality of explanatory variables from past rate cases with the error from predicting today's intrinsic type $\theta_{it}$ (using $\theta$ from past rate cases).

Once the distribution of unanticipated cost shocks and parameters are identified, the distribution of $\theta$ can also be identified. Implied effort levels can then be generated by looking at cost efficiencies after the rate case. Proposition 1 can then be used to nonparametrically identify the disutility function using the generated effort levels and cost data.

5.2 Estimation

I discuss how I estimate the parameters of the cost function, disutility function and the evolution of types across rate cases, and how I estimate the distribution of $\theta_{it_{\tau-1}}$. The appendix provides details on how I estimate the auditing strategy and the distribution of true return on the rate base $R$ from rate case data.

To estimate the parameters, I use the sample analog of the moment conditions given by equations (9), (10) and (11). Ideally I would have a single estimating sample to construct the three moment conditions. However, these moment conditions taken together require each firm in the sample to have at least two rate cases that are initiated and completed in the period 1988-1998. This leaves me with just 22 firms. The vector $\beta$ contains 9 elements and therefore I need to estimate 12 parameters in total. To increase the number of firms, I treat the same firm in two different rate cases as if they were different firms.28 For example, I can define two different "firms" as (firm $i$, rate case $\tau$) and (firm $i$, rate case $\tau + 1$). Although there is dependence across these two firms, this dependence is captured by intrinsic types $\theta$ across rate cases. Thus differencing out the $\theta$'s essentially gives independent samples (conditional on observables $z$).

28 I make this assumption when estimating the parameters, but not when estimating the type distribution.

To further alleviate the problem of a small sample size, I construct different samples for each of the moment conditions. Moment condition (9), which identifies the cost function parameters conditional on $(\rho_0, \rho_1)$, only depends on rate case years. I include observations with rate cases initiated on or before 1999 even if they are concluded after 1999 in estimating this moment condition.

Sample selection bias may arise because the timing of the rate case is partly controlled by the firm. Suppose a rate case is initiated by the firm at time $t_{\tau+1}$ when at time $t \in (t_\tau + 1, t_{\tau+1})$ realized costs are above some threshold. Whether or not the firm initiates a rate case at $t_{\tau+1}$ depends on the time $t \in (t_\tau + 1, t_{\tau+1})$ values of observables, cost efficiencies, unanticipated cost shocks $\varepsilon$, and the unobserved threshold that is unrelated to $\varepsilon_{it_{\tau+1}}$ (otherwise this threshold provides information about $\varepsilon_{it_{\tau+1}}$, hence $\varepsilon_{it_{\tau+1}}$ will be anticipated by the firm). Selection bias thus arises because cost efficiencies $\omega_{it}$ are unobserved by the econometrician and these are correlated across time through the firm's intrinsic type $\theta_{it}$. My estimating equations difference out the $\omega_{it}$'s. Therefore, using different samples does not introduce sample selection bias of this nature. Finally, I use a bootstrap procedure that samples over the "firms" to compute standard errors since the moment conditions are based on different samples.

To estimate the distribution of $\theta_{it_{\tau-1}}$, I use the algorithm described in Beran and Hall (1992) which adapts the discrete approximation of Hausdorff (1923).29 The idea is to approximate the distribution of $\theta_{it_{\tau-1}}$ by a discrete distribution that is constructed from estimated moments of $\theta_{it_{\tau-1}}$. The algorithm is implemented as follows:

1. I estimate the first $m = 15$ moments of $\theta_{it_{\tau-1}}$ using data on

$$\left(\frac{\ln \tilde{C}_{it_\tau} - \ln C_{it_\tau}(\beta) - \rho_0}{\rho_1},\; \ln \tilde{C}_{it_{\tau-1}} - \ln C_{it_{\tau-1}}(\beta)\right).$$

I only use data for the first two rate cases for each firm.

2. Define the $k$-th moment of $\theta_{it_{\tau-1}}$ as $\nu_k$. Following Beran and Hall (1992), I assume the distribution of $\theta$, i.e. $F$, is supported on the compact interval $[-c, c]$ where $c = 5\sqrt{2}$. Define the transformed moment

$$\tilde{\nu}_k = \sum_{j=0}^{k} \binom{k}{j} (2c)^{-j}\, 2^{-(k-j)}\, \nu_j$$

for $k = 0, 1, 2, \ldots, m$, where $\nu_0 = 1$.

3. Construct the discrete distribution over $\{0, 1/m, 2/m, \ldots, 1\}$ with

$$\Pr\left(\frac{j}{m}\right) = \binom{m}{j} \Delta^{m-j} \tilde{\nu}_j$$

for $j = 0, 1, 2, \ldots, m$, where $\Delta^r$ is the $r$-th order difference operator defined as

$$\Delta^r \tilde{\nu}_k = \sum_{i=0}^{r} \binom{r}{i} (-1)^i\, \tilde{\nu}_{k+i}.$$

Hausdorff (1923) shows that this discrete distribution converges to $F$ (Shohat and Tamarkin, 1943, p. 93-94).

I construct an estimate of the discrete distribution by using the estimated moments of $\theta_{it_{\tau-1}}$ in place of $\nu_k$. To use this distribution in the counterfactual welfare simulations, I first fit a 6th order polynomial30 to the cumulative distribution function (cdf) of the discrete distribution. I then invert this polynomial, draw a random sample of size 50, and finally use this as my sample.

29 An alternative procedure is to estimate the characteristic function of $\left(\frac{\ln \tilde{C}_{it_\tau} - \ln C_{it_\tau}(\beta) - \rho_0}{\rho_1},\; \ln \tilde{C}_{it_{\tau-1}} - \ln C_{it_{\tau-1}}(\beta)\right)$, derive the characteristic function of $\theta_{it_{\tau-1}}$, and then use an inverse Fourier transform to get the density of $\theta_{it_{\tau-1}}$ (see for example Li and Vuong (1998) and Krasnokutskaya (2011)). Li and Vuong (1998) note that the procedure of Beran and Hall (1992) is a special case of their estimation procedure since all moments of the distribution are used to estimate the distribution. Beran and Hall (1992) instead only use a finite number of moments and apply the discrete approximation of Hausdorff (1923). To the extent that the distribution of $\theta_{it}$ can be captured by a finite number of moments, the Beran and Hall (1992) procedure requires less data since this introduces less bias from (implicitly) estimated higher-order moments.
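Steps 2 and 3 of the algorithm above can be sketched in a few lines. This is a minimal illustration rather than the estimation code used in the paper: the function names are hypothetical, the raw moments are taken as given instead of estimated, and the example uses a small number of moments and a unit support bound so that the result is easy to verify. With many moments the alternating sums can be numerically delicate, which this sketch does not address.

```python
from math import comb


def transformed_moments(raw_moments, c):
    """Moments of (theta + c) / (2c) on [0, 1] from raw moments of theta on [-c, c].

    raw_moments[k] is the k-th raw moment of theta, with raw_moments[0] == 1.
    """
    m = len(raw_moments) - 1
    return [
        sum(comb(k, j) * (2 * c) ** (-j) * 2.0 ** (-(k - j)) * raw_moments[j]
            for j in range(k + 1))
        for k in range(m + 1)
    ]


def hausdorff_pmf(moments_01):
    """Hausdorff discrete approximation on {0, 1/m, ..., 1} from moments on [0, 1]."""
    m = len(moments_01) - 1
    pmf = []
    for j in range(m + 1):
        r = m - j
        # r-th order difference operator applied to the j-th transformed moment.
        diff = sum(comb(r, i) * (-1) ** i * moments_01[j + i] for i in range(r + 1))
        pmf.append(comb(m, j) * diff)
    return pmf


# Check on a known case: theta uniform on [-1, 1], using the first m = 4 moments.
# Its raw moments are 1 / (k + 1) for even k and 0 for odd k.
raw = [1.0, 0.0, 1.0 / 3.0, 0.0, 1.0 / 5.0]
pmf = hausdorff_pmf(transformed_moments(raw, c=1.0))
```

For the uniform case the approximation places equal mass on each of the m + 1 support points, which gives a quick sanity check before applying the same functions to estimated moments.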

5.3 Results

30 Beran and Hall (1992) use the polygonal approximant (see Feller (1971, p. 540)) to the cdf of the discrete distribution. Basically, the polygonal approximant convolves a uniform distribution between two points of the discrete distribution. That is, one draws a line that connects two steps of the cdf.

Table 5 presents the parameter estimates. The first two columns present results from the procedure described in the previous subsection. The coefficient on the emission rate implies that for a 10% decrease in emission rates, O&M variable cost increases by 3.6%, and this is significant at the 5% level. If the utility has at least one flue-gas desulfurization unit, the effect of decreasing emission rates goes down by about half. To interpret the coefficient on log electricity output, I compute a simple measure of single-output returns to scale using Nelson's (1985) equation (7) for variable cost functions. My estimates imply returns to scale of 1.69, which is on the high side. For example, recent estimates of returns to scale range from 0.99 to 1.56 (Kleit and Terrell, 2001). Cost elasticities for the firm's variable inputs imply cost shares of roughly 30%, 60%, 7% and 4% for labor, coal, oil and natural gas inputs.

To interpret the estimated disutility function, suppose the firm's cost when it does not exert effort is $100M, while at the optimal level of positive effort, the firm reduces its cost by 5%. At the optimum, the marginal disutility is equal to the marginal cost reduction:

$$\exp(\gamma e^*) = -\left.\frac{\partial\left[\exp(\theta - e)\, C(\beta)\right]}{\partial e}\right|_{e = e^*} = \exp(\theta - e^*)\, C(\beta) = \$95\text{M}.$$

Thus

$$\frac{1}{\gamma}\exp(\gamma e^*) = \$19\text{M}$$

and so a 5% reduction from $100M incurs a level of disutility valued at $19M.

My chosen effort disutility function is not a function of firm attributes. As a robustness check, I include the total nameplate capacity of the utility and the proportion of coal burned relative to total fuel. A firm with more or larger plants might be more difficult to manage. Moreover, in interviews with plant engineers and managers, Bushnell and Wolfram (2007) note that there is greater scope for an individual plant operator's skill and effort to affect plant efficiency among coal plants. Thus monitoring the operator's performance is likely to be more difficult in coal plants. Estimated coefficients on these variables are positive although only the coal ratio is significant (10% level). The estimated coefficient on effort, i.e. $\gamma$, is smaller but the 95% confidence interval still contains my previous estimate.

The estimated evolution of intrinsic types shows strong persistence. The coefficient on the past rate case's intrinsic type is 1.002 and this is statistically significant at the 1% level. An interesting question is whether a firm fixed effect would be sufficient to capture the unobserved heterogeneity in cost efficiencies given the high persistence of intrinsic types across rate cases. The last two columns of Table 5 show the estimates from a regression model with firm fixed effects and year dummies. Focusing on the estimates of the coefficients on electricity output and emissions, we see that the estimates from the fixed effect model are attenuated. Although the firm fixed effect can capture the variation in cost efficiencies due to variation in intrinsic types across firms, the fixed effect fails to capture the effect of endogenous effort on cost efficiency.

The upward bias in the coefficient on emission rates can be explained as follows. Think of effort as an omitted variable and imagine that the emission rate is the only regressor. This omitted variable is negatively correlated with cost and negatively related to emission rates (because lower emission rates increase cost, which increases the marginal benefit from exerting effort). Thus there will be upward bias.
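Two of the numbers quoted earlier in this subsection can be reproduced with simple arithmetic from the Table 5 point estimates. The sketch below is illustrative only. The function names are hypothetical, and the returns-to-scale formula, one minus the capital elasticity divided by the output elasticity, is assumed here to be the measure referred to as Nelson's (1985) equation (7).

```python
import math


def disutility_at_optimum(cost_no_effort, reduction, gamma):
    """Disutility (1/gamma) * exp(gamma * e*), using the first-order condition
    exp(gamma * e*) = cost after the reduction."""
    marginal_disutility = cost_no_effort * (1.0 - reduction)
    return marginal_disutility / gamma


def returns_to_scale(output_elasticity, capital_elasticity):
    """Assumed single-output measure for a variable cost function."""
    return (1.0 - capital_elasticity) / output_elasticity


gamma_hat = 4.975                 # disutility parameter, Table 5
beta_q, beta_n = 0.694, -0.174    # output and nameplate coefficients, Table 5

disutility = disutility_at_optimum(100.0, 0.05, gamma_hat)  # in $M
implied_effort = -math.log(1.0 - 0.05)  # effort that lowers cost by 5%
rts = returns_to_scale(beta_q, beta_n)
```

Under these assumptions the disutility is about $19M and the returns-to-scale measure is about 1.69, in line with the figures reported in the text.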

Table 5: Parameter estimates

Dependent variable: log O&M variable cost

                                   Model               FE
                                Est      SE        Est      SE
log emission rate             -0.356   0.200     -0.210   0.035
log emission rate*FGD          0.177   0.134      0.161   0.045
log Electricity output         0.694   0.376      0.458   0.044
log Price of labor             0.297   0.121      0.056   0.056
log Price of coal              0.595   0.143      0.659   0.062
log Price of oil               0.065   0.054      0.200   0.052
log Price of gas               0.043   0.044      0.085   0.044
log Nameplate                 -0.174   4.515     -0.172   0.102
FGD                           -1.371   3.97      -0.203   0.047
Disutility ($\gamma$)          4.975   2.423      .        .
Type evolution ($\rho_0$)     -0.124   1.342      .        .
Type evolution ($\rho_1$)      1.002   0.277      .        .

Notes: The first two columns contain estimation results from the procedure described in section 5.2. Standard errors are computed using bootstrap, where sampling is over firm-rate case. The last two columns contain OLS estimates with firm and year dummies included. Standard errors are clustered at the firm level. Significance level: * 10%, ** 5%, *** 1%.

An upward bias in the coefficient on emission rates leads to underestimated marginal abatement costs (MAC) since

$$MAC = -\frac{\partial\left[\exp(\theta - e)\, C(\beta)\right]}{\partial s} = |\beta_s| \exp(\theta - e)\, C(\beta)\, s^{-1}.$$

Figure 4 plots the discrete approximation to the cumulative distribution of intrinsic type $\theta$, and the fitted polynomial. I draw a sample of size 50 using this fitted polynomial. The distribution of $\theta$ implies a distribution of MACs and I plot the histogram of MACs in Figure 5. In generating the distribution of MACs, I assume (i) all firms have an emission rate of 2.5 lbs per MMBtu, (ii) observable variables (electricity output, input prices and fuel burned) are at their median values, (iii) firms do not have FGDs installed (i.e. $d_{FGD} = 0$), and (iv) firms exert optimal positive effort. The emission standard of 2.5 is the implicit emission standard under Phase I of the Acid Rain Program, so Figure 5 reflects the distribution of MACs if SO2 regulation were implemented by a uniform emission standard. There is considerable heterogeneity in MACs. The median MAC is $182 per ton while the mean MAC is $325. The 75th percentile is $365, so most of the mass of the distribution is below $400. The 90th and 95th percentiles are $869 and $1405 respectively, so there is a nonnegligible mass of firms that have MACs above $800. A more flexible pollution regulatory regime takes advantage of the heterogeneity in cost efficiencies. For example, the regime that minimizes the total cost of achieving the same level of abatement can be implemented by setting a uniform emission tax equal to $113 per ton and letting firms decide their emission rates. Annual cost savings under this regime are about $12M per firm.

Figure 4: Estimated cdf of $\theta$

[Plot not reproduced: stepwise discrete approximation and polynomial fit; horizontal axis $\theta$ from -6 to 6, vertical axis from -0.2 to 1.2]

Notes: Stepwise cdf is the estimate using Hausdorff's (1923) discrete approximation and the Beran and Hall (1992) algorithm. The curve is a 6th order polynomial fit to this cdf.

6 Counterfactual welfare

The social planner's responsibility encompasses both pollution and economic regulation. Pollution regulation is concerned with emission rates while economic regulation deals with how the firm will be paid for providing its services. I focus on emission rates as the regulatory variable, taking the quantity of electricity, capital and input prices as exogenously given. A regulatory regime is a direct revelation contract that specifies a bundle $(s, e, t)$ for each type $(\theta, R)$. The bundle consists of an emission rate $s$, a level of managerial effort $e$ and a lump-sum transfer $t$.31 The lump-sum transfer should be sufficient to cover both the cost of producing electricity and abatement. Different regimes correspond to different mappings between types and bundles. The planner cares about social welfare, which is given by equation (1), reproduced here:

$$W = \int \left\{V(q(\theta, R)) - D(s(\theta, R)) - (1 + \lambda)\, t(\theta, R) + \Pi(\theta, R)\right\} dF \qquad (1)$$

Figure 5: Histogram of marginal abatement costs in $ per ton of SO2 emissions

[Plot not reproduced: histogram of MACs; horizontal axis $/ton from 0 to 1600, vertical axis from 0 to 25]

Notes: The figure contains the histogram of marginal abatement costs (MAC) for the random sample I drew from the estimated type distribution. MACs are evaluated at an emission rate of 2.5 lbs/MMBtu and expressed in 1995$ per ton.

Moreover, the planner faces constraints in designing the regime. First, the planner needs to satisfy individual rationality constraints which require leaving …rms with nonnegative economic pro…ts: ( ; R) = t ( ; R)

fexp [

e ( ; R)] C (s ( ; R)) +

[e ( ; R)] + Rg

0

(12)

for all (θ, R). As in Laffont (1994) and Laffont and Tirole (1986), I assume the social planner observes realized cost but not the firm's type and effort. Thus the planner also faces an informational constraint. This informational constraint is captured by the incentive compatibility constraints

\[ \pi(\theta,R) \ge t(\theta',R') - \left\{ \exp[\theta - e(\theta',R')]\, C(s(\theta',R')) + \psi[e(\theta',R')] + R \right\} \tag{13} \]

for all (θ, R) and θ′ ≠ θ or R′ ≠ R. These constraints ensure that a type (θ, R) does not have an incentive to pick some other type's bundle.

31 The level of effort e can be part of the contract since the social planner observes the firm's cost and can then recover what e is, assuming the contract is incentive compatible. An equivalent way of specifying the contract is (s, C̃, t), where C̃ is the firm's realized operating cost.
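To make the two sets of constraints concrete, the following minimal sketch checks individual rationality and incentive compatibility for a small discrete menu of bundles (s, e, t). The functional forms C(s) = K * s**b and ψ(e) = exp(e), the constants, and the two example menus are assumptions made for illustration only; types are indexed by θ alone and R is held at a common value.

```python
import math


def cost(theta, e, s, K=100.0, b=-0.5):
    """Operating cost exp(theta - e) * C(s), with assumed C(s) = K * s**b."""
    return math.exp(theta - e) * K * s ** b


def psi(e):
    """Assumed disutility of effort."""
    return math.exp(e)


def profit(theta, bundle, R=0.0):
    """Profit of type theta from taking a bundle (s, e, t), as in (12)."""
    s, e, t = bundle
    return t - (cost(theta, e, s) + psi(e) + R)


def check_menu(menu, tol=1e-9):
    """menu maps theta -> (s, e, t). Returns (ir_ok, ic_ok)."""
    ir_ok = all(profit(th, bundle) >= -tol for th, bundle in menu.items())
    ic_ok = all(
        profit(th, menu[th]) >= profit(th, other) - tol
        for th in menu
        for other in menu.values()
    )
    return ir_ok, ic_ok


types = (0.0, 0.5)  # hypothetical: 0.0 is the more efficient type

# Menu A: each type is reimbursed exactly its own total cost.
cost_reimbursement = {
    th: (2.5, 1.0, cost(th, 1.0, 2.5) + psi(1.0)) for th in types
}

# Menu B: every type gets the bundle designed for the least efficient type.
pooling = {th: cost_reimbursement[0.5] for th in types}

result_a = check_menu(cost_reimbursement)
result_b = check_menu(pooling)
```

In this sketch menu A satisfies individual rationality but not incentive compatibility, because the efficient type gains by taking the higher transfer meant for the inefficient type. Menu B satisfies both, at the price of leaving a rent to the efficient type.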

Although the firm's type is two-dimensional, the screening problem can be reduced to a single-dimensional screening problem. Since there is no action to screen R, all of the firms will report the highest possible R. This holds in any regime and thus R's do not play a role in comparing welfare (except for the full information regime). From here on I will just treat θ as the firm's type and ignore R. I let Θ_N be the random sample of θ's that I have drawn from the estimated distribution of θ. I have N = 50 types in total. I use the median values of electricity output, input prices, amount of fuel burned and capital to compute welfare. I also assume firms do not own a plant that has a flue-gas desulfurization unit. Given a regulatory regime, I compute

\[ \widetilde{W}(p,\lambda) = \frac{1}{N} \sum_{\theta \in \Theta_N} \left\{ -p\, S(s(\theta)) - (1+\lambda)\, t(\theta) + \pi(\theta) \right\} \]

where

\[ \pi(\theta) = t(\theta) - \left\{ \exp[\theta - e(\theta)]\, \exp(\beta_{FGD}\, d_{FGD})\, q^{\beta_q}\, s(\theta)^{\beta_s}\, p_l^{\beta_l}\, p_c^{\beta_c}\, p_o^{\beta_o}\, p_g^{\beta_g} + \psi[e(\theta)] \right\}, \qquad \beta_s < 0. \]

The linear function S(s(θ)) converts an emission rate s(θ) to tons of SO2 emissions using the median amount of fuel burned. I impose a linear pollution damage function so p represents the constant marginal damage from a ton of pollution.32 The variable λ is the social cost of public funds. I treat (p, λ) as simulation parameters and I compute W̃ for different combinations of (p, λ). Finally, the welfare metric W̃ does not include the surplus from electricity consumption and thus I focus on W̃ − W̃_UE, where W̃_UE is the corresponding welfare metric for the uniform emission standard regime. W̃ − W̃_UE measures the welfare gain of a given regulatory regime relative to the uniform emission standard.

6.1 Regulatory regimes

32 Allowing for a more complicated nonlinear damage function necessitates sophisticated techniques to estimate marginal damages across sources. Fowlie and Muller (2012) perform welfare analysis for non-uniformly mixed pollutants by utilizing the method for computing marginal damages developed in Muller and Mendelsohn (2009). They do not touch on issues arising in regulation with asymmetric information and costly information rents, which is my main focus.

I compute W̃ under the following regulatory regimes:

Full-information. The planner observes θ and e so incentive compatibility constraints are not relevant. Define the first best allocation as the pair (s^FB(θ), e^FB(θ)) that solves

\[ p\, \frac{dS(s(\theta))}{ds} = -(1+\lambda)\, \exp[\theta - e(\theta)]\, C'(s(\theta)), \qquad \psi'[e(\theta)] = \exp[\theta - e(\theta)]\, C(s(\theta)). \]

The planner pays firms a transfer that is just enough to cover costs. Thus the full-information regime is characterized by

\[ \theta \mapsto \left( s^{FB}(\theta),\; e^{FB}(\theta),\; \exp[\theta - e^{FB}(\theta)]\, C(s^{FB}(\theta)) + \psi[e^{FB}(\theta)] \right). \]

Optimal regulation. The planner chooses (s, e, t) to maximize welfare subject to individual rationality and incentive compatibility constraints. The optimal mechanism is fully characterized in the appendix and is similar to the mechanism characterized in Proposition 2 of Laffont (1994). Allocations (s(θ), e(θ)) deviate from (s^FB(θ), e^FB(θ)) except for the most efficient type, because of the planner's desire to reduce information rents. The most inefficient type earns zero profits while the rest earn strictly positive profits.

Uniform emissions standard (s̄ = efficient standard). The planner requires s(θ) to be equal to the emissions standard s̄ for all θ. The efficient uniform emission standard is the emission rate that maximizes allocative efficiency under the constraint that all firms have s(θ) = s̄. Given s̄, the planner induces the firms to choose effort e(θ) such that

\[ \psi'[e(\theta)] = \exp[\theta - e(\theta)]\, C(\bar{s}) \]

and offers transfers that satisfy individual rationality and incentive compatibility constraints.

Emission tax. The planner sets an emission tax of p/(1 + λ) per ton. This leads to firms choosing the allocation (s(θ), e(θ)) = (s^FB(θ), e^FB(θ)), which maximizes allocative efficiency. The planner chooses transfers such that individual rationality and incentive compatibility constraints are satisfied given the first best allocation. Transfers are allowed to depend on type.

Hybrid: emission tax with opt-out. The planner offers firms two choices: IN or OUT. If the firm chooses IN, it is required to pay an emission tax of p/(1 + λ) per ton and in return will be provided a transfer t. The transfer does not depend on the firm's type, unlike in the previous emission tax regime. If the firm chooses OUT, it is required to set s(θ) equal to the first best emission rate of the most inefficient type, which I define as θ̄. The firm is paid a transfer equal to the total cost (including disutility) of θ̄, i.e.

\[ \exp[\bar{\theta} - e^{FB}(\bar{\theta})]\, C(s^{FB}(\bar{\theta})) + \psi[e^{FB}(\bar{\theta})]. \]

Figure 6 summarizes the hybrid contract.

Figure 6: Hybrid regime
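The first best conditions in the full-information regime can be worked through numerically. The sketch below is only an illustration under assumed functional forms: C(s) = K * s**b with b < 0, ψ(e) = exp(e), and a linear conversion S(s) = kappa * s. The constants K, b and kappa are hypothetical placeholders, not the paper's estimates. Under these assumptions the two first-order conditions have a closed-form solution.

```python
import math


def first_best(theta, p_ton, lam, kappa=1.0, K=1.0e5, b=-0.5):
    """Closed-form (s_FB, e_FB) under the assumed functional forms.

    psi'(e) = exp(theta - e) * K * s**b gives
        e = (theta + ln K + b * ln s) / 2,
    and substituting into the emission-rate condition gives
        s = ((1 + lam) * |b| * sqrt(exp(theta) * K) / D) ** (1 / (1 - b / 2)).
    """
    D = p_ton * kappa  # marginal damage per unit of emission rate
    A = math.sqrt(math.exp(theta) * K)
    s = ((1.0 + lam) * abs(b) * A / D) ** (1.0 / (1.0 - b / 2.0))
    e = 0.5 * (theta + math.log(K) + b * math.log(s))
    return s, e


def foc_residuals(theta, s, e, p_ton, lam, kappa=1.0, K=1.0e5, b=-0.5):
    """Residuals of the two first-order conditions at (s, e)."""
    op_cost = math.exp(theta - e) * K * s ** b
    r_s = p_ton * kappa - (1.0 + lam) * abs(b) * op_cost / s
    r_e = math.exp(e) - op_cost
    return r_s, r_e


# Example with p = 300 and lambda = 0.3, two of the simulation values in the text.
s_fb, e_fb = first_best(theta=0.0, p_ton=300.0, lam=0.3)
residuals = foc_residuals(0.0, s_fb, e_fb, p_ton=300.0, lam=0.3)
```

In this sketch the first best emission rate rises with θ (less abatement is asked of less efficient types), falls with the marginal damage p and rises with the cost of public funds λ.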

6.2 Results and analysis

I examine welfare gains under three different values for the constant marginal damage: p = 100, 300 and 1000. The range of emission permit prices during Phase I was about $60 to about $300 per ton. Moreover, the range of emission tax rates under the proposed Sulfur and Nitrogen Emissions Tax Act of 1987 (H.R. 2497) is $300 to $900 per ton. Thus these choices of constant marginal damages are reasonable approximations of what policy-makers had in mind with respect to marginal damage from SO2 emissions. Finally, I look at two values for the cost of public funds: λ = 0.3 and 0.7. The value λ = 0.3 comes from estimates of the cost of public funds for the US in the public finance literature (Laffont, 2005; Ballard, Shoven and Whalley, 1985) while λ = 0.7 reflects an environment where taxes are difficult to collect.

Tables 6 and 7 present the welfare gains and mean emission rates under the different regulatory regimes and parameter constellations. Welfare gains measure the improvement in welfare under the regime when compared to a uniform emission standard. These gains are in millions of 1995 dollars and are interpreted as the average annual gain per firm. The mean emission rates are in terms of lbs/MMBtu. All of these measures reflect averaging across types. Annual welfare gains range from $32M to $155M per firm. These gains represent about 10% (i.e. 32/330) to 47% of the average O&M variable cost in my sample of electric utilities.

Table 6: Welfare gains
W̃ − W̃_UE ($M); Emiss std = Efficient

(λ, p)       (0.3, 100)   (0.3, 300)   (0.3, 1000)   (0.7, 100)   (0.7, 300)   (0.7, 1000)
Full Info       104.4        136.6        178.7         215.4        280.6        370.0
Opt Reg          32.1         54.1         72.9          55.8        108.8        155.4
Tax              25.4         41.4         54.6          33.3         73.9         97.3
Hybrid           23.0         40.3         53.1          29.1         74.9         98.6

Notes: Welfare gains are relative to a uniform emission standard that maximizes allocative efficiency.

Table 7: Mean emission rates and efficient uniform standard
lbs/MMBtu

(λ, p)          (0.3, 100)   (0.3, 300)   (0.3, 1000)   (0.7, 100)   (0.7, 300)   (0.7, 1000)
Opt Reg            3.41         2.02         0.86          4.10         2.60         1.27
Tax                2.90         1.40         0.55          3.31         1.72         0.68
Hybrid             3.05         1.85         0.73          3.77         2.27         0.90
Efficient std      3.75         1.61         0.63          4.61         1.98         0.78

Welfare gains increase with both the constant marginal damage parameter p and the cost of public funds λ. The main weakness of a uniform emission standard is its lack of flexibility in terms of emission allocations across heterogeneous firms. The gains from flexibility that optimal pollution regulation is able to achieve come from two sources. First, a more flexible emission allocation scheme increases allocative efficiency, i.e. the proper balance between marginal damages from emissions and marginal abatement costs. More inefficient firms have higher abatement costs so less abatement is required for these types. Second, flexibility allows the planner to reduce information rents by lowering abatement levels for types that have larger impacts on overall information rents. Lowering required abatement for inefficient types lowers the reward of more efficient types from claiming to be inefficient, hence less information rents have to be paid. Figure 7 shows the division of welfare gains into these two sources. Almost all of the gains from flexibility come from reduction in information rents. Information rents are large under the uniform standard because inefficient types are required to abate the same level as efficient types. The more stringent the standard is, the larger are the information rents. The gain from allocative efficiency is bounded above by the difference in allocative efficiencies under the first best allocation and under the efficient uniform standard, i.e.

" ! # 1 X FB 1 + (1 + j s j) s ( ) s D j sj N 2

where D > 0 is the equivalent marginal damage from an increase in emission rates. This upper bound only depends on the difference between the mean emission rate under the first best allocation and the efficient uniform standard. When this gap is small, the gains from allocative efficiency are also small.

Figure 7: Welfare gains due to allocative efficiency vs gains due to information rent extraction

Laffont (1994) suggests that optimal pollution regulation can be implemented using differentiated emission taxes and transfers. For example, the transfer will be a function of the firm's reported cost while the tax will be a function of the reported emission rate. Each type reports a different combination of cost and emission rate, and in return receives a different transfer and faces a different tax rate. When the number of types is large, such a policy would be difficult to implement. An interesting question then is how well simpler regulatory regimes perform. I first look at a uniform emission tax regime that provides differentiated transfers to firms. The welfare gain under this regime is the upper bound for the class of regimes with uniform emission taxes since this has the most flexible compensation scheme. Second, I consider a hybrid regime where firms can choose either to participate in the uniform emission tax regime or to opt out and join a lenient emission standard. If the firm decides to pay emission taxes, it receives a transfer that is not differentiated across types. If the firm opts out, then it will be required to have the first best emission rate of the most inefficient firm. In exchange it receives a transfer equal to the cost of the most inefficient firm. Table 8 shows how much of the welfare gains from optimal regulation is captured by simpler regimes. The opt-out emission rates are given by Table 9.

Table 8: Percent of welfare gains from optimal regulation captured by simple contracts
%; Emiss std = Efficient

(λ, p)    (0.3, 100)   (0.3, 300)   (0.3, 1000)   (0.7, 100)   (0.7, 300)   (0.7, 1000)
Tax          79.1         76.5         74.9          59.7         67.9         62.6
Hybrid       71.7         74.5         72.8          52.2         68.8         63.4

Table 9: Opt-out emissions standard
lbs/MMBtu

(λ, p)        (0.3, 100)   (0.3, 300)   (0.3, 1000)   (0.7, 100)   (0.7, 300)   (0.7, 1000)
Opt-out std      7.00         5.25         2.08          7.00         6.46         2.55

The emissions tax regime can capture from 60% to 80% of the welfare gains from optimal regulation. These numbers indicate that a uniform emission tax regime can yield welfare gains that are not significantly far from the more complicated optimal mechanism.

Although allocations are decentralized in the emission tax regime, it is complicated to implement since transfers are type-dependent. The hybrid regime is basically an emission tax regime with a type-independent transfer so it is a simpler alternative. A hybrid regime with 100% participation (no opt-out) is clearly welfare dominated by the tax regime with differentiated transfers because the planner leaves higher information rents in the former. The nice thing about the hybrid regime is that it can lower information rents by allowing firms to opt out. However, opt-out distorts allocative efficiency and so has a negative effect on welfare. It turns out that if the gain from lowering information rents is sufficiently large relative to the loss from allocative efficiency distortions, then it is possible that the hybrid regime can do better than the uniform emission tax regime with differentiated transfers. One such case is when λ is large. Table 8 shows that when λ = 0.7 and p ≥ 300, the hybrid is actually better than the emission tax regime with differentiated transfers. When λ = 0.3, the hybrid regime is worse; however, the gap is not huge. The intuition for why the hybrid regime works is precisely the intuition for optimal regulation: there is a tradeoff between allocative efficiency and information rent extraction. The hybrid regime can be seen as a binary menu in the spirit of Rogerson (2003) and Chu and Sappington (2007). Figure 8 plots the cdf of the distribution of emission rates s(θ) under different regimes. The hybrid regime basically approximates the distribution under optimal regulation in a limited way.

Figure 8: CDF of emission rates under different regimes (λ = 0.7, p = 300)
[Plot titled "Distribution of emission rates (λ=0.7, p=300)". Horizontal axis: Emission rate, from 0 to 10. Vertical axis: from 0 to 1. Legend: Opt Reg, First best, Hybrid, Uniform standard.]
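The shares reported in Table 8 follow directly from the welfare gains in Table 6: each entry is the gain of the simple regime divided by the gain under optimal regulation for the same (λ, p). The short sketch below redoes that arithmetic from the published table values; the variable and function names are only illustrative.

```python
# Parameter columns (lambda, p) in the order used by Tables 6 and 8.
COLUMNS = [(0.3, 100), (0.3, 300), (0.3, 1000), (0.7, 100), (0.7, 300), (0.7, 1000)]

# Welfare gains relative to the efficient uniform standard, in $M (Table 6).
GAINS = {
    "Opt Reg": [32.1, 54.1, 72.9, 55.8, 108.8, 155.4],
    "Tax": [25.4, 41.4, 54.6, 33.3, 73.9, 97.3],
    "Hybrid": [23.0, 40.3, 53.1, 29.1, 74.9, 98.6],
}


def share_of_optimal(regime):
    """Percent of the optimal-regulation gain captured by a regime."""
    return [
        round(100.0 * gain / optimal, 1)
        for gain, optimal in zip(GAINS[regime], GAINS["Opt Reg"])
    ]


tax_share = share_of_optimal("Tax")
hybrid_share = share_of_optimal("Hybrid")

# Parameter combinations where the hybrid regime beats the tax regime.
hybrid_beats_tax = [
    col for col, h, t in zip(COLUMNS, GAINS["Hybrid"], GAINS["Tax"]) if h > t
]
```

The computed shares match the Table 8 entries to one decimal place, and the hybrid regime outperforms the tax regime only in the two columns with λ = 0.7 and p ≥ 300.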

7 Conclusion

Annual welfare gains from optimal pollution regulation relative to an efficient uniform emission standard range from $32 million to $155 million per electric utility, or about 10% to 47% of electricity generation costs. The optimal form of regulation can be theoretically implemented by designing a menu of type-dependent emission tax rates and transfers. A simpler and more practical way to allocate emission rates is through a uniform emission tax. A regime with a uniform emission tax and type-dependent transfers can capture from 60% to 80% of these gains. However, this still requires the social planner to design transfers that depend on the firm's type. I consider a hybrid regime where both the emission tax and transfer are uniform but firms are allowed to opt out and join a uniform emission standard. The hybrid regime captures from 52% to 75% of the welfare gains and can even do better than the more complicated emission tax regime if the cost of public funds is high.

I use a model of rate regulation to identify and estimate the firm's hidden type and disutility from exerting effort. In the model and analysis, I did not explicitly model capital choice and how it can be used as a signal during the rate case. The primary mode of compliance during the time period I study was fuel-switching so capital-based compliance methods played a smaller role. However, more recent data reflect greater popularity of capital-based compliance methods; hence, explicitly modeling capital choice is important. Future research will deal with this more general case. The optimal mechanism in this case is the solution to a non-separable multidimensional screening problem. While this is a complicated problem to solve analytically, numerical methods can be used with a discretized type space.

Another avenue for future research is to compare my estimates with estimates from a normative model. The nonparametric identification strategy developed recently by Perrigne and Vuong (2011) can be used to estimate Laffont and Tirole's (1986) normative model. A formal econometric test along the lines of Vuong (1989) and Smith (1992) can be used to assess which model is a better fit to the data.

Finally, regulation in my case is static. One reason for this is that Title IV, as originally conceived, is a long-term program. However, EPA, starting in 2005, decided to redesign the program to take into account the cross-state transport effects of SO2 emissions. In doing so, it updated the estimates of marginal abatement costs. Some firms and states sued the EPA and, up until today, the future of this policy is largely uncertain. Because of the ability of the regulator to update its information about the firms and change the policy accordingly, it would be more expensive to incentivize firms to reveal their types. Thus, policies that reduce information rents would probably yield higher welfare than those that focus on allocative efficiency.

References

Ackerberg, D. A., K. Caves and G. Frazer (2006), “Structural Identification of Production Functions,” mimeo, UCLA.
Alt, L. E. (2006), Energy Utility Rate Setting: A Practical Guide to the Retail Rate-Setting Process for Regulated Electric and Natural Gas Utilities, Morrisville, NC: Lulu.com.
Arellano, M. (2003), Panel Data Econometrics, Oxford: Oxford University Press.
Arellano, M. and S. Bonhomme (2012), “Identifying Distributional Characteristics in Random Coefficients Panel Data Models,” Review of Economic Studies, 79:3, 987-1020.
Arellano, M. and B. Honoré (2001), “Panel data models: some recent developments,” in J.J. Heckman and E.E. Leamer (eds.), Handbook of Econometrics, Vol. 5, Amsterdam: Elsevier Science.

Armstrong, M. and D. E. M. Sappington (2007), “Recent Developments in the Theory of Regulation,” in M. Armstrong and R. Porter (eds.), Handbook of Industrial Organization, Vol. 3, Amsterdam: Elsevier Science.
Asker, J. (2010), “A Study of the Internal Organization of a Bidding Cartel,” American Economic Review, 100:3, 724-762.
Bailey, E. E. and R. D. Coleman (1971), “The Effect of Lagged Regulation in an Averch-Johnson Model,” Bell Journal of Economics, 2:1, 278-292.
Ballard, C. L., J. B. Shoven and J. Whalley (1985), “General Equilibrium Computations of the Marginal Welfare Costs of Taxes in the United States,” American Economic Review, 75:1, 128-138.
Banks, J. S. (1992), “Monopoly Pricing and Regulatory Oversight,” Journal of Economics & Management Strategy, 1:1, 203-233.
Banks, J. S. and J. Sobel (1987), “Equilibrium Selection in Signaling Games,” Econometrica, 55:3, 647-661.
Baron, D. P. and D. Besanko (1984), “Regulation, Asymmetric Information, and Auditing,” RAND Journal of Economics, 15:4, 447-470.
Baron, D. P. and R. B. Myerson (1982), “Regulating a Monopolist with Unknown Costs,” Econometrica, 50:4, 911-930.
Baumol, W. J. and A. K. Klevorick (1970), “Input Choices and Rate-of-Return Regulation: An Overview of the Discussion,” Bell Journal of Economics and Management Science, 1:2, 162-190.
Beran, R. and P. Hall (1992), “Estimating Coefficient Distributions in Random Coefficient Regressions,” Annals of Statistics, 20:4, 1970-1984.
Besanko, D. (1985), “On the Use of Revenue Requirements Regulation Under Imperfect Information,” in M. A. Crew (ed.), Analyzing the Impact of Regulatory Change in Public Utilities, Lexington, MA: Lexington Books, 39-55.
Besanko, D. and D. F. Spulber (1992), “Sequential-Equilibrium Investment by Regulated Firms,” RAND Journal of Economics, 23:2, 153-170.
Bonhomme, S. and J.-M. Robin (2010), “Generalized Nonparametric Deconvolution with an Application to Earnings Dynamics,” Review of Economic Studies, 77:2, 491-533.

Brocas, I., K. Chan and I. Perrigne (2006), “Regulation under Asymmetric Information in Water Utilities,” American Economic Review, Papers and Proceedings, 96, 62-66.
Bushnell, J. and C. Wolfram (2007), “The Guy at the Controls: Labor Quality and Power Plant Efficiency,” NBER working paper 13215.
Carlson, C. P., D. Burtraw, M. Cropper and K. Palmer (2000), “SO2 Control by Electric Utilities: What are the Gains from Trade,” Journal of Political Economy, 108:6, 1292-1326.
Chan, G., R. Stavins, R. Stowe and R. Sweeney (2012), “The SO2 Allowance-trading System and the Clean Air Act Amendments of 1990: Reflections on 20 Years of Policy Innovation,” National Tax Journal, 65:2, 419-452.
Chan, H. S., H. Fell, I. Lange and S. Li (2012), “Efficiency and Environmental Impacts of Electricity Restructuring on Coal-fired Power Plants,” mimeo, Univ of Maryland.
Chu, L. Y. and D. E. M. Sappington (2007), “Simple Cost-Sharing Contracts,” American Economic Review, 97:1, 419-428.
Cramton, P. and S. Kerr (2002), “Tradeable Carbon Permit Auctions: How and why to auction not grandfather,” Energy Policy, 30, 333-345.
Ellerman, A. D., P. L. Joskow, R. Schmalensee, J.-P. Montero and E. Bailey (2000), Markets for Clean Air: The U.S. Acid Rain Program, Cambridge, UK: Cambridge University Press.
Environmental Protection Agency (2001), Acid Rain Program: 2001 Progress Report, EPA-430-R-02009.
Environmental Protection Agency (2007, June 8), What is Acid Rain? Retrieved from http://www.epa.gov/acidrain/
Environmental Protection Agency (2009, May 13), Effects of Acid Rain - Human Health, Retrieved from http://www.epa.gov/acidrain/effects/health.html.
Evdokimov, K. (2008), “Identification and Estimation of a Nonparametric Panel Data Model with Unobserved Heterogeneity,” mimeo, Yale University.
Evdokimov, K. (2010), “Nonparametric Identification of a Nonlinear Panel Model with Application to Duration Analysis with Multiple Spells,” mimeo, Princeton University.
Evdokimov, K. and H. White (2012), “Some Extensions of a Lemma of Kotlarski,” Econometric Theory, 28:4, 925-932.

Fabrizio, K., N. Rose and C. Wolfram (2007), “Do Markets Reduce Costs? Assessing the Impact of Regulatory Restructuring on US Electric Generation Efficiency,” American Economic Review, 97:4, 1250-1277.
Feller, W. (1971), An Introduction to Probability Theory and its Applications, Vol 2, New York: Wiley.
Fowlie, M. (2010), “Emissions Trading, Electricity Industry Restructuring, and Investment in Pollution Control,” American Economic Review, 100:3, 837-869.
Fowlie, M. and N. Muller (2012), “Market-based emissions regulation when damages vary across sources: What are the gains from differentiation?” mimeo, UC Berkeley.
Gagnepain, P. and M. Ivaldi (2002), “Incentive Regulatory Policies: The Case of Public Transit Systems in France,” RAND Journal of Economics, 33:4, 605-629.
Gandhi, A., S. Navarro and D. Rivers (2011), “On the Identification of Production Functions: How Heterogeneous is Productivity,” mimeo, Univ of Wisconsin-Madison.
Goulder, L. H., I. W. H. Parry and D. Burtraw (1997), “Revenue-raising versus Other Approaches to Environmental Protection: The Critical Significance of Preexisting Tax Distortions,” RAND Journal of Economics, 28:4, 708-731.
Hausdorff, F. (1923), “Momentprobleme für ein endliches Intervall,” Mathematische Zeitschrift, 16, 220-246.
Hendel, I. and A. Nevo (2012), “Intertemporal Price Discrimination in Storable Goods Markets,” mimeo, Northwestern Univ.
Horowitz, J. L. and M. Markatou (1996), “Semiparametric Estimation of Regression Models for Panel Data,” Review of Economic Studies, 63:1, 145-168.
Joskow, P. L. (1974), “Inflation and Environmental Concern: Structural Change in the Process of Public Utility Price Regulation,” Journal of Law and Economics, 17:2, 291-327.
Joskow, P. L. (2008), “Incentive Regulation and its Application to Electricity Networks,” Review of Network Economics, 7:4, 547-560.
Joskow, P. L. and R. Schmalensee (1998), “The Political Economy of Market-Based Environmental Policy: The U.S. Acid Rain Program,” Journal of Law and Economics, 41:1, 37-83.

Kahn, A. E. (1988), The Economics of Regulation: Principles and Institutions, Cambridge, MA: MIT Press.
Kotlarski, I. I. (1967), “On Characterizing the Gamma and Normal Distribution,” Pacific Journal of Mathematics, 20, 69-76.
Krasnokutskaya, E. (2011), “Identification and Estimation of Auction Models with Unobserved Heterogeneity,” Review of Economic Studies, 78:1, 293-327.
Laffont, J.-J. (1994), “Regulation of Pollution with Asymmetric Information,” in C. Dosi and T. Tomasi (eds.), Nonpoint Source Pollution Regulation: Issues and Analysis, Kluwer Academic Publishers, 39-66.
Laffont, J.-J. (2005), Regulation and Development, Cambridge, UK: Cambridge University Press.
Laffont, J.-J. and J. Tirole (1986), “Using Cost Observation to Regulate Firms,” Journal of Political Economy, 94, 614-641.
Laffont, J.-J. and J. Tirole (1993), A Theory of Incentives in Procurement and Regulation, Cambridge, MA: MIT Press.
Lazarev, J. (2011), “The Welfare Effects of Intertemporal Price Discrimination: An Empirical Analysis of Airline Pricing in U.S. Monopoly Markets,” mimeo, Stanford GSB.
Leslie, P. (2004), “Price Discrimination in Broadway Theater,” RAND Journal of Economics, 35:3, 520-541.
Levinsohn, J. and A. Petrin (2003), “Estimating Production Functions Using Inputs to Control for Unobservables,” Review of Economic Studies, 317-342.
Lewis, T. R. (1996), “Protecting the Environment when Costs and Benefits are Privately Known,” RAND Journal of Economics, 27:4, 819-847.
Miravete, E. J. (2007), “The Limited Gains from Complex Tariffs,” mimeo, Univ of Texas-Austin.
Li, T. and Q. Vuong (1998), “Nonparametric Estimation of the Measurement Error Model Using Multiple Indicators,” Journal of Multivariate Analysis, 65, 139-165.
Li, T., I. Perrigne and Q. Vuong (2000), “Conditionally Independent Private Information in OCS Wildcat Auctions,” Journal of Econometrics, 98, 129-161.

Muller, N. and R. Mendelsohn (2009), “Efficient Pollution Regulation: Getting the Prices Right,” American Economic Review, 99:5, 1714-1739.
National Acid Precipitation Assessment Program (2005), National Acid Precipitation Assessment Program Report to Congress: An Integrated Assessment, Retrieved from http://ny.water.usgs.gov/projects/NAPAP/
Nelson, R. A. (1985), “Returns to Scale from Variable and Total Cost Functions,” Economics Letters, 18, 271-276.
Olley, S. and A. Pakes (1996), “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, 64, 1263-1295.
Pint, E. (1992), “Price-Cap Versus Rate-of-Return Regulation in a Stochastic-Cost Model,” RAND Journal of Economics, 23:4, 564-578.
Perrigne, I. and Q. H. Vuong (2011), “Nonparametric Identification of a Contract Model With Adverse Selection and Moral Hazard,” Econometrica, 79:5, 1499-1539.
Perry, R. H., D. W. Green, J. O. Maloney (eds.) (1997), Perry's Chemical Engineers' Handbook, McGraw-Hill.
Rao, B. L. S. Prakasa (1992), Identifiability in Stochastic Models: Characterization of Probability Distributions, San Diego: Academic Press.
Rogerson, W. (2003), “Simple Menus of Contracts in Cost-Based Procurement and Regulation,” American Economic Review, 93:3, 919-926.
Schennach, S. M. (2004), “Estimation of Nonlinear Models with Measurement Error,” Econometrica, 72:1, 33-75.
Schmalensee, R. and R. N. Stavins (2012), “The SO2 Allowance Trading System: The Ironic History of a Grand Policy Experiment,” Harvard Kennedy School Faculty Working Paper Series, RWP12-030.
Shohat, J. A. and J. D. Tamarkin (1943), The Problem of Moments, Mathematical Surveys Number 1, Providence, RI: American Mathematical Society.
Smith, R. J. (1992), “Non-Nested Tests for Competing Models Estimated by Generalized Method of Moments,” Econometrica, 60:4, 973-980.
Spulber, D. F. (1998), “Optimal Environmental Regulation under Asymmetric Information,” Journal of Public Economics, 35:2, 163-181.

Villas-Boas, S. B. (2009), “An Empirical Investigation of the Welfare Effects of Banning Wholesale Price Discrimination,” RAND Journal of Economics, 40:1, 20-46.
Vuong, Q. H. (1989), “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses,” Econometrica, 57:2, 307-334.
Wolak, F. A. (1994), “An Econometric Analysis of the Asymmetric Information, Regulator-Utility Interaction,” Annales d'Economie et de Statistiques, 34, 13-69.

Appendix

Proof of propositions

Proof of Proposition 1

Since the equilibrium auditing strategy α of the regulator is strictly increasing in R̂, the firm chooses R̂ according to the markup equation

\[ \hat{R} = R + \frac{1 - \alpha}{\alpha_{\hat{R}}}. \]

The regulator's equilibrium auditing strategy α is not a function of Ĉ so I can write equilibrium R̂ as just R̂(R). The equilibrium strategy is increasing in R since U_{RR̂} > 0 and U is supermodular. Next, since α is not a function of Ĉ, the firm does not have any incentive to exert effort during the rate case so e1 = 0 for all (θ, R). After the rate case, e2 equates the marginal disutility of effort with the marginal benefit. Notice that e2 is just a function of θ.

On the equilibrium path, ϱ = R so the regulator's first order condition becomes

\[ A'(\alpha) = 2\,(\hat{R} - R). \]

The firm's equilibrium choice of R̂ satisfies the markup equation so I can rewrite the regulator's FOC as

\[ A'(\alpha) = 2\, \frac{1 - \alpha}{\alpha_{\hat{R}}}. \]

Since α is not a function of Ĉ, this is just a separable ordinary differential equation:

\[ \int \frac{A'(\alpha)}{1 - \alpha}\, d\alpha = 2\hat{R}. \tag{14} \]

Off-equilibrium proposals can be grouped into two. The first group involves the firm still proposing the equilibrium R̂(R) but some other Ĉ ≠ exp(θ) C. The second group involves the firm proposing a different off-equilibrium R̂, i.e. type R proposes R̂ ≠ R̂(R). For the first group, since α is not a function of Ĉ, off-equilibrium proposals involving just Ĉ do not change the regulator's behavior. Given this, the firm does not have an incentive to deviate from exerting zero effort during the rate case.

For the second group, consider deviations R̂ ≤ R̂(R_U). This means that the type R firm is proposing someone else's proposal, say R̂(R′), and is audited as if the firm was R′. This is not a profitable deviation for the firm since it is interior but does not satisfy the markup equation for type R. Now suppose the firm deviates by proposing some R̂ > R̂(R_U). The equilibrium auditing strategy defined in equation (14) allows R̂ > R̂(R_U). For these proposals, the auditing strategy treats the firm as if it were a type above R_U and remains strictly increasing. Since the firm's optimal proposal is increasing in R and R ≤ R_U, this deviation is not profitable for type R.

Proof of Proposition 2 [[TBD]] bU and types di¤er only on their In this equilibrium, all types ( ; R) propose the highest possible RRB, R b Given the regulator’s equilibrium auditing strategy , a type ( ; R) …rm reported operating cost, C. chooses e1 = e1 ( ; R) such that h 2

b C

bU R

i 1 exp (

R

e1 ) C =

0

(e1 ) .

(15)

b ( ; R) = exp ( The reported operating cost of ( ; R) is thus C

e1 ( ; R)) C. Since the cardinality of the b we necessarily have pooling of type space [0; U ] [0; RU ] is larger than that of the message space of C, b Formally, the set of types that report a given C b in equilibrium is given by types at di¤erent values of C. ) ! ( i h b C 0 b . bU R b b = ( ; R) : (16) R 1 C = 2 Cb C ln T C C To show that e1 ( ; R) > 0 for any ( ; R), I assume

(e) = exp (e) for convenience. Note that I have

used a similar exponential form for in the empirical part of the paper. Given this functional form, I b as “iso-cost” curves in R can represent T C space: = ln

( h

2

b C

b C

bU R

R

1

iC b2 C

)

.

First, I want to show that the marginal bene…t of exerting e¤ort is strictly positive, i.e. h i b bU R 2 Cb C R 1 >0 or equivalently,

bU R ln To show the …rst requirement, recall that ( h b ln 2 Cb C

De…ne R as the value of R such that ( h ln 2 Note that R

hence

bU R

0.

)

= 0.

R

R

iC b2 1 C

bU R

b C

b C

)

iC b2 1 C

R since the left-hand side of the inequality is decreasing in R. From this equation we get 1

bU R =R

2

b C

b C

2

b C

1

bU R ln C

3

.

I now discuss the regulator’s equilibrium auditing strategy. On the equilibrium path, the regulator chooses

such that A0 ( ) =

Z

b 2 R

Using equation (17) gives A0 ( ) = Let

8 Z < :

1 b C

b C

\ exp ( )= Then

is the solution to

Z

R dF

; Rj ( ; R) 2 T 9 =

C exp ( ) + 1 dF b ; C exp ( ) dF

b C

.

b C

; Rj ( ; R) 2 T

b C

; Rj ( ; R) 2 T

h i \ b+C b+ A ( ) = exp ( )C ln C

.

.

\ for some constant of integration . O¤-equilibrium, exp ( ) is based on some arbitrary belief. Proof of Proposition 3 b and C b 0 with C b>C b 0 . Let R be the R-type that picks R; b C b in the data and similarly 1. Consider C b C b 0 . Using the de…nition of R0 be the R-type that picks R; b C b R;

b C b0 R;

=

=

n

b C b R;

b C b R;

h b R

h b b C b0 R R; oh i b C b0 b R R; R

R

i

where the second line comes from adding and subtracting T T0

n R : pick n = R : pick

=

gives

b C b R;

b C b0 R;

b C b0 R;

o

h b R

R0

i

b C b0 R;

(18) R

R0

(19)

i R . Let

o

Note T and T 0 can be nonsingleton sets. Let % and %0 be the corresponding “beliefs” about

R for each signal. From the regulator’s optimal auditing strategy and strict convexity of A ( ), b C b > b C b 0 if and only if % < %0 . The inequality % < %0 is equivalent to R; R; we have

$\max\{T\} < \min\{T'\}$ since pooling sets are intervals. Finally, $\max\{T\} < \min\{T'\}$ is equivalent to

b C b > R;

R < R0 since R 2 T and R0 2 T 0 . Therefore

this to equation (19) yields

b C b = R;

b C b0 , R;

b C b 0 if and only if R < R0 . Applying R;

b C b = R;

2. The following lemma is useful for the proof:

b C b0 . R;

Lemma 1 Suppose there exist two distinct $R$-types, $R$ and $R''$, that pick $(\hat{R}, \hat{C})$ in equilibrium. Then $\hat{R} = \hat{R}_U$.

b R b>R b0 . Let R be the R-type that picks R; b C b Consider R b0 ; C b . Using the de…nition of similarly R0 be the R-type that picks R h i b0 ; C b R b R give R b C b R;

b0 ; C b = R

n

b C b R;

oh b R

b0 ; C b R

i R +

b0 ; C b R

i R00 .

in the data and

and adding and subtracting

h

b R

R

b0 R

R0

i

.

bU > R b>R b0 , we can conclude that R is the only R-type that Using lemma 1 and the fact that R b C b and R0 is the only R-type that picks R b0 ; C b in equilibrium. Thus the regulator’s picks R; belief pins down R and R0 , i.e. % = R and %0 = R0 . Since b %>R b0 %0 , we have R

Identifying and estimating

b C b > R;

and R

b0 ; C b , R

b C b > R;

b C b R;

b0 ; C b . R

If we knew the function , then we can get R from the markup equation: 1

b R=R

. b R

The main task then is to identify and estimate . Consider the disallowance b =R 59

R

>

b b0 ; C R

if and only if

which is part of the data. Using the de…nition of R in the model, we can link b R

= b of the …rm satis…es The equilibrium R

b R

R .

1

R=

(1

=

to this ODE is

b in this equilibrium. The solution is not a function of C

For estimation, I approximate the function

e¤ects.

.

b R

b = R so that I can compute

)

n h b = 1 + exp R

where

. b R

Thus we have the following di¤erential equation:

This is an ordinary di¤erential equation since

with :

Z

1 b R

b R

io

1

b dR

b by a linear function in R, b i.e. R

b easily. To get the coe¢ cients, I regress R

b = a0 + a1 R, b R

b and …rm, state and year on R

Characterization of optimal pollution regulation

The regulator maximizes welfare $\widetilde{W}$ subject to individual rationality (IR) and incentive compatibility (IC) constraints. Although the original type space is two-dimensional, there is no instrument to screen $R$-types. Thus all firms will pool at the highest possible $R$. I solve the problem as a one-dimensional screening problem since $R$ does not affect welfare comparisons (except for the full information regime). The distribution of types is discrete, so I adapt standard methods for continuous types (e.g. Laffont and Tirole, 1993; Laffont, 1994) to my setting.

The first step is to reduce the set of IC constraints into upward local ICs. I solve the problem in terms of firms' profits instead of transfers. For any types $\theta_i$ and $\theta_j$, IC requires

$$\pi_i \ge \pi_j + [\exp\theta_j - \exp\theta_i]\exp(-e_j)\,s_j^{\alpha_s}$$

$$\pi_j \ge \pi_i - [\exp\theta_j - \exp\theta_i]\exp(-e_i)\,s_i^{\alpha_s}.$$

Combining these, we get (taking $\theta_i < \theta_j$)

$$\exp(-e_i)\,s_i^{\alpha_s} \ge \exp(-e_j)\,s_j^{\alpha_s}.$$

As long as the $(s, e)$'s satisfy this inequality, we can focus on upward local ICs. I solve the reduced problem and check this inequality ex post. By standard arguments, the IR of the most inefficient type will be binding while the upward local ICs of the remaining types will be binding. Thus $\pi_N = 0$ and, for $i = 1, 2, \ldots, N-1$,

$$\pi_i = \pi_{i+1} + [\exp\theta_{i+1} - \exp\theta_i]\exp(-e_{i+1})\,s_{i+1}^{\alpha_s}.$$
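As a numerical illustration of the binding constraints above, the following sketch computes the rents implied by a zero rent for the least efficient type and binding upward local ICs, and then checks that the rents sum to the cost terms weighted by $(i-1)$ times the gap between adjacent types. The names `theta` and `cost_factor` (the type-independent part of cost attached to each contract) and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper's estimates.

```python
import math

def information_rents(theta, cost_factor):
    """Rents implied by a binding IR for the least efficient type and
    binding upward local ICs for all other types.

    theta[i] is the (hypothetical) type of firm i, sorted from most to
    least efficient; cost_factor[i] is the type-independent cost term of
    the contract designed for firm i.
    """
    n = len(theta)
    rents = [0.0] * n  # last entry stays 0: IR binds for the least efficient type
    for i in range(n - 2, -1, -1):
        gap = math.exp(theta[i + 1]) - math.exp(theta[i])
        rents[i] = rents[i + 1] + gap * cost_factor[i + 1]
    return rents

def weighted_rent_total(theta, cost_factor):
    """Sum of (i - 1) * (exp(theta_i) - exp(theta_{i-1})) * cost_factor_i,
    with i counted from 1 as in the text."""
    total = 0.0
    for k in range(1, len(theta)):  # k is the 0-based index, so k equals i - 1
        gap = math.exp(theta[k]) - math.exp(theta[k - 1])
        total += k * gap * cost_factor[k]
    return total

# Illustrative numbers only.
theta = [0.0, 0.2, 0.5, 0.9]
cost_factor = [1.4, 1.1, 0.9, 0.6]

rents = information_rents(theta, cost_factor)
assert rents[-1] == 0.0
assert math.isclose(sum(rents), weighted_rent_total(theta, cost_factor))
print(rents)
```

The equality holds because the cost term of contract $i$ enters the rent of every more efficient type, and there are $i-1$ of them. With equally likely types, this appears to be the source of the $(i-1)$ weights in the regulator's rewritten objective.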

Given these, I can rewrite the regulator's objective function as

$$\widetilde{W} = \frac{1}{N}\sum_{i=1}^{N}\Big\{-D s_i - \psi(e_i) - [\exp\theta_i + (i-1)(\exp\theta_i - \exp\theta_{i-1})]\exp(-e_i)\,s_i^{\alpha_s}\Big\}$$

where $D > 0$ is the marginal damage from an increase in the emission rate. The first order condition with respect to $s_i$ is

$$D = -\alpha_s[\exp\theta_i + (i-1)(\exp\theta_i - \exp\theta_{i-1})]\exp(-e_i)\,s_i^{\alpha_s - 1}.$$

This FOC differs from the FOC for the first best emission rate because the regulator takes into account the effect of $s_i$ on the incentives of types $j = 1, 2, 3, \ldots, i-1$ to reveal their type. An increase in $s_i$ increases the required profits that the regulator has to give to all types that are more efficient than $\theta_i$ in the "second best" world. The first order condition with respect to $e_i$ is

$$\psi'(e_i) = [\exp\theta_i + (i-1)(\exp\theta_i - \exp\theta_{i-1})]\exp(-e_i)\,s_i^{\alpha_s}$$

and the same comments apply. Define

$$\Theta^{FB} = \exp\theta_i, \qquad \Theta^{OR} = \exp\theta_i + (i-1)(\exp\theta_i - \exp\theta_{i-1}).$$

Using the functional form$^{34}$ of $\psi$

to compute optimal e¤ort, the FOC with respect to si becomes D=e (

) 1+ si

s 1+

1

OR

) 1+ si

s 1+

1

FB

.

For the …rst best emission rate, the FOC is

Since

FB
