Does Teacher Sorting across and within Schools Explain Inequality in Education Outcomes?


Petra Thiemann

Working Paper Series n. 99 ■ August 2017

Statement of Purpose

The Working Paper series of the UniCredit & Universities Foundation is designed to disseminate and to provide a platform for discussion of either work of UniCredit economists and researchers or outside contributors (such as the UniCredit & Universities scholars and fellows) on topics which are of special interest to UniCredit. To ensure the high quality of their content, the contributions are subjected to an international refereeing process conducted by the Scientific Committee members of the Foundation. The opinions are strictly those of the authors and in no way commit the Foundation or the UniCredit Group.

Scientific Committee

Franco Bruni (Chairman), Silvia Giannini, Tullio Jappelli, Levent Kockesen, Christian Laux, Catherine Lubochinsky, Massimo Motta, Giovanna Nicodano, Marco Pagano, Reinhard H. Schmidt, Branko Urosevic.

Editorial Board

Annalisa Aleati, Giannantonio De Roni

The Working Papers are also available on our website (www.unicreditanduniversities.eu)

WORKING PAPER SERIES N.99 - AUGUST 2017


Contents

1 Introduction
2 The sorting problem
   2.1 General setup
   2.2 Teacher sorting: Institutional context and expected sorting patterns
3 Empirical approach
   3.1 Model
   3.2 Identification
   3.3 Estimation
4 Data
5 Results
   5.1 Descriptive evidence of variance contributions
   5.2 Event study
   5.3 Main results
   5.4 Spatial heterogeneity
   5.5 Simulations of counter-factual assignments
6 Conclusion


Does Teacher Sorting across and within Schools Explain Inequality in Education Outcomes?

Petra Thiemann*
Lund University; USC Dornsife INET and IZA

August 2017

Abstract Inequality in student access to high-quality teachers is frequently cited as a main source of inequality in students’ educational outcomes. This paper studies to what extent better teachers are matched with better classrooms (positive “assortative matching”) and quantifies the contribution of assortative matching to test score inequality across classrooms. I use data on the universe of elementary schools in North Carolina (grades 3-5) over a period of 15 years (1997-2011). Assortative matching between teachers and classrooms explains about 5 percent of the test score variance across classrooms. The amount of sorting within schools is about as large as the amount of sorting across schools. By means of a simulation, I compare the test score distribution under the observed teacher allocation to the test score distribution under a counterfactual random allocation of teachers across classrooms. I find that assortative matching explains 11-16 percent of the performance gap between a classroom at the 90th percentile and a classroom at the 10th percentile of the test score distribution. Furthermore, sorting patterns vary across geographic locations: Assortative matching between teachers and classrooms matters the most in rural areas, but the least in urban areas. In urban areas, by contrast, assortative matching between schools and students is one of the main drivers of education inequality. JEL codes: I20, I24, J24 Keywords: education, inequality, K-12, teacher quality, teacher switching, student sorting, matching

* Contact: Lund University, Department of Economics, School of Economics and Management, P.O. Box 7082, SE-220 07 Lund, Sweden, Phone: +46 (0)46 222 8657, E-mail: [email protected]. Petra Thiemann was a Postdoctoral Research Associate at USC Dornsife INET while working on this paper. The paper greatly benefited from helpful discussions with Khai X. Chiong, Bryan S. Graham, Matthew Kahn, Michael Leung, Elena Manresa, Hyungsik Roger Moon, Geert Ridder, Jorge De La Roca, Jorge Tamayo, Valentin Verdier, Jenny Williams, as well as seminar and conference participants. The paper was presented at the University of Southern California, the North American Summer Meeting of the Econometric Society, and the EEA conference 2017. I thank Young Miller from the USC Economics Department, Kara Bonneau from the North Carolina Education Research Data Center at Duke, and Jeremy Holt from USC Dornsife IT Services for their generous administrative support. I acknowledge funding from the USC Dornsife Institute for Economic Thinking and the USC Department of Economics. This paper won the EEA Young Economist Best Paper Award in August 2017.


1 Introduction

Teacher quality influences both student test scores and outcomes later in life (Chetty et al., 2014b). Not all students, however, have the same access to high-quality teachers. For example, students from minority or disadvantaged backgrounds are on average matched with teachers with lower levels of experience or preparation (Clotfelter et al., 2005, 2007, Lankford et al., 2002). Differences in access to high-quality teachers can thus give rise to inequalities in education outcomes (cf. Reardon et al., 2016). This paper provides direct evidence on the relationship between teacher sorting—across and within schools—and inequalities in student test score outcomes. Do better teachers work in better schools, i.e. schools that are better equipped, and/or schools with better students? Within schools, are better teachers assigned to classrooms with better-prepared students? To what extent do these sorting patterns exacerbate the performance gap between classrooms? This paper focuses on teacher sorting across and within elementary schools. Given that students form important general skills and habits early in life, this stage of schooling can prove particularly important (Cunha et al., 2006). The analysis uses data on public elementary schools in North Carolina over a period of 15 years (1997-2011) to study these questions. I study sorting on a teacher's "value-added", which defines a teacher's quality as his/her ability to consistently improve a classroom's test performance.1 Teacher quality is difficult to observe and difficult to proxy with observable characteristics. For example, a teacher's education or experience hardly predicts differences in test score gains between classrooms (Rivkin et al., 2005, Rockoff, 2004), a finding that I replicate in the present study. This is because classroom interactions are complex and require specific skills that are difficult to measure.
Most data sets, however, do not contain detailed information on classroom interactions.2 Teacher value-added, by contrast, recovers a teacher's quality based on longitudinal (panel) data. A teacher's performance is measured in various contexts, i.e. in different classrooms and years, and ideally in different schools. To account for differences in contexts across teachers, one has to net out the effects of student composition and the characteristics of the school. Value-added is then defined as the average classroom performance gain that a teacher can achieve, independent of the contexts. In recent work, Chetty et al. (2014a,b) show that teacher value-added not only predicts a teacher's classroom performance well, but also a teacher's ability to persistently boost students' labor market earnings later in life. To separately identify teacher quality and school quality, the present paper relies on teacher switching

1 So far at least three papers study sorting on value-added: Sass et al. (2012) study the sorting of teachers with higher value-added into low-poverty schools. Jackson (2013) studies match effects between schools and teachers, based on teacher value-added and unobserved school quality. Both papers use data on elementary schools in North Carolina. The present paper is closest to work by Mansfield (2015), who focuses on teacher sorting based on value-added in high schools in North Carolina. He concludes that sorting of teachers across and within schools is negligible at the high school level.
2 An exception for the US is the Measures of Effective Teaching Study, see, for example, Kane et al. (2013).


across schools within a set of schools that are connected through teacher switches.3 In a panel setting, teacher quality is recovered from a teacher fixed effect (i.e., the permanent component of teacher quality), and school quality is recovered from a school fixed effect (i.e., the permanent component of school quality). The method implemented in this paper uses the identification results established by Abowd et al. (1999) in the context of worker-firm sorting (henceforth the "AKM" framework). The teacher-classroom sorting problem differs from the worker-firm sorting problem in at least two ways: First, in the education setting, the sorting of students to schools and classrooms adds an additional layer to the problem. Thus, I incorporate student sorting on observed characteristics (baseline test scores, socioeconomic background) to both schools and teachers into the model. I allow teachers to sort on school quality as well as student quality, and I allow students to sort on teacher quality as well as school quality. Second, I use test score gains instead of wages as a performance measure, as teacher pay usually does not reflect performance differences. In contrast to wages, test score gains are a more immediate and precise measure of productivity (Jackson, 2013). To understand the contribution of sorting to education inequality, this paper uses the model estimates to perform a variance decomposition, as in the original AKM framework. This decomposition determines the relative importance of sorting along five dimensions: observable teacher characteristics (experience), unobservable teacher quality (value-added), unobservable school quality (school fixed effects), observable school characteristics (student composition in terms of prior achievement and socio-economic characteristics), and observable student characteristics (prior achievement and socio-economic characteristics).
The analysis is based on administrative data from the universe of public schools in North Carolina, for elementary school children (grades 3-5). The data cover a period of 15 years (1997-2011) and contain about 1,400 schools, and 33,000 teachers. As outcomes, I study the standardized end-of-year test scores in math and reading. In addition, the data contain a rich set of teacher, school, and student characteristics.4 The four main findings are as follows: First, teacher sorting both within and across schools contributes to inequality in test scores. Overall, higher-quality teachers are observed in classrooms with better-prepared students. The sorting of teachers to better schools and classrooms explains in total 5 percent of the variance in test score outcomes across classrooms. The sorting within schools proves almost as important as the sorting across schools. Second, to provide an alternative measure of sorting, I compare the outcome distribution under the current teacher assignment scheme to the distribution under a counter-factual assignment scheme that randomizes teachers within and across schools. If teachers were randomly allocated 3 Several papers in the economics of education literature exploit teacher switches across schools to separately identify teacher and school effects; see, for example, Jackson (2013), Bacher-Hicks et al. (2014), Chetty et al. (2014a), Mansfield (2015), Rothstein (2015). 4 Similar data sets have been constructed based on the NC data by Jackson (2013) to study match effects between teachers and schools, and by Rothstein (2015) to study biases in value-added estimates.


within schools—but not across schools—the gap between the classroom at the top decile of the test score distribution and the bottom decile of the test score distribution would shrink by 5-7 percent. If teachers were randomly assigned both within and across schools, the performance gap would narrow by 11-16 percent. Third, teachers do not sort on the unobserved dimension of school quality. If anything, teachers are observed in schools that are slightly worse in terms of unobserved quality. Fourth, students positively sort on unobserved school quality. The student-school sorting explains about 10-12 percent of the test score variance. In addition, I provide evidence on the heterogeneity of sorting across geographic areas (urban, suburban, rural). Student sorting to schools is particularly relevant in large cities, suburbs, and towns. By contrast, teacher sorting is particularly relevant in rural areas and mid-size cities. In these areas, students may be more restricted in their school choice, but teacher sorting still proves important. In sum, changes in teacher assignment policies may be least effective in large cities, where test score outcomes are low, and more effective in rural areas. In big cities, school principals and district officials may focus on student sorting rather than teacher sorting in order to address inequalities in education outcomes. This paper contributes in three respects to the controversial debate on the impact of access to high-quality teachers on inequality in test scores. First, this paper provides a comprehensive study on the relationship between sorting and inequality, considering three types of sorting in a single setting: the sorting of (1) students to schools, (2) teachers to schools, and (3) teachers to classrooms.
Second, based on a simulation of counter-factual assignment schemes, this paper establishes that teacher-classroom sorting explains an economically significant portion of the performance gap between classrooms at the elementary school level. Third, the paper provides evidence on the heterogeneity of sorting patterns across regions, thus emphasizing the importance of focusing on policies that are suitable given the features of the region under consideration. The paper proceeds as follows. Section 2 presents a stylized overview of the sorting problem as well as the institutional background for teacher assignment in North Carolina. Section 3 outlines the empirical approach, and Section 4 presents the data sets and variables. Section 5 describes and discusses the results. Section 6 concludes.
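The counterfactual exercise described above can be illustrated with a stylized simulation: hold classroom composition fixed, randomly permute teacher effects across classrooms, and compare the gap between the 90th and 10th percentiles of classroom scores. All variable names and magnitudes below are illustrative assumptions, not quantities estimated in the paper:

```python
import numpy as np

rng = np.random.default_rng(42)

n_class = 50_000
# Stylized classroom-level effects: classroom (student) quality and teacher
# value-added load on a common factor, which induces assortative matching.
common = rng.normal(size=n_class)
classroom_quality = 0.5 * common + rng.normal(scale=0.8, size=n_class)
teacher_va = 0.2 * common + rng.normal(scale=0.2, size=n_class)

def p90_p10_gap(scores):
    """Gap between the 90th and 10th percentiles of classroom scores."""
    return np.percentile(scores, 90) - np.percentile(scores, 10)

# Observed allocation vs. a random reallocation of teachers to classrooms:
observed = classroom_quality + teacher_va
counterfactual = classroom_quality + rng.permutation(teacher_va)

gap_observed = p90_p10_gap(observed)
gap_random = p90_p10_gap(counterfactual)
sorting_contribution = 1 - gap_random / gap_observed
```

Under positive sorting, `gap_random` falls short of `gap_observed`; the paper's analogous exercise, carried out with estimated rather than simulated effects, underlies the 11-16 percent figure quoted above.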

2 The sorting problem

2.1 General setup

This section presents a stylized overview of the sorting problem and provides information on the institutional background of the analysis. I consider the sorting of teachers to schools, classrooms, and students in public


elementary schools in North Carolina. In the data set, an observation is a match between student i, teacher j, and school s in a given year t; for this observation, I observe the match output yijst = m(i, j, s, t), i.e., the end-of-year test scores in math and reading for students in grades 3-5. The remainder of the paper studies how sorting can explain the variation in the match output across students. In particular, I study whether non-random sorting ("assortative matching") across students, schools, and teachers exists, and whether assortative matching contributes to differences in the match output across students. The sorting problem is characterized by two different sorting markets that interact. One of the markets regulates or coordinates the sorting of students (a) to schools and (b) to classrooms/teachers. The second market regulates or coordinates the sorting of teachers (a) to schools and (b) to classrooms. Both sorting markets are highly regulated by public institutions, i.e. by school districts as well as by the state (see also Section 2.2). Consider first teacher sorting, which proceeds in two steps. In the first step, teachers form a job match with a school. Teachers can be new teachers as well as incoming teachers from a different school. Teacher turnover amounts to 6 percent per year in the data, i.e. job-to-job switches occur frequently. In the second step, school principals assign teachers to classrooms at the beginning of each school year. Consider now student sorting, which proceeds in three steps. In the first step, students sort to schools. In North Carolina, a student's residential location largely determines his/her school choice, but students can opt out of the public school system and choose private schools, charter schools, or home schooling instead. Moreover, some districts allow students to opt out of their assigned public school to choose a different public school (Bifulco and Ladd, 2007, Bifulco et al., 2009, Jackson, 2009).
In the second step, principals assign students to classrooms, and in the third step, the principals match each classroom with a teacher. Each teacher teaches exactly one self-contained classroom. The sorting of teachers to schools and classrooms as well as the sorting of students to schools and classrooms interfere with each other. A teacher may choose a school based on the expected composition of the student body as well as based on the expected composition of his classroom (Boyd et al., 2013). Similarly, families may choose their residential location or whether to opt out of the public school system based on the composition of the teacher work force in the desired school or school district. Moreover, parents may lobby in order to place their child with a certain teacher after enrollment. Thus, each agent's choices depend on the choices of the other agents, which generates a complicated strategic interaction between all agents. The empirical model in this paper does not study the strategic interaction itself, but regards the observed match as an equilibrium outcome of this interaction. In other words, the empirical model describes the current equilibrium at the time of the data collection. Moreover, from the model estimates, I derive the impact of changes in sorting behavior in a partial equilibrium (i.e. changes in teacher sorting, holding student sorting


constant). This paper, however, does not analyze the choices in a general equilibrium framework.

2.2 Teacher sorting: Institutional context and expected sorting patterns

Why do we expect assortative matching of teachers to schools and classrooms, and which economic mechanisms and institutions drive the sorting? The amount of sorting depends on the characteristics of the teacher labor market and of the public school system. The teacher labor market in North Carolina differs from traditional labor markets as the state restricts the wage setting of the schools (Clotfelter et al., 2011). In traditional labor markets, differential wages between jobs drive the sorting of workers to firms and jobs within these firms. If better workers have a comparative advantage in higher-paying jobs—in other words, if firm quality and worker quality display production complementarities—economic models predict assortative matching (Shimer and Smith, 2000, Lopes de Melo, 2016). In the teacher labor market in North Carolina, however, the wage setting is highly regulated. A state-wide pay schedule determines teacher pay based on experience and education (Clotfelter et al., 2011). School districts can pay a supplement to the state salary, but this supplement can only vary by experience and education. The schools cannot adjust teacher pay individually, neither within schools, nor at the school level. In a setting where wage differentiation is limited, teachers sort based on the observed characteristics of the students that they teach (Clotfelter et al., 2011, Boyd et al., 2013). In particular, teachers tend to favor schools and classrooms with higher-ability students; moreover, teachers have preferences over other characteristics such as the racial or socio-economic composition of the school or classroom. Two mechanisms may drive the sorting of better teachers to classrooms with high-ability students or lower shares of poor students (Dieterle et al., 2015). First, classrooms and schools with high-ability students or students from low-poverty backgrounds may be easier or more rewarding to teach. As schools favor high-quality over low-quality teachers, high-quality teachers have greater bargaining power to teach those more rewarding classrooms, compared to low-quality teachers. Thus, principals may use classroom assignments in order to reward or retain good teachers. Second, complementarities between classroom and teacher quality can drive the sorting of teachers in equilibrium. For example, better teachers may have a comparative advantage in teaching better classrooms. Thus, principals who care about the efficiency of the allocation (i.e., about aggregate test scores) will assign better teachers to better classrooms. The sign of the complementarity, however, is a priori unknown. High-quality teachers may for example have a comparative advantage in teaching low-quality classrooms, or even mid-quality classrooms. The above mechanisms contribute to the assortative matching of teachers, schools, and classrooms, but


the sorting is unlikely to turn out as perfect positive assortative matching or perfect negative assortative matching, for a number of reasons. First, within schools, principals may want to reduce the inequality in student test scores. In this case, they may assign better teachers to disadvantaged students, and therefore counteract positive assortative matching, even at the expense of losing the best teachers. Second, teachers' preferences are heterogeneous, i.e. teachers differ with respect to the classrooms that they prefer to teach. While some teachers prefer students that are easy to teach, others care about disadvantaged students and therefore prefer to teach classrooms with low-ability or poor students. Third, location constraints, search costs, and job-switching costs add additional frictions, which have been well studied in the labor literature (Mortensen, 1986, Mortensen and Pissarides, 1999). For example, a good teacher may prefer to teach in a high-quality school, but may not be willing to incur high costs of commuting or relocation (Boyd et al., 2013). In summary, the amount of teacher sorting that will be observed in the data depends on a variety of channels and their complex interaction; ultimately, the amount and direction of the sorting is an open empirical question. Therefore, the next section turns to the empirical model, which quantifies the amount of sorting and its relation to test score inequality across classrooms.

3 Empirical approach

3.1 Model

I consider an educational production function, where the output variables are end-of-grade test scores of students in grades 3-5 in math and reading. This output depends on student inputs, teacher inputs, and school inputs:

\[
y_{ijst} = \underbrace{X_{it}'\beta}_{\text{student quality}}
+ \underbrace{\mu_{J(i,t)} + V_{J(i,t)t}'\gamma}_{\text{teacher quality}}
+ \underbrace{\alpha_{S(i,t)} + W_{S(i,t)t}'\rho}_{\text{school quality}}
+ \underbrace{D_t'\delta + G_t'\xi}_{\text{grade and year dummies}}
+ \epsilon_{ijst}, \quad (1)
\]

yijst is the test score of student i at time t in school s who is taught by teacher j. Three types of inputs enter into the production of test scores in an additively separable way. First, the outcome depends on the quality of the student, which is modeled as a function of ability5 and of socio-economic and demographic characteristics, Xit.6 Second, the model includes teacher quality, which has both a time-invariant component, also known as "value-added", μJ(i,t), and a time-varying component, measured through teacher experience, VJ(i,t)t. Third, independent of teacher quality, the school provides several inputs such as the school administration and facilities. The time-invariant component of school quality is denoted as αS(i,t). Moreover, the school quality may vary with the composition of the student body, WS(i,t)t.7 Finally, student output can vary across years, which is captured by year dummies, Dt, as well as across grades, which is captured by grade dummies, Gt. εijst is an idiosyncratic error term. The error term consists of three different components:

5 Variables include grade-3 pretest scores and classification as gifted, learning disadvantaged, or English language learner.
6 The variables include gender, race, age, free/reduced-price lunch eligibility, parental education, gifted or learning-disadvantaged status, and limited English proficiency.

\[
\epsilon_{ijst} = \phi_{S(i,t)t} + \nu_{J(i,t)t} + e_{ijst}, \quad (2)
\]

where φS(i,t)t captures shocks to the impact of school quality (e.g., through a change of the school principal), νJ(i,t)t captures shocks to teacher value-added (e.g., through changes in the teacher's health), and eijst captures all other shocks to student outcomes (e.g., shocks to parental inputs). This paper quantifies the contribution of teacher quality, school quality, and assortative matching between teachers, schools, and students to the dispersion in test scores across students; to this end, I decompose the test score variance into several components. In a first step, I write the variance as:

\[
\mathrm{Var}(y_{ijst}) = \mathrm{Var}\Big(
\underbrace{X_{it}'\beta}_{\text{(1) student quality}}
+ \underbrace{\underbrace{\mu_{J(i,t)}}_{\text{(2) t. value-added}}
+ \underbrace{V_{J(i,t)t}'\gamma}_{\text{(3) t. experience}}}_{\text{teacher quality}}
+ \underbrace{\underbrace{\alpha_{S(i,t)}}_{\text{(4) school effects}}
+ \underbrace{W_{S(i,t)t}'\rho}_{\text{(5) school composition}}}_{\text{school quality}}
+ \underbrace{D_t'\delta + G_t'\xi}_{\text{(6) year/grade controls}}
+ \underbrace{\epsilon_{ijst}}_{\text{(7) error term}}
\Big). \quad (3)
\]

The following stylized example of a production function illustrates the variance decomposition and its interpretation. Consider a random variable Y , here, student test scores. Assume further that test scores are solely determined by two other random variables, student quality X1 and teacher quality X2 , and that both inputs are additively separable, such that Y = X1 + X2 . Then, we can decompose the test score variance as

\[
\mathrm{Var}(Y) = \mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2). \quad (4)
\]

The first two terms capture the variances in student and teacher quality. The last term, 2Cov(X1, X2), captures the amount of sorting and its influence on the outcome. For example, if Cov(X1, X2) = 0, then teacher and student inputs are independent of each other, and sorting does not contribute to differences in test scores across students. One way to achieve a zero covariance is, for example, to assign students randomly to teachers. Notice, however, that random assignment is not a necessary condition for a zero covariance. For example, if teachers and students sort along dimensions that are independent of their quality, the sorting is non-random, but still does not contribute to differences in test scores, at least not in this simple model. If Cov(X1, X2) > 0, then positive assortative matching exists, and sorting exacerbates the test score differences across students. This is for example the case if better teachers are systematically assigned to better schools. If Cov(X1, X2) < 0, then negative assortative matching exists, and sorting reduces the gap between high- and low-quality students. This would for example be the case if weak students are systematically assigned to high-quality teachers.

7 Here, I consider the fraction of students with free/reduced-price lunch and the racial composition.

In Model 1, the sorting processes are more complex. In particular, this paper considers sorting along d = 5 dimensions (see Equation 3): (1) student quality, (2) teacher value-added (time-constant), (3) teacher experience (time-varying), (4) school effects (time-constant), and (5) school composition. The decomposition thus includes d = 5 variance terms and d(d − 1)/2 = 10 covariance terms. The appendix contains a full description of all covariance terms under consideration (Table A.1). In particular, this paper investigates the sorting of better teachers to better schools and students; for example, the following term captures the sorting of teachers with high value-added to high-achieving students:

\[
\mathrm{Cov}(\mu_{J(i,t)},\, X_{it}'\beta) \quad (5)
\]

Moreover, the decomposition is informative about the sorting of students to schools. For example, the following term captures the sorting of high-achieving students to high-quality schools:

\[
\mathrm{Cov}(\alpha_{S(i,t)},\, X_{it}'\beta) \quad (6)
\]

After decomposing the variance into its parts, one can compare the different sorting processes and express the variance contributions as a fraction of the total variance. For example, the portion of the variance in test scores that can be explained by the sorting of better teachers to better classrooms is:

\[
\frac{\mathrm{Cov}(\mu_{J(i,t)},\, X_{it}'\beta)}{\mathrm{Var}(y_{ijst})} \quad (7)
\]
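As an illustration of this bookkeeping, the decomposition and a sorting share can be computed on simulated data. A minimal sketch; the factor structure and all magnitudes are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200_000
# Student quality (the X'beta analogue) and teacher value-added (the mu
# analogue) share a common factor, which creates positive sorting.
common = rng.normal(size=n)
student_q = 0.6 * common + rng.normal(size=n)
teacher_va = 0.3 * common + rng.normal(scale=0.5, size=n)
y = student_q + teacher_va

def cov(a, b):
    """Population covariance (ddof=0), so the identity holds exactly."""
    return ((a - a.mean()) * (b - b.mean())).mean()

# Var(y) = Var(X1) + Var(X2) + 2 Cov(X1, X2), as in Equation (4).
total = y.var()
parts = student_q.var() + teacher_va.var() + 2 * cov(student_q, teacher_va)
assert abs(total - parts) < 1e-9

# Sorting share in the spirit of Equation (7): a covariance term over Var(y).
sorting_share = cov(student_q, teacher_va) / total
```

Note that each pair of inputs contributes twice its covariance to the total variance; whether the reported share carries the factor of two is a bookkeeping convention that must be stated alongside the estimate.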

Thus, the model is informative about the sorting that is present in the given education setting. The analysis can give researchers and policy makers guidance on which sources of sorting are most important. For example, are school and teacher resources relatively balanced across students? Moreover, the model allows one to compare these patterns across time and space. This paper considers heterogeneity in the sorting


patterns across different geographical areas (e.g., urban and rural areas).

3.2 Identification

The above model is a version of the worker-firm sorting model by Abowd et al. (1999) (also abbreviated as the AKM model). In their model, the authors study the sorting of workers to firms/establishments along unobservable dimensions. This is similar to the setting in this paper, where the schools are the firms and the teachers are the workers. The AKM model and Model 1 differ mainly in that Model 1 measures the outcome at the student rather than at the worker level. The sorting of students to schools and teachers adds an additional layer to the model. In order to discuss the assumptions for the model to be identified, I follow Abowd et al. (1999) and Card et al. (2013) and rewrite the model in matrix notation. N* denotes the number of student-year observations.8 The assignment can be represented by introducing J teacher indicator variables and S school indicator variables. These indicator variables are set to 1 if the student is assigned to the respective teacher/school, and 0 otherwise. Each student can only be assigned to one school and teacher in a given year. Thus, Model 1 is represented as:

\[
y = \underbrace{X\beta}_{\text{student quality}}
+ \underbrace{H\mu + V\gamma}_{\text{teacher quality}}
+ \underbrace{F\alpha + W\rho}_{\text{school quality}}
+ \underbrace{D\delta + G\xi}_{\text{year and grade dummies}}
+ \epsilon, \quad (8)
\]

where y is an N* × 1 vector of test scores, H ≡ [h1, ..., hJ] is an N* × J matrix of teacher indicators, and F ≡ [f1, ..., fS] is an N* × S matrix of school indicators. Moreover, the model contains a matrix of student-level controls X ≡ [x1, ..., xK], a matrix of indicators for teachers' experience levels V ≡ [v1, ..., vL],9 a matrix of school composition variables W ≡ [w1, ..., wM], a matrix of time dummies D ≡ [d1, ..., dT], and a matrix of grade dummies G ≡ [g1, ..., gP]. The matrices for the dummy variables are always defined up to one reference category. A first set of assumptions ensures that teacher and school effects can be separately identified. First, schools must be observed multiple times during the sample period, and at least some of the teachers must be observed in different schools. This implies that some of the teachers must switch schools. To illustrate the importance of this assumption, suppose that each school hires a fixed set of teachers in the first time period and remains with these teachers throughout the whole sample period. Then, the school effect would be completely absorbed by the teacher effects. Second, the identification requires a set of schools that is connected through teacher switches during the

8 Notice that each student is present in the data set for at most three years.
9 Experience levels are broken up into bins so that they can be identified separately from year fixed effects.

WORKING PAPER SERIES N.99 - AUGUST 2017

12

sample period. This is because the school effects can only be interpreted relative to a reference school. In a connected set of schools, the school effects are identified up to one reference school. The number of reference schools equals the number of connected sets; if several connected sets are pooled in the analysis, one cannot directly compare the school effects without further assumptions. Given the seven matrices on the right hand side of Equation 8, one can express the identifying assumptions in terms of eight orthogonality conditions:

\[
\begin{gathered}
E[x^{k\prime}\epsilon] = 0 \;\; \forall k, \qquad E[d^{t\prime}\epsilon] = 0 \;\; \forall t, \qquad E[g^{p\prime}\epsilon] = 0 \;\; \forall p, \\
E[h^{j\prime}\epsilon] = 0 \;\; \forall j, \qquad E[v^{l\prime}\epsilon] = 0 \;\; \forall l, \qquad E[f^{s\prime}\epsilon] = 0 \;\; \forall s, \qquad E[w^{m\prime}\epsilon] = 0 \;\; \forall m. \tag{9}
\end{gathered}
\]

Three of the orthogonality conditions concern the set of student background characteristics as well as time and grade dummies.

\[
E[x^{k\prime}\epsilon] = 0 \;\; \forall k, \qquad E[d^{t\prime}\epsilon] = 0 \;\; \forall t, \qquad E[g^{p\prime}\epsilon] = 0 \;\; \forall p. \tag{10}
\]

As is standard in the literature, I assume that these variables are pre-determined. Two further assumptions concern teacher assignment:

\[
E[h^{j\prime}\epsilon] = 0 \;\; \forall j, \tag{11}
\]
\[
E[v^{l\prime}\epsilon] = 0 \;\; \forall l. \tag{12}
\]

As discussed and proven by Card et al. (2013) in the worker-firm context, a sufficient condition for Equation 11 to hold is that teacher assignments are independent of contemporaneous, past, and future shocks to student achievement, teacher quality, and school value-added, conditional on all other variables in the model (strict exogeneity condition). In the formulation of Model 1, this implies that

\[
\begin{aligned}
P[J(i,t) = j \mid \epsilon, \alpha_{S(i,t)}, \mu_{J(i,t)}, X_{it}, V_{J(i,t)t}, W_{S(i,t)t}, D_t, G_t] \qquad\qquad \\
= P[J(i,t) = j \mid \alpha_{S(i,t)}, \mu_{J(i,t)}, X_{it}, V_{J(i,t)t}, W_{S(i,t)t}, D_t, G_t] \quad \forall\, i, j, t, s. \tag{13}
\end{aligned}
\]

Thus, the probability of being assigned to a certain teacher is assumed to be independent of contemporaneous, past, or future shocks to teacher quality, school quality, or any other idiosyncratic shocks to the outcome. This assumption is not violated if better students are matched with better teachers: as long as the quality of students is sufficiently captured (e.g., through prior test scores and other socio-demographic characteristics), such sorting does not bias the model estimates. There are at least four further concerns about the strict exogeneity of teacher assignment (Equation 13). First, the assumption is violated if teachers sort into schools based on anticipated shocks to school inputs. For example, imagine a teacher who switches to a school that he anticipates will improve after the switch (for reasons other than the quality of the other teachers at the school). In this case, the quality increase would be attributed to the teacher fixed effect rather than to the school fixed effect, so that the teacher fixed effect would be upward biased. Second, the assumption is violated if teacher effort displays an “Ashenfelter” dip (or spike), i.e., if teachers slack off once they know that they will change their workplace, or spend disproportionately high effort once they start working in a new school. In that case, one would wrongly attribute the adjustments in effort to the quality of the school, biasing some of the school fixed effects upwards and some of them downwards. If this bias is systematically related to actual school quality (e.g., teachers tend to increase their effort more strongly when changing to a higher-quality school), one would overstate the importance of school fixed effects. Jackson (2013) proposes a test for this assumption by introducing indicator variables for the number of years before and after a teacher changes his job. Using this model, he documents that this problem is negligible in North Carolina elementary schools (see p. 1102 ff.). Third, the assumption can be violated if there is dynamic tracking or sorting of students. For example, teachers who are going to leave a school may be systematically assigned to worse students, and teachers who are new to a school may be systematically assigned to better students. In this case, one would attribute the effect of student composition to the permanent teacher effect, which could bias the teacher effect estimates. Jackson (2013) tests this assumption by predicting teacher moves (whether a teacher is outgoing or incoming) based on student test scores; he finds no clear evidence that dynamic tracking biases teacher and school fixed effects in the North Carolina setting. Fourth, the assumption may be violated if match effects or “complementarities” between teachers and schools occur. Specifically, teachers who switch may be more productive in their new school (“destination school”) than in their old school (“origin school”), as they learn about their school-specific productivity over time. If match effects exist, one can no longer separately identify the school fixed effects from the teacher effects. Jackson (2013) argues that match effects may exist in the North Carolina data. I argue, however, that the additively separable representation is still suitable to measure the contribution of teacher sorting. In an event study (Section 5.2), I show that the test score gains of a teacher who switches from a low- to a high-performing school are about as large as the test score losses of a teacher who switches from a high- to a low-performing school. If switching were associated with efficiency gains because of improvements in match quality, by contrast, the losses for the high-to-low switchers would be smaller than the gains for the low-to-high switchers. Thus, the event study provides suggestive evidence that sorting on productivity gains does not invalidate the analysis. Once the strict exogeneity of teacher assignment is established, it is natural to assume that the experience level of the assigned teacher is pre-determined, so that Equation 12 holds. Finally, the last two orthogonality conditions relate to the assignment of students to schools:

\[
E[f^{s\prime}\epsilon] = 0 \;\; \forall s, \tag{14}
\]
\[
E[w^{m\prime}\epsilon] = 0 \;\; \forall m. \tag{15}
\]

In order for Equation 14 to hold, school assignment has to fulfill the following strict exogeneity condition:

\[
\begin{aligned}
P[S(i,t) = s \mid \epsilon, \alpha_{S(i,t)}, \mu_{J(i,t)}, X_{it}, V_{J(i,t)t}, W_{S(i,t)t}, D_t, G_t] \qquad\qquad \\
= P[S(i,t) = s \mid \alpha_{S(i,t)}, \mu_{J(i,t)}, X_{it}, V_{J(i,t)t}, W_{S(i,t)t}, D_t, G_t] \quad \forall\, i, j, s, t. \tag{16}
\end{aligned}
\]

This assumption states that the probability distribution over school choices for a student in a given year is independent of past, present, or future shocks to teacher value-added, school value-added, or other idiosyncratic shocks to student outcomes, conditional on all other variables included in the model. The assumption of exogeneity of school choice is, for example, violated if students sort into schools based on anticipated shocks to school inputs (such as a change of principal, or changes in school infrastructure), or if schools adjust their resources in response to the composition of the student body. If this were the case, then test score growth would be attributed to the school fixed effect, when in reality it is driven by unobserved components of the composition of the student body (see also the discussion by Mansfield, 2015). Model 1 includes a set of control variables at the student (classroom) as well as the school level. Thus, I control for the most important observable dimensions of student sorting, in particular, sorting on ability (prior test scores) and socio-economic status. The assumption is thus satisfied as long as sorting does not occur on unobservable dimensions that are not reflected in prior test scores (such as motivation).10

10 A detailed discussion of potential violations of such an exogenous mobility assumption is also provided by Kramarz et al. (2015).

Once the strict exogeneity of school assignment is established, one can assume that the student composition in terms of observable characteristics is also pre-determined, so that Equation 15 holds.

3.3 Estimation

The variation of interest that the model exploits—i.e., the variation in school and teacher quality—occurs at the classroom level. Each student has exactly one classroom teacher in a given year. As student characteristics enter the model in a linear and additively separable way, no variation or information is lost by aggregating the student characteristics and outcomes at the classroom level.11 Estimating the model at the classroom level simplifies and speeds up the computation considerably, as it reduces the dimension of the matrix that needs to be inverted by a factor of about 20 (the average class size is 21). Thus, I estimate the following classroom-level model:

\[
\bar{y}_{cjst} = \alpha_s + \mu_j + \bar{X}'_{ct}\beta + V'_{jt}\lambda + W'_{st}\rho + D'_t\delta + G'_t\xi + \epsilon_{cjst}, \tag{17}
\]

where $c$ denotes the classroom. The outcome under study, $\bar{y}_{cjst}$, is the average math or reading test score of students in classroom $c$, which is taught by teacher $j$ in school $s$ at time $t$. The specification includes student control variables at the classroom level (average grade-3 pretest scores in math and reading, their squares, fraction of students with missing pretest scores, fraction female, fraction with free or reduced-price lunch, fraction with missing information on free or reduced-price lunch, fraction white students, fraction with parents whose highest degree is a high school degree or less, fraction with missing information on parental education, average age, fraction with limited English proficiency, fraction gifted students, fraction of learning-disadvantaged students), time-varying teacher characteristics (experience, categorized as: no experience, 1-2 years, 3-5 years, or 6-11 years of experience; 12 and more years of experience is the reference category), time-varying school characteristics (fraction of students with free or reduced-price lunch, fraction white students), as well as class size, year dummies, and grade dummies. I estimate the model using OLS and compute the variance decomposition based on the estimated fixed effects and coefficients. The estimation is restricted to the largest available connected set of schools, which covers about 85% of schools. In order to extract the largest connected set, I use the bgl toolbox in Matlab.

11 The only difference between the estimation at the classroom level and at the individual level is the weighting scheme. If class sizes were very unbalanced in the data, the aggregation at the classroom level could deliver misleading conclusions about the actual impact of teacher assignment. North Carolina, however, has strict guidelines on class sizes (the maximum class size is 30), and class sizes are evenly distributed in the data. Moreover, I exclude classrooms with more than 30 and fewer than 10 students. In addition, I run representative specifications at the student level, and the results differ only slightly from the classroom-level estimations (results not shown).
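The connected-set step can be reproduced with any connected-components routine over the teacher-school graph. The sketch below is a hypothetical Python stand-in for the Matlab bgl step, using union-find over an assumed flat list of (teacher, school) observations; the data layout is illustrative, not the paper's.

```python
from collections import defaultdict

def largest_connected_set(assignments):
    """assignments: iterable of (teacher, school) observations.
    Returns the largest set of schools linked through teachers who
    are observed in more than one school (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    schools_of = defaultdict(set)
    for teacher, school in assignments:
        schools_of[teacher].add(school)
        find(school)  # register schools that never see a mover, too
    for schools in schools_of.values():
        first, *rest = schools
        for s in rest:
            ra, rb = find(first), find(s)
            if ra != rb:
                parent[ra] = rb  # a switching teacher links the two schools

    components = defaultdict(set)
    for school in parent:
        components[find(school)].add(school)
    return max(components.values(), key=len)
```

On a toy sample [(1, 'A'), (1, 'B'), (2, 'B'), (2, 'C'), (3, 'D')], teacher 1 links schools A and B and teacher 2 links B and C, so the largest connected set is {'A', 'B', 'C'}; school D, with no movers, forms its own singleton set.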


Addressing limited mobility bias. The identification of both teacher and school fixed effects relies on the movement of teachers across schools. As Abowd et al. (1999) and Andrews et al. (2008) point out, the precision of the estimated teacher and school effects depends on the number of moves across schools. Intuitively, suppose that a teacher moves only once, and he moves across schools of the same quality. Nevertheless, the teacher's classroom outcomes might differ across the two schools because he may draw a bad classroom in one school and a good classroom in the other school. If we based our estimates of school quality on just this one mover, our school effects for these two schools would be biased. As the number of moves per school grows, or, alternatively, as the number of years that a teacher spends in each of his/her schools grows, the school effect can be recovered more precisely. Moreover, as the teacher effect for the stayers is computed based on performance after subtracting the school effect, biases in school effects may induce negative correlations between school and teacher effects. To test whether such biases are a concern given the sample sizes and moving behavior in this application, I follow two strategies. First, based on Andrews et al. (2008), I compute school fixed effects only for "high-turnover schools". In the current application, I define a high-turnover school as a school with at least ten moves during the sample period. The moves can be either in-moves or out-moves, but I require that the data contain the teacher's performance in both the origin and the destination school. I pool all remaining schools, i.e., schools with fewer than ten moves, into the reference category. Moreover, I assess in a Monte Carlo experiment whether this strategy reduces biases in the estimation, compared to less restrictive definitions of high-turnover schools: (1) schools with at least one mover, and (2) schools with at least five movers. In addition, I present results for a restricted sample that contains only the teachers who move at least once during the sample period.

Monte Carlo experiment.

I use a Monte Carlo experiment to assess whether the estimation recovers the true variances and covariances of teacher and school fixed effects, given the moving patterns in the data. As suggested by Abowd et al. (2004), I use an experiment that preserves the moving behavior observed in the data. I proceed as follows: First, I run Model 17 and save all coefficients on the observed characteristics; I use these to construct a fitted outcome net of school and teacher effects. Second, I remove the actual outcome (as observed in the data) from the data. Third, I randomly and independently draw teacher and school effects. The draws come from distributions that resemble the distributions of teacher and school effects as estimated in the first step (see the notes of Table A.2 for details). Similarly, I draw idiosyncratic error terms (again, see the notes of Table A.2). Fourth, I construct a simulated outcome based on the fitted outcome from the first step, the simulated teacher and school effects, and the simulated error terms. I then run Model 17 again, but now substitute the actual outcome with

the simulated outcome. Finally, I compute the variance-covariance matrix of the teacher and school effects from the regression output. I repeat the procedure 100 times, for each definition of high-turnover schools (more than one mover, more than five movers, more than ten movers). Moreover, as suggested by Andrews et al. (2008), restricting the sample to movers only may reduce the bias in the teacher fixed effects. Therefore, I also compute the results separately for those teachers who move at least once in the sample period. Table A.2 presents the results of the Monte Carlo experiment. Column (1) presents the median values of the variances in teacher and school effects, as well as the median of their covariances, based on the distribution of 100 "true" (i.e., simulated) teacher and school effects. Column (2) presents the medians from running Model 17 with the simulated outcome as the dependent variable. Column (3) shows the median of the differences in the variances and covariances from the 100 replications. Column (6) reports the mean relative bias. Panels I-III show the results for the different definitions of high-turnover schools, and Panel IV shows the results for movers only. In summary, the estimation generates inflated variances of both teacher and school effects, as well as negative biases in the covariance between teacher and school effects. Considering only high-turnover schools (10 or more moves) reduces the biases but does not fully remove them. For example, the variance in teacher effects is upward biased by 39 percent in Panel I, and still upward biased by 33 percent in Panel III. Restricting the sample to movers only reduces the bias in the teacher effects to 18 percent. Similarly, the variance in school effects is upward biased by 39 percent in Panel I, and still by 17 percent in Panel III.
The bias in the covariances appears small overall, but can be reduced further by implementing a more restrictive definition of high-turnover schools.
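The mechanics of such an experiment can be sketched in a small toy version. Everything below is illustrative, not the paper's code: the panel sizes, effect scales, and mobility pattern are assumptions, and the re-estimation is a plain least-squares fit on teacher and school dummies. Even in this stylized setup, limited mobility inflates the estimated variance of the teacher effects.

```python
import numpy as np

def simulate_once(rng, n_teachers=30, n_schools=6, years=4,
                  sd_teacher=0.19, sd_school=0.12, sd_noise=0.30):
    """One replication: draw independent teacher and school effects,
    build classroom-level outcomes on a fixed assignment grid,
    re-estimate the effects by OLS, and compare variances."""
    # Fixed assignment grid: every teacher has a home school; teachers
    # 0-4 switch to the neighboring school halfway through, which keeps
    # all schools connected through movers.
    teacher_ids, school_ids = [], []
    for t in range(n_teachers):
        home = t % n_schools
        for yr in range(years):
            moved = t < 5 and yr >= years // 2
            teacher_ids.append(t)
            school_ids.append((home + 1) % n_schools if moved else home)
    teacher_ids = np.array(teacher_ids)
    school_ids = np.array(school_ids)

    # "True" effects, drawn independently as in the experiment above.
    mu = rng.normal(0.0, sd_teacher, n_teachers)
    alpha = rng.normal(0.0, sd_school, n_schools)
    y = (mu[teacher_ids] + alpha[school_ids]
         + rng.normal(0.0, sd_noise, len(teacher_ids)))

    # OLS on teacher dummies and school dummies (first school = reference).
    T = np.eye(n_teachers)[teacher_ids]
    S = np.eye(n_schools)[school_ids][:, 1:]
    beta, *_ = np.linalg.lstsq(np.hstack([T, S]), y, rcond=None)
    return np.var(mu), np.var(beta[:n_teachers])

rng = np.random.default_rng(0)
true_v, est_v = zip(*(simulate_once(rng) for _ in range(100)))
# With few movers and short panels, the median estimated variance of the
# teacher effects exceeds the median true variance (limited mobility bias).
```

The design choice mirrors the description in the text: the assignment grid is held fixed across replications, and only the effects and error terms are redrawn.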

4 Data

Data sets and variables. I use administrative records for the universe of school children in North Carolina elementary schools (grades 3-5) for the years 1997-2011. The data are provided under a restricted-use agreement by the North Carolina Education Research Data Center at Duke University. Based on randomized identifiers, one can link information on students, teachers, and schools, and track all three over time. In constructing the data set, I closely follow Jackson (2013) and Rothstein (2015). As the outcome, I use students' end-of-grade test scores in mathematics and reading. These test scores are based on statewide standardized tests. I standardize the test scores at the year-by-grade level. In addition to test scores, the data contain student background characteristics (age, ethnicity, parental education, eligibility for free or reduced-price lunch) and information on student preparedness (classified as gifted student, classified as academically disadvantaged, limited English proficiency). I extract prior-year test scores from end-of-grade test files and enrich this information, where missing, through the “masterbuild” files. These files contain information that schools report to the state. In particular, the masterbuild file contains pretest scores for grade-3 students, which are conducted before the students enter the elementary school level.12 The end-of-grade files contain an identifier for the test proctor, which in most cases identifies the classroom teacher. In order to exclude those proctors who were not the classroom teachers, I follow the procedure suggested by, for example, Rothstein (2015). This procedure uses the personnel file, which contains information on a teacher's grade level as well as whether the teacher taught a self-contained classroom in a given year. I exclude exams supervised by a teacher who was not listed as teaching a self-contained classroom in grades 3-5 in the given year. Based on this restriction, about 75% of teachers supervised their own classrooms; I exclude the 25% of classrooms whose proctors differed from the classroom teacher. Using the teacher identifiers, I add teachers' observable characteristics based on salary files. These files contain information on teacher qualifications (degree, the school and state where the degree was obtained, whether the teacher has a license, whether the teacher is certified) as well as a teacher's experience.

Panel balance. The final data set for estimation contains about 2.6 million student-year observations, 127,000 teacher-year observations (34,000 teachers), and 1,440 schools over a period of 15 years. The sample is not completely balanced with respect to teachers and schools (see Figure A.1). In particular, teachers enter and exit the data set during the time period. Each teacher is observed in the sample for 3.7 years on average. About 30 percent of teachers are observed in only one year. The data do not contain any information on the reasons why teachers drop out or are missing in certain years.13 By contrast, the panel of schools is rather balanced. On average, each school remains in the sample for about 11 years, and about 40 percent of schools participate in the sample for the entire period. The identification of both teacher and school fixed effects relies on the movement of teachers across schools, which is illustrated in Figure A.2. Intuitively, the more frequently a teacher moves, the more accurately his fixed effect can be recovered, independently of the school where he teaches. About 20 percent of teachers switch schools during the sample period. Conditional on moving, teachers move on average 1.19 times; about 80 percent of the movers switch schools only once.

12 Grade-3 pretests are missing in 2006 for math, in 2008 for reading, and in 2009-2011 for both math and reading. I therefore exclude grade-3 students from the analysis in 2006 and 2008-2011. In addition, some of the background variables may be reported in the end-of-grade file in some years and in the masterbuild file in other years. For detailed information, please contact the author.
13 The reasons may range from professional ones (e.g., changes to private schools, obtaining a degree) to personal ones: teachers may move to a different state or take a leave. This is especially likely as 95% of the teachers are women.
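Mover statistics of this kind can be computed directly from the teacher-year panel. A minimal sketch, assuming a hypothetical flat (teacher, year, school) layout:

```python
from collections import defaultdict

def mover_statistics(panel):
    """panel: list of (teacher, year, school) observations.
    Returns (share of teachers who ever switch schools,
             average number of switches among movers)."""
    history = defaultdict(list)
    for teacher, year, school in panel:
        history[teacher].append((year, school))
    switches = {}
    for teacher, obs in history.items():
        obs.sort()  # chronological order
        schools = [s for _, s in obs]
        # count year-to-year changes of school
        switches[teacher] = sum(a != b for a, b in zip(schools, schools[1:]))
    movers = [c for c in switches.values() if c > 0]
    share = len(movers) / len(switches)
    avg = sum(movers) / len(movers) if movers else 0.0
    return share, avg
```

For instance, with one stayer and one teacher who moves A to B and back, the function returns a mover share of 0.5 and an average of 2 switches per mover.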


Descriptive statistics. Table A.3 describes the composition of the student body in terms of student background characteristics. The majority of the students come from backgrounds with low levels of parental education and low socioeconomic status: 11 percent of the students' parents are high school dropouts, and 46 percent of parents obtained a high school degree but no further education. With 47 percent of students eligible for free or reduced-price lunch, the socio-economic status of the students is well below the federal average during the sample period.14 Consequently, the given setting provides a context where parents' ability to provide educational inputs and investments appears limited.

5 Results

5.1 Descriptive evidence of variance contributions

Table A.4 provides descriptive evidence on the relative contributions of within-classroom variation, between-classroom-within-school variation, and variation across schools to the test scores of students. The table presents a decomposition of the unconditional variances in test scores into these components. The largest part of the variation in test scores is within classrooms (77 percent in math and 81 percent in reading); by contrast, only 12 percent of the variation in math test scores and 10 percent of the variation in reading test scores is across classrooms, and 10 percent of the variation in both math and reading test scores is across schools. The possible sources of variation across schools include variation in teacher quality, variation in school inputs other than teacher quality, and student sorting across schools. The contributions of the different variance components can differ across geographic locations. For example, school choice may be restricted for students in areas where schools are spread out, such as rural areas, where choosing a certain school may entail high costs of commuting or moving. Indeed, as Table A.4 shows, the contribution of the between-school variance is highest in large cities (14 percent in both math and reading) and lowest in rural areas (8 percent in math and 7 percent in reading). By contrast, there are only small differences across geographic areas in the contribution of variation across classrooms within the same school to the overall variation in test scores.
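A decomposition of this kind follows from the law of total variance: with observation-weighted group means, the within-classroom, between-classroom-within-school, and between-school components sum exactly to the total variance. A minimal sketch, assuming a hypothetical flat (school, classroom, score) layout:

```python
from collections import defaultdict

def variance_decomposition(obs):
    """obs: list of (school, classroom, score) observations.
    Splits the total (population) variance of scores into
    within-classroom, between-classroom-within-school, and
    between-school components; the three parts sum to the total."""
    n = len(obs)
    grand = sum(score for *_, score in obs) / n
    by_class = defaultdict(list)
    by_school = defaultdict(list)
    for school, classroom, score in obs:
        by_class[(school, classroom)].append(score)
        by_school[school].append(score)
    cmean = {k: sum(v) / len(v) for k, v in by_class.items()}
    smean = {k: sum(v) / len(v) for k, v in by_school.items()}
    # each component is averaged over observations, so they add up exactly
    within = sum((score - cmean[(sc, cl)]) ** 2 for sc, cl, score in obs) / n
    between_class = sum((cmean[(sc, cl)] - smean[sc]) ** 2 for sc, cl, _ in obs) / n
    between_school = sum((smean[sc] - grand) ** 2 for sc, _, _ in obs) / n
    total = sum((score - grand) ** 2 for *_, score in obs) / n
    return within, between_class, between_school, total
```

Dividing each component by the total reproduces shares of the kind reported in Table A.4 (e.g., 77 / 12 / 10 percent for math).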

5.2 Event study

I provide descriptive evidence for the impacts of both teacher quality and school quality on student test scores, using an event study based on teachers who switch schools. I investigate graphically whether the test scores in a teacher's classroom change as he/she switches between schools of different quality. I define school quality based on the average test score gains of all other teachers in the same school. If a teacher's test scores immediately shift toward his/her colleagues' average test scores, the permanent school effect appears important. By contrast, if a teacher's test scores remain stable regardless of his/her colleagues' test scores, a strong permanent teacher component seems present. In addition, if the losses from switching from a high- to a low-quality school are about as big as the gains from switching from a low- to a high-quality school, we can assume that school and teacher effects are additively separable. To conduct such an event study, a sufficient number of teachers has to move across schools of different quality in the sample period. Table A.5 therefore reports the frequencies of switches between schools of different quality categories. To define quality categories, I rank schools according to the average test scores of their students in a given year, excluding the teacher's own classroom (leave-own-out), and group the schools into four quartiles of school quality in each year. As Table A.5 shows, teachers move in all directions, both up and down. Differences in moving behavior between teachers from origin schools of different quality exist, but are rather moderate. Comparing the switching behavior based on math test scores, 40 percent of teachers from a low-quality (first-quartile) school switch to another low-quality school, and 11 percent of teachers from a high-quality (fourth-quartile) school switch to a low-quality school. Similarly, 15 percent of teachers from a low-quality school switch to a high-quality school, and 46 percent of teachers from a high-quality school switch to another high-quality school. This pattern suggests that teachers' career choices are not strongly path dependent on past school quality. There is, however, a tendency for teachers to move up the school quality ladder rather than down.

14 See https://nces.ed.gov/programs/digest/d12/tables/dt12_046.asp.
The patterns for math and reading resemble each other. Figure A.3 displays the results of the event study for teachers from high-quality (fourth-quartile) and low-quality (first-quartile) schools. The results are as follows: First, all four panels suggest that school effects are important; in all cases, teachers' test scores move visibly in the direction of their colleagues' test scores after a switch. Second, a permanent component of teacher effects exists, as teachers' test scores on average do not fully converge to their colleagues' test scores. Third, gains and losses seem symmetrically distributed, i.e., gains from moves from low- to high-quality schools are about as large as losses from moves from high- to low-quality schools. Fourth, the figure supports neither match effects nor “Ashenfelter's dip”-type behavior: switchers between schools of similar quality do not experience visible performance improvements. Based on these results, a specification with additively separable teacher and school effects, which abstracts from match effects, is a reasonable model choice.
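The leave-own-out quality measure can be computed as the school-year mean over all other teachers' classrooms. A minimal sketch, assuming one classroom per teacher-year and a hypothetical flat (school, year, teacher, score) layout:

```python
from collections import defaultdict

def leave_own_out_quality(records):
    """records: list of (school, year, teacher, classroom_score).
    Returns {(school, year, teacher): mean score of all *other*
    teachers' classrooms in the same school and year}. Teachers who
    are alone in a school-year have no leave-own-out measure."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for school, year, _teacher, score in records:
        sums[(school, year)] += score
        counts[(school, year)] += 1
    out = {}
    for school, year, teacher, score in records:
        n = counts[(school, year)]
        if n > 1:
            # subtract the teacher's own classroom from the school-year mean
            out[(school, year, teacher)] = (sums[(school, year)] - score) / (n - 1)
    return out
```

Ranking these leave-own-out means into year-specific quartiles then yields quality categories of the kind used in Table A.5.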


5.3 Main results

Based on Model 17, I decompose the variance in test scores across classrooms into all possible variance and covariance components. In total, I consider $d = 5$ categories: student quality (modeled as a background index based on lagged test scores and socio-demographic characteristics, i.e., $\bar{X}'_{ct}\beta$), teacher value-added (measured as the teacher fixed effects $\mu_j$ from Model 17), teacher experience, school quality (measured as the school fixed effects $\alpha_s$ from Model 17), and school composition, measured as the average test scores at the school level and the average socio-demographic composition of the school (fraction free/reduced-price lunch, fraction white students). Considering all five elements leads to $d = 5$ variances as well as $d(d-1)/2 = 10$ covariances. While the variances capture the main effects in the model, the covariances capture the sorting of teachers to students, teachers to schools, students to schools, and students to teachers.15

15 Exceptions are (a) the covariance between teacher observables and unobservables, and (b) the covariance between school observables and unobservables.

Tables A.7 and A.8 present the results of the variance decomposition, for both math (Table A.7) and reading (Table A.8), and for different specifications and samples. The baseline specification (columns (1)-(2)) includes the sample of all teachers in the high-turnover schools (more than 10 movers). Columns (3)-(4) present the results for the movers only. In all specifications, I divide the variances and covariances into three groups: the main effects (Panel I), the sorting effects (Panel II), and the remaining variances (Panel III), i.e., variances and covariances of grade and year controls as well as the variance of the error term. I further subdivide the sorting into teacher sorting to schools and teachers (Panel II.A) and student sorting to schools (Panel II.B). The results differ only slightly between the sample of movers and the full sample; in the following, I concentrate on the results for the movers, because these results seem to exhibit smaller biases. The largest contribution to the variation in test scores comes from student quality (here: classroom quality). A one-standard-deviation increase in classroom quality raises performance by 0.355 standard deviations in math and 0.351 standard deviations in reading (column (1)). Overall, student quality explains 53 percent of the variation in math test scores across classrooms and 60 percent of the variation in reading test scores (column (2)). This result points to the persistence of education outcomes. Teacher value-added is the second most important component of the variance in test scores across classrooms. The magnitude of teacher value-added from the estimation is strikingly in line with the results from previous studies. For teacher value-added in math, I find a variance of 0.036, i.e., a standard deviation of 0.190. This implies that an increase in teacher quality by one standard deviation would raise student test scores by 0.190 standard deviations. The result is almost identical to that of Rothstein (2015), who finds a standard deviation in teacher value-added of 0.193 in the same setting. The resemblance of


the estimate is noteworthy, as Rothstein (2015) uses a different method to compute teacher value-added.16 Teacher experience, by contrast, explains only about one percent of the variance in test scores. Moreover, teacher experience and teacher value-added are uncorrelated. Thus, gains in experience may improve a teacher's performance, but in a cross-section, the relationship between a teacher's experience and his quality of instruction is weak. With a standard deviation of 0.138, teacher value-added is substantially lower in reading, which is in line with the literature (Rothstein, 2015), and teacher experience does not have any explanatory power for reading test scores. The unobserved component of school quality (here: “school effects”) explains a relatively small share of the variance compared to teacher quality: only 6.4 percent in math and 5.3 percent in reading. This is the quality of a school once teacher and student composition have been taken into account. Nevertheless, the absolute magnitude is non-negligible: increasing school quality by one standard deviation is associated with an increase in test scores of 0.12 standard deviations in math and 0.10 standard deviations in reading. As the model already accounts for classroom composition, school effects and the composition of the whole student body are uncorrelated. Student sorting to schools is the third most important factor in explaining variation in student test scores (Panel II.B of Tables A.7 and A.8); it contributes 10 percent to the variance in math test scores across classrooms and 12 percent to the variance in reading test scores. This indicates that better students are exposed to better schools; part of this effect may come from residential sorting. Thus, while school effects are overall relatively small, students who are better prepared reap the benefits of better schools.
Teacher sorting (panel I.A of Tables A.7 and A.8) explains in total 6 percent of the variation in math test scores across classrooms, and 5 percent of the variation in reading test scores across classrooms. Importantly, teachers with higher value-added and more experienced teachers are observed in schools and classrooms with higher quality students. By contrast, teachers do not positively sort on the unobserved components of school quality, once one accounts for student composition. If anything, the sorting of teachers to unobserved school quality is negative. The study thus confirms results from earlier studies, which show that teachers with better observed attributes (i.e. better education and longer experience) sort to schools primarily based on the composition of the student body (Lankford et al., 2002). This is also true for the unobserved component of teacher quality. 16 Rothstein (2015) uses the same method as Chetty et al. (2014a). The method accounts for drift in teacher value-added, i.e. teacher value-added may change over time. Moreover, Rothstein (2015) and Chetty et al. (2014a) estimate the model at the student level and then average across students; by contrast, I first average across classrooms and then run the model to compute valueadded. Chetty et al. (2014a) find a standard deviation of teacher value-added of 0.162. Rothstein (2015), however, raises concerns about various potential sources of biases in value-added estimates that can arise using the method by Chetty et al. (2014a). In a reply to Rothstein (2015), Chetty et al. (2015) refute this criticism. I therefore abstract from this debate.
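The logic behind these variance shares can be illustrated with a small sketch on simulated data (all component names and magnitudes below are hypothetical, loosely inspired by the estimates above, and not the paper's data): for an additively separable outcome, the total variance splits exactly into the component variances plus twice the pairwise covariances, and each term's share of the total is the quantity reported in the decomposition tables.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical additive components of a classroom-level outcome
# (names follow the decomposition; magnitudes are illustrative).
teacher_va = rng.normal(0.0, 0.19, n)                      # teacher value-added
school_eff = 0.3 * teacher_va + rng.normal(0.0, 0.11, n)   # school effect, correlated via sorting
student_q = rng.normal(0.0, 0.45, n)                       # student quality
noise = rng.normal(0.0, 0.80, n)                           # idiosyncratic classroom shock

y = teacher_va + school_eff + student_q + noise
parts = {"teacher": teacher_va, "school": school_eff, "student": student_q, "noise": noise}

def cov(a, b):
    # bias=True matches np.var's default normalization (ddof=0),
    # so the decomposition identity holds exactly in the sample.
    return np.cov(a, b, bias=True)[0, 1]

keys = list(parts)
var_terms = sum(np.var(v) for v in parts.values())
cov_terms = 2 * sum(cov(parts[a], parts[b]) for i, a in enumerate(keys) for b in keys[i + 1:])

# Each component's share of Var(y), as reported in the decomposition tables.
shares = {k: np.var(v) / np.var(y) for k, v in parts.items()}
print(np.isclose(np.var(y), var_terms + cov_terms))  # the identity is exact
```

The sorting terms in the decomposition are exactly the `cov_terms` pieces: with positive sorting (here, the induced correlation between teacher and school components), they add to total inequality.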


Teacher sorting to better-prepared classrooms can come from two sources: first, teachers sort to schools with higher-quality students and/or students select schools with higher-quality teachers; second, within the same school, better teachers teach those classrooms that are already better prepared. In order to disentangle these two sources of teacher sorting, Table A.12 further decomposes the covariance between teacher value-added and student quality into two components, sorting within schools and sorting across schools. To this end, I compute the average teacher quality for each school, as well as each teacher's deviation from the average teacher quality in the school in which he teaches in a given year. I then compute two covariances: (1) the covariance between average teacher quality and classroom quality, an indicator of student and teacher sorting across schools, and (2) the covariance between a teacher's deviation from the school average and classroom quality, an indicator of teacher sorting within schools. Sorting across schools is overall more important than sorting within schools (5.4 percent versus 3.0 percent in math, and 4.4 percent versus 3.4 percent in reading), but the amount of sorting within schools is still non-negligible. Thus, in order to reduce the inequality in test scores across schools, focusing on disparities in teacher assignments within schools can provide an effective way to address part of the performance gap.
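The within/across-school split described above can be sketched as follows (hypothetical data; `school`, `va`, and `q` stand in for the paper's school identifiers, teacher value-added, and classroom quality). By bilinearity of the covariance, splitting teacher quality into a school average and a within-school deviation decomposes the total covariance exactly.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical teacher-year panel: school id, teacher value-added, classroom quality.
school = rng.integers(0, 200, n)
teacher_va = rng.normal(0, 0.1, 200)[school] + rng.normal(0, 0.15, n)
class_q = 0.5 * teacher_va + rng.normal(0, 0.4, n)
df = pd.DataFrame({"school": school, "va": teacher_va, "q": class_q})

# Split teacher quality into the school average and the within-school deviation.
df["va_school"] = df.groupby("school")["va"].transform("mean")
df["va_within"] = df["va"] - df["va_school"]

def cov(a, b):
    return np.cov(a, b, bias=True)[0, 1]

total = cov(df["va"], df["q"])
across = cov(df["va_school"], df["q"])   # teacher sorting across schools
within = cov(df["va_within"], df["q"])   # teacher sorting within schools

# Cov(va, q) = Cov(va_school, q) + Cov(va_within, q) holds exactly.
print(np.isclose(total, across + within))
```

Dividing each of the two covariance terms by the total outcome variance yields shares comparable to the 5.4/3.0 percent (math) and 4.4/3.4 percent (reading) figures reported above.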

5.4 Spatial heterogeneity

The sorting of teachers to schools and classrooms, as well as student sorting to schools, may not be equally important across different geographic areas (large cities, mid-size cities, urban fringe, towns, rural areas). To illustrate the differences in test scores and inputs across regions, Table A.9 provides a first descriptive overview of the means of the outcomes as well as of school quality, student quality, and teacher quality across regions. Overall performance is lowest in large cities (0.12 standard deviations below the sample average in math and 0.15 standard deviations below the sample average in reading), and highest in suburban areas (0.09 standard deviations above the sample average in reading and 0.07 standard deviations above the sample average in math). The performance difference amounts to almost 20 percent of a standard deviation on average, which is a strong indicator of inequality across regions. Furthermore, students in mid-size cities and towns perform below the sample average, and students in rural areas perform slightly above the sample average. These differences are mostly reflected in student quality and the composition of students, which is again unsurprising because of the persistence in student test scores. In particular, teacher value-added is not lower in large cities than in all other areas, but even slightly higher. Moreover, school quality does not differ appreciably between large cities and suburbs. The experience patterns are in line with teachers moving away from large cities: experience is lowest in large cities.

Heterogeneity in sorting patterns across regions exists. As Table A.4 shows, differences in classroom performance between schools are relatively more important in dense areas such as large and mid-size cities, and relatively less important in towns and rural areas. By contrast, between-classroom sorting is comparable in magnitude across geographical areas with different densities. At least two potential mechanisms may explain why the variation in test scores across schools is larger in urban than in rural areas: first, residential segregation may be more pronounced in cities, and thus schools may be more segregated overall in terms of student quality; second, teacher sorting may be stronger in urban areas because alternative schools are close by. Tables A.10 and A.11 provide further insights into this question and show that student sorting, rather than teacher sorting, accounts for the large variance in test scores across schools in urban areas (panel II.B). Student sorting to schools explains 19 percent of the test score differences in math and even 21 percent of the test score differences in reading across classrooms in large cities, but only 2 percent of the test score differences in math and 0.3 percent of the test score differences in reading across classrooms in rural areas. Teacher sorting to students, both within and across schools, is relatively more important in rural areas than in large cities. This is despite the fact that teacher turnover is higher in large cities than in all other areas (see Table A.4). Both teacher sorting across schools and teacher sorting within schools are less important in large cities than in rural areas. One explanation for this pattern is that schools in rural areas seem to compensate for the smaller degree of school choice by allowing for more imbalance in classroom and teacher assignments within schools.
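The three-level variance decomposition underlying Table A.4 (between schools, between classrooms within schools, within classrooms) can be sketched as follows. The data below are hypothetical and balanced; group counts and effect magnitudes are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical student scores nested in classrooms within schools (balanced design).
n_schools, classes_per_school, students_per_class = 50, 10, 20
school = np.repeat(np.arange(n_schools), classes_per_school * students_per_class)
classroom = np.repeat(np.arange(n_schools * classes_per_school), students_per_class)
score = (rng.normal(0, 0.30, n_schools)[school]                       # school component
         + rng.normal(0, 0.35, n_schools * classes_per_school)[classroom]  # classroom component
         + rng.normal(0, 0.85, len(school)))                          # student-level noise

df = pd.DataFrame({"school": school, "classroom": classroom, "score": score})
school_mean = df.groupby("school")["score"].transform("mean")
class_mean = df.groupby("classroom")["score"].transform("mean")

between_school = np.var(school_mean)                    # variance of school means
between_class_within = np.var(class_mean - school_mean) # classroom means around school means
within_class = np.var(df["score"] - class_mean)         # students around classroom means

total = np.var(df["score"])
# The three pieces sum exactly to the total variance, since group-mean
# deviations are orthogonal to the group means in the sample.
print(np.isclose(total, between_school + between_class_within + within_class))
```

Dividing each component by `total` gives the percentage contributions reported in Table A.4 (e.g., 11% / 12% / 77% for math in the full sample).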
Overall, the following patterns emerge when comparing all five areas: Large cities not only have the lowest average test scores, but also the highest variance in test scores. While a reassignment of teachers may reduce this inequality, student sorting across schools seems to be the most important problem in large cities. By contrast, in rural areas, one may consider teacher assignment policies the most effective way to improve student test scores. The least problematic areas are suburban areas, with relatively high average test scores and a relatively small variance. Mid-size cities fall in the middle between these categories: test scores are slightly below the state mean, and teacher sorting to schools and classrooms and student sorting to schools seem equally important.

5.5 Simulations of counter-factual assignments

An alternative way to quantify the impact of teacher sorting is to contrast the outcome distribution under the current sorting, as observed in the data, with the outcome distributions under alternative, i.e. counter-factual, teacher assignment policies. This section considers three different policies: (1) an equitable distribution of teachers within schools, (2) an equitable distribution of teachers within schools and school districts, and (3) an equitable distribution of teachers within schools and within the whole state of North Carolina. In practice, these policies may be difficult to implement, but they provide useful benchmarks when creating feasible policy designs. To derive the outcome distribution under these counter-factual assignment schemes, I implement the following simulation design. First, after estimating Model 17, I compute for every classroom the outcome net of teacher effects, i.e., I subtract the teacher effect from the observed outcome. Second, I create 100 random assignments of teachers to classrooms for each of the three counter-factual assignment schemes (i.e., random within schools, random within school districts, and random within the state). The teachers are drawn without replacement, and the random assignments are constructed separately for each year. Third, for each scenario, I compute 100 new test score outcomes for each classroom based on the observed outcome, net of the original teacher effect, and the 100 simulated teacher effects. I then average across all 100 draws. Figure A.4 visualizes how the outcome distributions under the three counter-factual scenarios would differ from the outcome distribution under the current assignment scheme, for both math (top panel) and reading (bottom panel). In each panel, the grey histograms depict the outcome distribution under the counter-factual assignment scheme; the transparent histograms provide the distribution under the current sorting scheme for comparison. Under all three scenarios, the counter-factual distribution is more compressed, and thus more equal, than the original distribution.
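The three-step procedure above can be sketched as follows. The data are hypothetical (classroom-level outcomes and district labels drawn at random, not the North Carolina data), and the sketch abstracts from the year-by-year structure of the actual simulation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_class = 2_000

district = rng.integers(0, 20, n_class)          # each classroom belongs to a district
teacher_effect = rng.normal(0, 0.19, n_class)    # estimated teacher effects (step 1 inputs)
baseline = rng.normal(0, 0.9, n_class)           # outcome net of the teacher effect (step 1)
observed = baseline + teacher_effect

def simulate(groups, n_draws=100):
    """Steps 2-3: average counter-factual outcomes over random within-group reassignments."""
    out = np.zeros(n_class)
    for _ in range(n_draws):
        permuted = teacher_effect.copy()
        for g in np.unique(groups):
            idx = np.where(groups == g)[0]
            # Reassign teachers randomly within the group, drawing without replacement.
            permuted[idx] = rng.permutation(teacher_effect[idx])
        out += baseline + permuted
    return out / n_draws

within_district = simulate(district)
within_state = simulate(np.zeros(n_class, dtype=int))  # one group: the whole state

# Random reassignment compresses the outcome distribution relative to the observed one.
print(np.std(within_state) <= np.std(observed))
```

Because each permutation only reshuffles teacher effects, the mean outcome is unchanged by construction, mirroring the point made below that all counter-factual policies leave the average outcome unaffected.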
Moving from random assignment within schools to random assignment within districts and then within the state visibly leads to gradual improvements in equity across classrooms. Furthermore, by construction, all counter-factual policies leave the average outcome unchanged. This is because teacher and classroom inputs enter the education production function (Model 17) as additively separable inputs; therefore, all gains for low-quality classrooms are losses for high-quality classrooms. Tables A.13 and A.14 quantify the effects. The tables report how the assignment schemes affect (1) the performance gap between a classroom at the 75th and one at the 25th percentile of the outcome distribution ("interquartile range", columns (3)-(5)) and (2) the performance gap between a classroom at the 90th and one at the 10th percentile of the outcome distribution (columns (6)-(9)). These inequality measures are standard ways to report inequity in test scores in educational settings (see, for example, Lyle, 2009). With respect to the interquartile range, I find the following results (see column (5)): Random assignment within schools reduces the performance gap by 0.05 standard deviations in math and 0.03 standard deviations in reading. In relative terms, this is an improvement of 8 percent in math and 5 percent in reading, relative to the original performance gap. Allowing for random allocations within school districts leads to a reduction of 13 percent in math and 9 percent in reading (0.09 standard deviations in math, 0.05 standard deviations in reading), and allowing for random allocations within the state induces a reduction of 16 percent in math and 12 percent in reading (0.10 standard deviations in math, 0.07 standard deviations in reading). With respect to the gap between the classrooms at the 90th and the 10th percentile of the outcome distribution, I find results of similar relative magnitudes. In summary, the random allocations can substantially improve test score equity across classrooms; changing the assignment within schools alone already achieves more than half of the improvements that are possible under the three scenarios considered.

At least three limitations of such a simulation, however, should be taken into account when interpreting the results. First, the sorting effects may be smaller in practice if the estimates of teacher effects are measured with error, and if the estimate of the variance in teacher effects is upward biased. Based on the Monte Carlo results from Section 3.3, I find that the relative bias in the variance estimate amounts to about 18 percent, a non-negligible amount. Therefore, the results of the policy simulation provide an upper bound for the actual effects. Second, the simulation assumes that alternative assignment schemes do not affect the equilibrium behavior of the agents in the model (teachers, school principals, students, and parents). Teachers, for example, may switch schools or even leave the profession in reaction to new policies. Moreover, principals may adjust other resources in response. This "Lucas critique" proves important in education settings, as shown, for example, by Carrell et al. (2013), who consider reassignments of students across peer groups. Model 17 does not allow for changes in equilibrium behavior.
Therefore, in order to account for equilibrium effects, one has to either model the agents' behavioral responses, exploit natural experiments that are informative about changes in behavior, or carry out field experiments. This exceeds the scope of this paper, but opens up the field for future analyses. Third, the model does not consider match effects between teachers and classrooms, which may occur if certain teachers are more or less effective depending on the characteristics of their assigned school and classroom. In fact, gains from alternative assignments may be higher or lower if match effects are present. Moreover, a model with match effects would allow for efficiency gains, not just gains in equity. Model 17, however, abstracts from match effects to keep the model tractable. Recent econometric work studies match effects in the AKM framework (Bonhomme et al., 2016), as well as reassignment policies in the presence of match effects (Graham et al., 2007, 2010, Graham, 2011, Graham et al., 2014, 2016). Adapting such frameworks to the current problem again provides an avenue for future research.
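The two inequality measures used above, the interquartile range and the 90-10 gap, can be computed as follows (hypothetical scores; for a standard normal outcome the interquartile range is about 1.35 and the 90-10 gap about 2.56 standard deviations):

```python
import numpy as np

rng = np.random.default_rng(4)
scores = rng.normal(0, 1, 100_000)  # hypothetical classroom-level test scores

q25, q75 = np.percentile(scores, [25, 75])
p10, p90 = np.percentile(scores, [10, 90])

iqr = q75 - q25          # gap between 75th- and 25th-percentile classrooms
gap_90_10 = p90 - p10    # gap between 90th- and 10th-percentile classrooms

# A relative improvement under a counter-factual policy would then be computed as
# (iqr_original - iqr_counterfactual) / iqr_original.
print(iqr, gap_90_10)
```

Applied to the original and the simulated counter-factual distributions, the relative change in these two gaps yields the percentage reductions reported in Tables A.13 and A.14.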


6 Conclusion

This paper studies teacher sorting across and within schools as a source of inequality in student test scores. I find evidence that the sorting of teachers to schools as well as to classrooms contributes substantially to inequality in test score outcomes. These findings suggest that, in order to reduce inequality in student test scores, one may redistribute teachers across classrooms, both within and across schools. It is important to note, however, that redistributions based on the model studied in this paper will not induce changes in aggregate test scores: as the model does not allow for match effects between teachers and classrooms/schools, all resource-neutral allocations preserve the aggregate outcome.


References

Abowd, J. M., F. Kramarz, P. Lengermann, and S. Perez-Duarte (2004): "Are Good Workers Employed by Good Firms? A test of a simple assortative matching model for France and the United States," Manuscript.

Abowd, J. M., F. Kramarz, and D. N. Margolis (1999): "High wage workers and high wage firms," Econometrica, 67, 251-333.

Andrews, M. J., L. Gill, T. Schank, and R. Upward (2008): "High wage workers and low wage firms: negative assortative matching or limited mobility bias?" Journal of the Royal Statistical Society: Series A (Statistics in Society), 171, 673-697.

Bacher-Hicks, A., T. J. Kane, and D. Staiger (2014): "Validating Teacher Effect Estimates Using Changes in Teacher Assignments in Los Angeles," NBER Working Paper.

Bifulco, R. and H. F. Ladd (2007): "School choice, racial segregation, and test-score gaps: Evidence from North Carolina's charter school program," Journal of Policy Analysis and Management, 26, 31-56.

Bifulco, R., H. F. Ladd, and S. L. Ross (2009): "Public school choice and integration evidence from Durham, North Carolina," Social Science Research, 38, 71-85.

Bonhomme, S., T. Lamadon, and E. Manresa (2016): "A distributional framework for matched employer employee data," Unpublished manuscript.

Boyd, D., H. Lankford, S. Loeb, and J. Wyckoff (2013): "Analyzing the determinants of the matching of public school teachers to jobs: Disentangling the preferences of teachers and employers," Journal of Labor Economics, 31, 83-117.

Card, D., J. Heining, and P. Kline (2013): "Workplace Heterogeneity and the Rise of West German Wage Inequality," The Quarterly Journal of Economics, 128, 967-1015.

Carrell, S. E., B. I. Sacerdote, and J. E. West (2013): "From Natural Variation to Optimal Policy? The Importance of Endogenous Peer Group Formation," Econometrica, 81, 855-882.

Chetty, R., J. Friedman, and J. Rockoff (2015): "Measuring the Impacts of Teachers: Response to Rothstein (2014)," Tech. rep., CEPR Discussion Papers.

Chetty, R., J. N. Friedman, and J. E. Rockoff (2014a): "Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates," American Economic Review, 104, 2593-2632.

——— (2014b): "Measuring the Impacts of Teachers II: Teacher Value-Added and Student Outcomes in Adulthood," American Economic Review, 104, 2633-2679.

Clotfelter, C. T., H. F. Ladd, and J. Vigdor (2005): "Who teaches whom? Race and the distribution of novice teachers," Economics of Education Review, 24, 377-392.

Clotfelter, C. T., H. F. Ladd, and J. L. Vigdor (2007): "Teacher credentials and student achievement: Longitudinal analysis with student fixed effects," Economics of Education Review, 26, 673-682.

——— (2011): "Teacher mobility, school segregation, and pay-based policies to level the playing field," Education, 6, 399-438.

Cunha, F., J. J. Heckman, L. Lochner, and D. V. Masterov (2006): "Interpreting the evidence on life cycle skill formation," Handbook of the Economics of Education, 1, 697-812.

Dieterle, S., C. M. Guarino, M. D. Reckase, and J. M. Wooldridge (2015): "How do principals assign students to teachers? Finding evidence in administrative data and the implications for value added," Journal of Policy Analysis and Management, 34, 32-58.

Graham, B. S. (2011): "Econometric Methods for the Analysis of Assignment Problems in the Presence of Complementarity and Social Spillovers," in Handbook of Social Economics, ed. by J. Benhabib, A. Bisin, and M. O. Jackson, North-Holland, vol. 1, chap. 19, 965-1052.

Graham, B. S., G. W. Imbens, and G. Ridder (2007): "Redistributive effects of discretely valued inputs," Unpublished manuscript.

——— (2010): "Measuring the Effects of Segregation in the Presence of Social Spillovers: A Nonparametric Approach," Working Paper 16499, National Bureau of Economic Research.

——— (2014): "Complementarity and aggregate implications of assortative matching: A nonparametric analysis," Quantitative Economics, 5, 29-66.

——— (2016): "Identification and efficiency bounds for average match output under conditionally exogenous matching," Tech. rep., USC-INET Research Paper No. 16-12.

Jackson, C. K. (2009): "Student demographics, teacher sorting, and teacher quality: Evidence from the end of school desegregation," Journal of Labor Economics, 27, 213-256.

——— (2013): "Match Quality, Worker Productivity, and Worker Mobility: Direct Evidence from Teachers," Review of Economics and Statistics, 95, 1096-1116.

Kane, Thomas J., D. F. McCaffrey, T. Miller, and D. O. Staiger (2013): "Have we identified effective teachers? Validating measures of effective teaching using random assignment," Tech. rep., MET Project Research Paper.

Kramarz, F., S. Machin, and A. Ouazad (2015): "Using compulsory mobility to identify school quality and peer effects," Oxford Bulletin of Economics and Statistics, 77, 566-587.

Lankford, H., S. Loeb, and J. Wyckoff (2002): "Teacher sorting and the plight of urban schools: A descriptive analysis," Educational Evaluation and Policy Analysis, 24, 37-62.

Lopes de Melo, R. (2016): "Firm Wage Differentials and Labor Market Sorting: Reconciling Theory and Evidence," Journal of Political Economy, forthcoming.

Lyle, D. S. (2009): "The Effects of Peer Group Heterogeneity on the Production of Human Capital at West Point," American Economic Journal: Applied Economics, 1, 69-84.

Mansfield, R. K. (2015): "Teacher Quality and Student Inequality," Journal of Labor Economics, 33, 751-788.

Mortensen, D. T. (1986): "Job search and labor market analysis," Handbook of Labor Economics, 2, 849-919.

Mortensen, D. T. and C. A. Pissarides (1999): "New developments in models of search in the labor market," Handbook of Labor Economics, 3, 2567-2627.

Reardon, S. F., D. Kalogrides, and K. Shores (2016): "The Geography of Racial/Ethnic Test Score Gaps," Working Paper 16-10, CEPA.

Rivkin, S. G., E. A. Hanushek, and J. F. Kain (2005): "Teachers, schools, and academic achievement," Econometrica, 417-458.

Rockoff, J. E. (2004): "The impact of individual teachers on student achievement: Evidence from panel data," American Economic Review, 247-252.

Rothstein, J. (2015): "Revisiting the Impact of Teachers," Manuscript.

Sass, T. R., J. Hannaway, Z. Xu, D. N. Figlio, and L. Feng (2012): "Value added of teachers in high-poverty schools and lower poverty schools," Journal of Urban Economics, 72, 104-122.

Shimer, R. and L. Smith (2000): "Assortative matching and search," Econometrica, 68, 343-369.


Figure A.1: Panel balance

[Histogram: distribution of the number of years per teacher. Number of teachers = 34,349; average number of years in the sample = 3.69; teacher-year observations = 126,696.]

[Histogram: distribution of the number of years per school. Number of schools = 1,440; average number of years in the sample = 11.32.]

The figure displays the number of years that a teacher is present in the sample (top panel) as well as the number of years that a school is present in the sample (bottom panel).


Figure A.2: Moving frequencies

[Histogram: distribution of the number of moves per teacher, conditional on moving. Number of teachers = 34,349; fraction of movers = .21; average number of moves per teacher = .23; average conditional on moving = 1.19.]

[Histogram: distribution of the number of moves per school. Number of schools = 1,433; average number of moves per school = 10.37; fraction with at least 1 move = .96, at least 5 moves = .81, at least 10 moves = .48.]

The figure shows the number of moves per teacher, conditional on moving at least once (top panel) and the number of moves per school (bottom panel, inmoves and outmoves combined) for the estimation sample.


Figure A.3: Event study

[Top panel: math test scores before and after a move, conditional on school quality. Four subpanels show switchers to 1st-, 2nd-, 3rd-, and 4th-quartile schools, with separate series for teachers coming from 1st-quartile and from 4th-quartile schools.]

[Bottom panel: reading test scores before and after a move, conditional on school quality; same layout.]

The figure shows changes in test scores in math (top panel) and reading (bottom panel) in a teacher's classroom, for teachers who switch schools. The figure shows the test score gains at two points in time: in the origin school in the year before the move ("before"), and in the destination school in the year of/after the move ("after"). The sample is restricted to teachers from the top 25% and the bottom 25% of the school quality distribution (n=2,575). School quality is defined by ranking schools according to students' average test scores in a given year, and dividing schools into four different quartiles of test scores. The school quality variable is based on performance in all classrooms except the teacher's own classroom (leave-own-out).


Figure A.4: Simulation of counterfactual assignments

[Top panel: simulated distribution of math test scores under (1) random allocation within schools, (2) random allocation within districts, and (3) random allocation within the state, each compared against the original distribution.]

[Bottom panel: simulated distribution of reading test scores under the same three scenarios.]

This figure shows the distribution of test scores in math (top panel) and reading (bottom panel) across classrooms under different policy scenarios. I simulate the impact of different counter-factual teacher assignments and consider three policies: random allocation of teachers within schools (panel I), random allocation of teachers within school districts (panel II), and random allocation of teachers within the whole state of North Carolina (panel III). The grey histograms represent the simulated test score distributions, and the transparent histograms represent the original distribution for comparison. The simulations are based on 100 random teacher draws (without replacement), and the results are averaged across all random draws. For details on the procedure, see also Section 5.5.


Table A.1: Variance components

(I) Variances (main effects)
  Var(student quality)                            Var(X'_it)
  Var(teacher value-added)                        Var(mu_J(i,t))
  Var(teacher experience)                         Var(V'_J(i,t)t)
  2*Cov(experience, teacher value-added)          2Cov(mu_J(i,t), V'_J(i,t)t)
  Var(school effects)                             Var(alpha_S(i,t))
  Var(school composition)                         Var(W'_S(i,t)t)
  2*Cov(school effects, school composition)       2Cov(alpha_S(i,t), W'_S(i,t)t)

(II) Covariances (sorting)
(II.A) Teacher sorting to schools and students
  2*Cov(teacher value-added, student quality)     2Cov(mu_J(i,t), X'_it)
  2*Cov(teacher experience, student quality)      2Cov(V'_J(i,t)t, X'_it)
  2*Cov(teacher value-added, school effects)      2Cov(mu_J(i,t), alpha_S(i,t))
  2*Cov(teacher experience, school effects)       2Cov(V'_J(i,t)t, alpha_S(i,t))
  2*Cov(teacher value-added, school composition)  2Cov(mu_J(i,t), W'_S(i,t)t)
  2*Cov(teacher experience, school composition)   2Cov(V'_J(i,t)t, W'_S(i,t)t)

(II.B) Student sorting to schools
  2*Cov(school effects, student quality)          2Cov(alpha_S(i,t), X'_it)
  2*Cov(school composition, student quality)      2Cov(W'_S(i,t)t, X'_it)

This table provides an overview of the variance components of interest, as derived from a variance decomposition of Model 1. For details, see Section 3.


Table A.2: Monte Carlo experiment

                                       (1)           (2)           (3)           (4)
                                       Simulation    Monte Carlo   Median of     Mean relative
                                       median        median        difference    bias
(I) All schools
  Var(teacher effect)                  0.040         0.056         0.015         39%
  Var(school effect)                   0.010         0.014         0.004         39%
  2Cov(teacher, school)                0.000         -0.007        -0.007        -
(II) At least 5 moves
  Var(teacher effect)                  0.040         0.054         0.014         36%
  Var(school effect)                   0.010         0.012         0.003         26%
  2Cov(teacher, school)                0.000         -0.004        -0.004        -
(III) At least 10 moves
  Var(teacher effect)                  0.040         0.053         0.013         33%
  Var(school effect)                   0.008         0.010         0.002         17%
  2Cov(teacher, school)                0.000         -0.002        -0.002        -
(IV) At least 10 moves, only movers
  Var(teacher effect)                  0.040         0.047         0.007         18%
  Var(school effect)                   0.009         0.011         0.002         18%
  2Cov(teacher, school)                0.000         -0.002        -0.002        -

The table shows results of a Monte Carlo experiment. Panel I computes school effects for all schools; panel II computes school effects for all schools with more than 5 moves, and a joint school effect for the remaining schools with fewer than 5 moves; panel III computes school effects for all schools with more than 10 moves, and a joint school effect for the remaining schools with fewer than 10 moves; panel IV presents the results for movers only. The data are simulated as follows: The teacher and school fixed effects are assumed to be normally and independently distributed with mean 0 and standard deviations sigma_j = 0.2 for the teachers and sigma_s = 0.1 for the schools. The error is defined as a composite error, which consists of shocks to teacher value-added, shocks to school value-added, and idiosyncratic shocks to classroom performance, i.e. epsilon_ct = theta_st + nu_jt + e_cjst, where theta_st ~ N(0, 0.01), nu_jt ~ N(0, 0.02), and e_cjst ~ N(0, 0.16). The simulation is based on 100 independent draws from these distributions. Column (1) shows the median of the simulated distribution in the data, column (2) shows the median of the distribution of effects as recovered based on Model 1 in the Monte Carlo experiment, column (3) shows the median of the distribution of differences between the estimate from the Monte Carlo experiment and the simulated effect, and column (4) shows the mean relative bias, where the mean is computed across all 100 draws.


Table A.3: Student background and outcomes

                               Mean     SD       Min      Max      Obs
Female                         0.5      0.5      0        1        2,619,339
Age                            10.37    1        7        17       2,619,548
Ethnicity
  white                        0.6      0.49     0        1        2,619,548
  black                        0.28     0.45     0        1        2,619,548
  hispanic                     0.07     0.25     0        1        2,619,548
  other                        0.06     0.23     0        1        2,619,548
Parental education
  no high school               0.11     0.31     0        1        1,944,867
  high school                  0.47     0.5      0        1        1,944,867
  up to community college      0.05     0.22     0        1        1,944,867
  trade or business school     0.11     0.31     0        1        1,944,867
  4-year college               0.21     0.41     0        1        1,944,867
  graduate school              0.05     0.21     0        1        1,944,867
Free/reduced lunch eligible    0.46     0.5      0        1        2,352,214
Academically gifted
  gifted                       0.13     0.33     0        1        2,613,106
Academically disadvantaged
  combined                     0.06     0.23     0        1        2,615,731
  reading                      0.04     0.2      0        1        2,619,548
  writing                      0.04     0.19     0        1        2,619,548
  math                         0.02     0.14     0        1        2,619,548
Limited english proficiency
  yes                          0.04     0.2      0        1        2,610,130
Missing baseline scores
  math                         0.21     0.4      0        1        2,619,548
  reading                      0.19     0.4      0        1        2,619,548
Outcomes
  reading score                0.02     0.99     -4.21    3.14     2,619,548
  math score                   0.03     0.99     -4.33    3.66     2,619,548

The table presents descriptive statistics for students in the estimation sample. Information on free/reduced-price lunch is missing in 1997-1998. Information on parental education is missing in 2007-2011.


Table A.4: Decompositions of unconditional variances across schools, across classrooms, and within classrooms

| | (1) All | (2) City, large | (3) City, mid-size | (4) Urban fringe | (5) Town | (6) Rural |
|---|---|---|---|---|---|---|
| (I) Math | | | | | | |
| Avg. test score | 0.03 | -0.04 | 0.02 | 0.11 | -0.03 | 0.03 |
| Total test score variance | 0.98 | 1.12 | 1.06 | 0.96 | 0.93 | 0.90 |
| Between-school variance | 0.10 | 0.16 | 0.13 | 0.10 | 0.09 | 0.07 |
| Between-classroom-within-school variance | 0.12 | 0.17 | 0.13 | 0.11 | 0.12 | 0.11 |
| Within-classroom variance | 0.75 | 0.80 | 0.80 | 0.76 | 0.71 | 0.72 |
| Contribution to total variance: | | | | | | |
| Between-school variance | 11% | 14% | 12% | 10% | 9% | 8% |
| Between-classroom-within-school variance | 12% | 15% | 12% | 12% | 13% | 12% |
| Within-classroom variance | 77% | 71% | 76% | 79% | 77% | 80% |
| (II) Reading | | | | | | |
| Avg. test score | 0.02 | -0.07 | 0.01 | 0.09 | -0.03 | 0.02 |
| Total test score variance | 0.98 | 1.09 | 1.05 | 0.96 | 0.94 | 0.92 |
| Between-school variance | 0.10 | 0.15 | 0.13 | 0.08 | 0.08 | 0.06 |
| Between-classroom-within-school variance | 0.09 | 0.13 | 0.10 | 0.09 | 0.10 | 0.08 |
| Within-classroom variance | 0.79 | 0.81 | 0.82 | 0.79 | 0.77 | 0.77 |
| Contribution to total variance: | | | | | | |
| Between-school variance | 10% | 14% | 12% | 9% | 8% | 7% |
| Between-classroom-within-school variance | 10% | 12% | 10% | 9% | 10% | 9% |
| Within-classroom variance | 81% | 74% | 78% | 82% | 81% | 84% |
| Student-year observations | 2,584,712 | 203,966 | 653,633 | 663,470 | 386,934 | 676,709 |
| # classrooms | 122,146 | 9,682 | 30,906 | 30,297 | 19,058 | 32,203 |
| # schools | 1,361 | 83 | 307 | 281 | 249 | 441 |
| # classrooms per school | 90 | 117 | 101 | 108 | 77 | 73 |
| Avg. class size | 21 | 21 | 21 | 22 | 20 | 21 |

The table shows variance decompositions of student test scores, based on the student sample, into three components: the between-school variance, the between-classroom-within-school variance, and the within-classroom variance. Column (1) shows the variances for the full sample, and columns (2)-(6) show the results for subsamples based on the geographic location of the school. Definitions of geographic areas are based on data from the NCERDC. As these definitions can change over time, I use the oldest definition available in the data, so that no school switches across categories over time. I exclude schools for which definitions of geographic areas are not available. The sample is restricted to schools with at least 10 moves during the sample period.
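The three-way decomposition in Table A.4 can be sketched in a few lines of pandas; the frame and column names (`score`, `school`, `classroom`) are illustrative placeholders, not the NCERDC variable names.

```python
import pandas as pd

def variance_decomposition(df, score="score", school="school", classroom="classroom"):
    """Split the (population) variance of test scores into between-school,
    between-classroom-within-school, and within-classroom components.
    With student-weighted group means, the three parts sum exactly to the total."""
    school_mean = df.groupby(school)[score].transform("mean")
    class_mean = df.groupby([school, classroom])[score].transform("mean")
    between_school = school_mean.var(ddof=0)
    between_class = (class_mean - school_mean).pow(2).mean()
    within_class = (df[score] - class_mean).pow(2).mean()
    return between_school, between_class, within_class
```

Dividing each component by `df[score].var(ddof=0)` yields the percentage contributions reported in the lower panel of each block.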


Table A.5: Transition matrix: movers

(I) Math

| Origin school quartile | Dest. quartile 1 | Dest. quartile 2 | Dest. quartile 3 | Dest. quartile 4 | Total | Obs. |
|---|---|---|---|---|---|---|
| 1 | 40% | 26% | 18% | 15% | 100% | 1,522 |
| 2 | 24% | 30% | 23% | 23% | 100% | 1,160 |
| 3 | 15% | 20% | 34% | 31% | 100% | 1,091 |
| 4 | 11% | 16% | 27% | 46% | 100% | 1,018 |
| Total | 24% | 24% | 25% | 27% | 100% | 4,791 |

(II) Reading

| Origin school quartile | Dest. quartile 1 | Dest. quartile 2 | Dest. quartile 3 | Dest. quartile 4 | Total | Obs. |
|---|---|---|---|---|---|---|
| 1 | 42% | 24% | 18% | 16% | 100% | 1,557 |
| 2 | 23% | 30% | 23% | 24% | 100% | 1,182 |
| 3 | 13% | 27% | 31% | 30% | 100% | 1,000 |
| 4 | 12% | 12% | 27% | 49% | 100% | 1,052 |
| Total | 24% | 24% | 24% | 28% | 100% | 4,791 |

The table shows transition frequencies and proportions for teachers who switch schools at least once during the sample period. Movers who are not observed in the dataset in the year before or after the move are excluded. School quality is defined by ranking schools according to students' average test scores in math/reading in a given year and dividing schools into four quartiles. The school quality variable is based on performance in all classrooms except the teacher's own classroom (leave-own-out).
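A transition matrix of this form can be built with pandas; the input frame of movers and its column names (`origin_score`, `dest_score`) are hypothetical stand-ins for the (leave-own-out) school quality measures.

```python
import pandas as pd

def transition_matrix(moves, origin="origin_score", dest="dest_score"):
    """Rank origin and destination schools into quartiles and cross-tabulate,
    normalizing each origin row to 100% (as in the panels above)."""
    origin_q = pd.qcut(moves[origin], 4, labels=[1, 2, 3, 4])
    dest_q = pd.qcut(moves[dest], 4, labels=[1, 2, 3, 4])
    return pd.crosstab(origin_q, dest_q, normalize="index") * 100
```

With `normalize="index"`, each row sums to 100%, matching the "Total %" column of Table A.5.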


Table A.6: Results: Coefficients on observed characteristics (dependent variables: math and reading test scores)

| | (1) Math, coeff (se) | (2) Reading, coeff (se) |
|---|---|---|
| Reading baseline score | 0.237 (0.004) | 0.362 (0.003) |
| Math baseline score | 0.383 (0.004) | 0.268 (0.003) |
| Reading baseline squared | 0.016 (0.002) | 0.018 (0.002) |
| Math baseline squared | 0.012 (0.002) | 0.006 (0.002) |
| Reading baseline missing | 0.010 (0.004) | 0.042 (0.003) |
| Math baseline missing | 0.006 (0.003) | 0.004 (0.003) |
| Female | -0.002 (0.008) | 0.101 (0.007) |
| Free/reduced-price lunch (FRL) | -0.192 (0.007) | -0.198 (0.006) |
| Missing: FRL | -0.029 (0.008) | -0.025 (0.007) |
| Ethnicity: white | 0.228 (0.009) | 0.235 (0.008) |
| Parental education: high school or less | -0.052 (0.005) | -0.023 (0.005) |
| Missing: parental education | 0.011 (0.006) | -0.006 (0.005) |
| Age in years | -0.133 (0.006) | -0.112 (0.005) |
| Limited English proficiency | 0.002 (0.014) | -0.295 (0.013) |
| Gifted student | 0.503 (0.008) | 0.387 (0.007) |
| Learning disadvantaged | -0.191 (0.013) | -0.482 (0.012) |
| Teacher experience: 0 years | -0.086 (0.006) | -0.048 (0.005) |
| Teacher experience: 1-2 years | -0.020 (0.005) | -0.014 (0.004) |
| Teacher experience: 3-5 years | 0.001 (0.004) | 0.000 (0.004) |
| Teacher experience: 6-11 years | 0.003 (0.003) | 0.002 (0.003) |
| Class size | -0.009 (0.000) | -0.007 (0.000) |
| School: fraction FRL | 0.091 (0.015) | 0.065 (0.013) |
| School: fraction white | 0.156 (0.022) | 0.189 (0.019) |
| Grade 4 | 0.116 (0.007) | 0.095 (0.006) |
| Grade 5 | 0.247 (0.012) | 0.198 (0.011) |
| Teacher fixed effects | Yes | Yes |
| School fixed effects | Yes | Yes |
| Year dummies | Yes | Yes |
| Number of classrooms | 122,961 | 122,961 |
| Number of teacher fixed effects | 33,857 | 33,857 |
| Number of school fixed effects | 722 | 722 |

The table shows the coefficients on the observed characteristics from estimating Model 17. The dependent variables are math test scores (specification 1) and reading test scores (specification 2). Test scores are standardized at the year-by-grade level. Analytic standard errors are in parentheses. The model includes both teacher and school fixed effects. School fixed effects are computed for schools with at least 10 in- or out-moves in the sample period; all remaining schools are pooled into the reference category.
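Standardizing at the year-by-grade level, as described in the note, is a within-cell z-score; the column names used here (`year`, `grade`, `raw_score`) are illustrative.

```python
import pandas as pd

def standardize(df, score="raw_score", by=("year", "grade")):
    """Z-score test scores within each year-by-grade cell (mean 0, sd 1)."""
    g = df.groupby(list(by))[score]
    return (df[score] - g.transform("mean")) / g.transform("std")
```

This keeps scores comparable across cohorts and grades, since each cell is rescaled to its own mean and standard deviation.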


Table A.7: Variance decomposition: math

| | (1) All teachers | (2) % of Var(test score) | (3) Movers only | (4) % of Var(test score) |
|---|---|---|---|---|
| Number of schools | 657 | | 657 | |
| Number of teacher effects | 22,299 | | 5,630 | |
| Avg. test score | 0.003 | | 0.017 | |
| Var(test score) | 0.241 | | 0.238 | |
| (I) Fraction explained by variances (main effects) | 0.188 | 77.8% | 0.179 | 75.3% |
| Var(student quality) | 0.127 | 52.5% | 0.126 | 53.2% |
| Var(teacher value-added) | 0.045 | 18.7% | 0.036 | 15.4% |
| Var(teacher experience) | 0.000 | 0.2% | 0.000 | 0.1% |
| 2*Cov(experience, teacher value-added) | 0.000 | 0.1% | 0.000 | 0.1% |
| Var(school effects) | 0.015 | 6.1% | 0.015 | 6.4% |
| Var(school composition) | 0.001 | 0.4% | 0.001 | 0.4% |
| 2*Cov(school effects, school composition) | -0.001 | -0.3% | -0.001 | -0.2% |
| (II) Fraction explained by covariances (sorting) | 0.039 | 16.3% | 0.039 | 16.4% |
| (II.A) Teacher sorting to schools and students | 0.016 | 6.7% | 0.014 | 6.0% |
| 2*Cov(teacher value-added, student quality) | 0.020 | 8.4% | 0.017 | 7.2% |
| 2*Cov(teacher experience, student quality) | 0.002 | 0.7% | 0.001 | 0.4% |
| 2*Cov(teacher value-added, school effects) | -0.006 | -2.6% | -0.004 | -1.8% |
| 2*Cov(teacher experience, school effects) | 0.000 | 0.1% | 0.000 | 0.1% |
| 2*Cov(teacher value-added, school composition) | 0.000 | 0.1% | 0.000 | 0.1% |
| 2*Cov(teacher experience, school composition) | 0.000 | 0.0% | 0.000 | 0.0% |
| (II.B) Student sorting to schools | 0.023 | 9.5% | 0.025 | 10.4% |
| 2*Cov(school effects, student quality) | 0.016 | 6.5% | 0.017 | 7.1% |
| 2*Cov(school composition, student quality) | 0.007 | 3.0% | 0.008 | 3.3% |
| (III) Remaining variance and covariance terms | 0.014 | 5.9% | 0.020 | 8.3% |
| Vars and Covars(other controls) | -0.017 | -7.0% | -0.017 | -7.1% |
| Var(error term) | 0.031 | 12.9% | 0.037 | 15.4% |

The table shows results for the variance decomposition, based on panel regressions (Model 17). The outcomes are math end-of-grade test scores. All specifications control for year and grade dummies. In these specifications, school fixed effects are obtained only for schools with at least 10 moves in total. The remaining schools are pooled into the reference category, and the results are presented for schools with at least 10 moves. The table presents different specifications: Columns (1)-(2) present the results from the original specification for the full sample. Columns (3)-(4) present the results only for the teachers who move at least once during the sample period (“movers”).
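Once the components of Model 17 are estimated, the entries in panels (I)-(III) are variances and twice-covariances of the fitted pieces, each divided by the total test-score variance. A generic sketch (the component names in the dictionary are placeholders, not the paper's variable names):

```python
import numpy as np

def decomposition_shares(components, total_var):
    """components: dict mapping a name to the per-observation fitted values of
    that model component. Returns the Var(.) and 2*Cov(.,.) terms as shares
    of the total test-score variance."""
    names = list(components)
    shares = {}
    for i, a in enumerate(names):
        shares[f"Var({a})"] = np.var(components[a]) / total_var
        for b in names[i + 1:]:
            c = np.cov(components[a], components[b], ddof=0)[0, 1]
            shares[f"2*Cov({a},{b})"] = 2 * c / total_var
    return shares
```

If the components (plus an error term) sum to the outcome, these shares sum to one, which is what allows the table to account for 100% of the test-score variance.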


Table A.8: Variance decomposition: reading

| | (1) All teachers | (2) % of Var(test score) | (3) Movers only | (4) % of Var(test score) |
|---|---|---|---|---|
| Number of schools | 657 | | 657 | |
| Number of teacher effects | 22,299 | | 5,630 | |
| Avg. test score | -0.008 | | 0.000 | |
| Var(test score) | 0.210 | | 0.205 | |
| (I) Fraction explained by variances (main effects) | 0.160 | 76.5% | 0.153 | 74.7% |
| Var(student quality) | 0.124 | 59.0% | 0.123 | 59.9% |
| Var(teacher value-added) | 0.025 | 12.1% | 0.019 | 9.1% |
| Var(teacher experience) | 0.000 | 0.1% | 0.000 | 0.1% |
| 2*Cov(experience, teacher value-added) | 0.000 | 0.1% | 0.000 | 0.1% |
| Var(school effects) | 0.011 | 5.1% | 0.011 | 5.3% |
| Var(school composition) | 0.002 | 0.9% | 0.002 | 0.9% |
| 2*Cov(school effects, school composition) | -0.002 | -0.7% | -0.001 | -0.7% |
| (II) Fraction explained by covariances (sorting) | 0.038 | 18.3% | 0.036 | 17.7% |
| (II.A) Teacher sorting to schools and students | 0.014 | 6.7% | 0.011 | 5.3% |
| 2*Cov(teacher value-added, student quality) | 0.016 | 7.8% | 0.013 | 6.3% |
| 2*Cov(teacher experience, student quality) | 0.001 | 0.4% | 0.001 | 0.3% |
| 2*Cov(teacher value-added, school effects) | -0.004 | -2.0% | -0.003 | -1.5% |
| 2*Cov(teacher experience, school effects) | 0.000 | 0.0% | 0.000 | 0.0% |
| 2*Cov(teacher value-added, school composition) | 0.001 | 0.4% | 0.000 | 0.2% |
| 2*Cov(teacher experience, school composition) | 0.000 | 0.0% | 0.000 | 0.0% |
| (II.B) Student sorting to schools | 0.024 | 11.6% | 0.025 | 12.3% |
| 2*Cov(school effects, student quality) | 0.010 | 4.7% | 0.010 | 4.9% |
| 2*Cov(school composition, student quality) | 0.014 | 6.9% | 0.015 | 7.4% |
| (III) Remaining variance and covariance terms | 0.011 | 5.3% | 0.016 | 7.7% |
| Vars and Covars(other controls) | -0.014 | -6.7% | -0.013 | -6.5% |
| Var(error term) | 0.025 | 11.9% | 0.029 | 14.2% |

The table shows results for the variance decomposition, based on panel regressions (Model 17). The outcomes are reading end-of-grade test scores. All specifications control for year and grade dummies. In these specifications, school fixed effects are obtained only for schools with at least 10 moves in total. The remaining schools are pooled into the reference category, and the results are presented for schools with at least 10 moves. The table presents different specifications: Columns (1)-(2) present the results from the original specification for the full sample. Columns (3)-(4) present the results only for the teachers who move at least once during the sample period (“movers”).


Table A.9: Heterogeneity in quality of schools, teachers, and students across regions

| | (1) All | (2) City, large | (3) City, mid-size | (4) Urban fringe | (5) Town | (6) Rural |
|---|---|---|---|---|---|---|
| (I) Math test scores | | | | | | |
| Avg. test score (outcome) | 0.003 | -0.116 | -0.046 | 0.087 | -0.037 | 0.052 |
| Avg. student quality | -1.286 | -1.375 | -1.324 | -1.210 | -1.351 | -1.248 |
| Avg. teacher value-added | 1.222 | 1.227 | 1.218 | 1.224 | 1.219 | 1.223 |
| Avg. experience effect | -0.007 | -0.008 | -0.007 | -0.007 | -0.006 | -0.006 |
| Avg. school effect | 0.001 | -0.006 | 0.008 | -0.001 | 0.014 | -0.009 |
| Avg. qual. school composition | 0.125 | 0.094 | 0.110 | 0.137 | 0.130 | 0.146 |
| (II) Reading test scores | | | | | | |
| Avg. test score (outcome) | -0.008 | -0.146 | -0.051 | 0.072 | -0.043 | 0.042 |
| Avg. student quality | -1.045 | -1.137 | -1.086 | -0.969 | -1.100 | -1.004 |
| Avg. teacher value-added | 0.893 | 0.880 | 0.894 | 0.890 | 0.897 | 0.899 |
| Avg. experience effect | -0.004 | -0.005 | -0.004 | -0.004 | -0.004 | -0.004 |
| Avg. school effect | 0.014 | 0.021 | 0.031 | 0.008 | 0.019 | -0.008 |
| Avg. qual. school composition | 0.132 | 0.091 | 0.112 | 0.150 | 0.135 | 0.158 |

This table shows the averages of teacher quality, student quality, and school quality across regions, based on Model 17.


Table A.10: Heterogeneity: math (only movers)

| | (1) City, large | (2) City, mid-size | (3) Urban fringe | (4) Town | (5) Rural |
|---|---|---|---|---|---|
| Number of school effects | 63 | 197 | 162 | 89 | 146 |
| Number of teacher effects | 798 | 2,099 | 1,972 | 970 | 1,830 |
| Avg. test score | -0.087 | -0.056 | 0.109 | -0.027 | 0.083 |
| Var(test score) | 0.329 | 0.258 | 0.213 | 0.225 | 0.176 |
| (I) Fraction explained by variances (main effects) | 70.8% | 76.1% | 74.8% | 74.7% | 77.2% |
| Var(student quality) | 51.4% | 56.2% | 50.2% | 52.2% | 49.8% |
| Var(teacher value-added) | 12.0% | 15.0% | 16.4% | 16.3% | 19.2% |
| Var(teacher experience) | 0.1% | 0.1% | 0.2% | 0.1% | 0.1% |
| 2*Cov(experience, teacher value-added) | 0.1% | 0.1% | 0.1% | 0.1% | 0.1% |
| Var(school effects) | 6.4% | 4.6% | 8.1% | 5.8% | 8.5% |
| Var(school composition) | 0.1% | 0.3% | 0.3% | 0.3% | 0.4% |
| 2*Cov(school effects, school composition) | 0.8% | -0.2% | -0.5% | -0.2% | -0.9% |
| (II) Fraction explained by covariances (sorting) | 20.6% | 16.5% | 17.4% | 17.5% | 9.0% |
| (II.A) Teacher sorting to schools and students | 1.3% | 7.4% | 5.0% | 10.0% | 7.3% |
| 2*Cov(teacher value-added, student quality) | 2.3% | 8.0% | 6.7% | 10.4% | 9.9% |
| 2*Cov(teacher experience, student quality) | 0.3% | 0.4% | 0.5% | 0.5% | 0.5% |
| 2*Cov(teacher value-added, school effects) | -1.5% | -1.2% | -2.1% | -1.0% | -3.8% |
| 2*Cov(teacher experience, school effects) | 0.1% | 0.1% | 0.1% | 0.0% | 0.1% |
| 2*Cov(teacher value-added, school composition) | 0.2% | 0.1% | -0.2% | 0.0% | 0.5% |
| 2*Cov(teacher experience, school composition) | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% |
| (II.B) Student sorting to schools | 19.3% | 9.1% | 12.4% | 7.5% | 1.8% |
| 2*Cov(school effects, student quality) | 16.9% | 5.7% | 10.6% | 4.3% | 0.1% |
| 2*Cov(school composition, student quality) | 2.4% | 3.4% | 1.8% | 3.1% | 1.6% |
| (III) Remaining variance and covariance terms | 8.6% | 7.5% | 7.8% | 7.9% | 13.7% |
| Vars and Covars(other controls) | -8.0% | -6.3% | -7.7% | -8.4% | -5.1% |
| Var(error term) | 16.5% | 13.7% | 15.5% | 16.3% | 18.8% |

The table shows results for the variance decomposition, based on panel regressions (Model 17), restricting the sample to high-turnover schools and to teachers who move at least once during the sample period. The outcomes are math end-of-grade test scores. All specifications control for year and grade dummies. In these specifications, school fixed effects are obtained only for schools with at least 10 moves in total. The remaining schools are pooled into the reference category, and the results are presented for schools with at least 10 moves. The results are presented for different types of regions, based on the location of the school.


Table A.11: Heterogeneity: reading (only movers)

| | (1) City, large | (2) City, mid-size | (3) Urban fringe | (4) Town | (5) Rural |
|---|---|---|---|---|---|
| Number of school effects | 63 | 197 | 162 | 89 | 146 |
| Number of teacher effects | 798 | 2,099 | 1,972 | 970 | 1,830 |
| Avg. test score | -0.124 | -0.070 | 0.095 | -0.038 | 0.066 |
| Var(test score) | 0.292 | 0.235 | 0.177 | 0.187 | 0.138 |
| (I) Fraction explained by variances (main effects) | 71.0% | 72.5% | 74.2% | 75.9% | 81.1% |
| Var(student quality) | 57.1% | 59.3% | 58.7% | 60.2% | 61.7% |
| Var(teacher value-added) | 7.5% | 9.0% | 9.2% | 10.2% | 12.0% |
| Var(teacher experience) | 0.0% | 0.0% | 0.1% | 0.1% | 0.1% |
| 2*Cov(experience, teacher value-added) | 0.1% | 0.1% | 0.1% | 0.1% | 0.0% |
| Var(school effects) | 4.8% | 3.7% | 6.2% | 5.0% | 8.4% |
| Var(school composition) | 0.4% | 0.6% | 0.7% | 0.7% | 0.8% |
| 2*Cov(school effects, school composition) | 1.1% | -0.3% | -0.7% | -0.4% | -2.0% |
| (II) Fraction explained by covariances (sorting) | 23.4% | 20.7% | 18.2% | 15.9% | 4.0% |
| (II.A) Teacher sorting to schools and students | 2.9% | 8.6% | 4.2% | 4.7% | 3.7% |
| 2*Cov(teacher value-added, student quality) | 3.8% | 8.6% | 5.2% | 5.9% | 7.1% |
| 2*Cov(teacher experience, student quality) | 0.2% | 0.2% | 0.3% | 0.3% | 0.4% |
| 2*Cov(teacher value-added, school effects) | -1.4% | -0.7% | -1.3% | -1.5% | -4.2% |
| 2*Cov(teacher experience, school effects) | 0.1% | 0.0% | 0.1% | 0.0% | 0.0% |
| 2*Cov(teacher value-added, school composition) | 0.3% | 0.4% | -0.2% | -0.1% | 0.4% |
| 2*Cov(teacher experience, school composition) | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% |
| (II.B) Student sorting to schools | 20.5% | 12.1% | 14.0% | 11.3% | 0.3% |
| 2*Cov(school effects, student quality) | 14.4% | 4.9% | 8.9% | 4.3% | -4.4% |
| 2*Cov(school composition, student quality) | 6.1% | 7.2% | 5.1% | 7.0% | 4.7% |
| (III) Remaining variance and covariance terms | 5.6% | 6.8% | 7.6% | 8.2% | 14.9% |
| Vars and Covars(other controls) | -8.7% | -5.9% | -6.8% | -7.7% | -3.5% |
| Var(error term) | 14.3% | 12.7% | 14.4% | 15.9% | 18.5% |

The table shows results for the variance decomposition, based on panel regressions (Model 17), restricting the sample to high-turnover schools and to teachers who move at least once during the sample period. The outcomes are reading end-of-grade test scores. All specifications control for year and grade dummies. In these specifications, school fixed effects are obtained only for schools with at least 10 moves in total. The remaining schools are pooled into the reference category, and the results are presented for schools with at least 10 moves, and all teachers in these schools during the sample period. The results are presented for different types of regions, based on the location of the school.


Table A.12: Teacher sorting on student quality across and within schools

| | (1) All | (2) City, large | (3) City, mid-size | (4) Urban fringe | (5) Town | (6) Rural |
|---|---|---|---|---|---|---|
| (I) Math test scores | | | | | | |
| Var(test score) | 0.241 | 0.314 | 0.267 | 0.225 | 0.222 | 0.178 |
| 2*Cov(teacher effects, student quality) | 0.020 | 0.015 | 0.023 | 0.018 | 0.023 | 0.020 |
| across schools | 0.013 | 0.010 | 0.014 | 0.010 | 0.017 | 0.014 |
| across classrooms, within schools | 0.007 | 0.005 | 0.009 | 0.008 | 0.007 | 0.005 |
| Percent contribution: | | | | | | |
| 2*Cov(teacher effects, student quality) | 8.4% | 4.8% | 8.8% | 8.0% | 10.4% | 10.9% |
| across schools | 5.4% | 3.2% | 5.4% | 4.4% | 7.5% | 8.1% |
| across classrooms, within schools | 3.0% | 1.6% | 3.4% | 3.6% | 2.9% | 2.8% |
| (II) Reading test scores | | | | | | |
| Var(test score) | 0.210 | 0.277 | 0.245 | 0.189 | 0.187 | 0.141 |
| 2*Cov(teacher effects, student quality) | 0.016 | 0.014 | 0.024 | 0.013 | 0.014 | 0.012 |
| across schools | 0.009 | 0.009 | 0.012 | 0.007 | 0.012 | 0.008 |
| across classrooms, within schools | 0.007 | 0.005 | 0.013 | 0.006 | 0.002 | 0.004 |
| Percent contribution: | | | | | | |
| 2*Cov(teacher effects, student quality) | 7.8% | 5.1% | 10.0% | 6.7% | 7.7% | 8.2% |
| across schools | 4.4% | 3.2% | 4.7% | 3.6% | 6.4% | 5.7% |
| across classrooms, within schools | 3.4% | 2.0% | 5.2% | 3.1% | 1.3% | 2.5% |

The table decomposes the covariance of teacher effects (value-added, estimated as teacher fixed effects $\hat{\mu}_j$ from Model 17) and student quality (estimated as the classroom predicted performance, $X_{ct}'\hat{\beta}$) into a between- and a within-school component, for math test scores (Panel I) and reading test scores (Panel II). The between-school covariance is computed as $2 \cdot \mathrm{Cov}(\hat{\mu}_{j(s)}, X_{ct}'\hat{\beta})$, where $\hat{\mu}_{j(s)}$ is the average teacher effect at teacher $j$'s school $s$, and the within-school covariance is computed as $2 \cdot \mathrm{Cov}(\hat{\mu}_j - \hat{\mu}_{j(s)}, X_{ct}'\hat{\beta})$.
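The between/within split in this table follows from decomposing the teacher effect into its school mean and the deviation from that mean; a sketch with hypothetical column names (`mu_hat`, `x_beta`, `school`):

```python
import numpy as np
import pandas as pd

def covariance_split(df, teacher_fe="mu_hat", class_pred="x_beta", school="school"):
    """Split 2*Cov(teacher effect, classroom predicted performance) into a
    between-school part (school-average teacher effect) and a within-school
    part (deviation of the teacher effect from the school average)."""
    mu_school = df.groupby(school)[teacher_fe].transform("mean")  # mu_hat_{j(s)}
    between = 2 * np.cov(mu_school, df[class_pred], ddof=0)[0, 1]
    within = 2 * np.cov(df[teacher_fe] - mu_school, df[class_pred], ddof=0)[0, 1]
    return between, within
```

By bilinearity of the covariance, the two parts sum exactly to the total 2*Cov term reported in the table.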


Table A.13: Math test score distributions under counter-factual teacher assignment policies

| test score | (1) mean | (3) 25th pctile | (4) 75th pctile | (5) pc75-pc25 | (6) 10th pctile | (7) 90th pctile | (8) pc90-pc10 |
|---|---|---|---|---|---|---|---|
| (I) Original sorting | | | | | | | |
| original | 0.021 | -0.298 | 0.339 | 0.638 | -0.595 | 0.624 | 1.219 |
| (II) Simulation 1: Random within schools | | | | | | | |
| simulated | 0.021 | -0.274 | 0.314 | 0.588 | -0.550 | 0.581 | 1.131 |
| diff to orig. | 0.000 | 0.024 | -0.026 | -0.050 | 0.046 | -0.043 | -0.089 |
| relative change | - | - | - | -8% | - | - | -7% |
| (III) Simulation 2: Random within districts | | | | | | | |
| simulated | 0.021 | -0.259 | 0.293 | 0.552 | -0.511 | 0.544 | 1.055 |
| diff to orig. | 0.000 | 0.039 | -0.047 | -0.086 | 0.084 | -0.080 | -0.164 |
| relative change | - | - | - | -13% | - | - | -13% |
| (IV) Simulation 3: Random within state | | | | | | | |
| simulated | 0.021 | -0.250 | 0.284 | 0.534 | -0.497 | 0.529 | 1.026 |
| diff to orig. | 0.000 | 0.049 | -0.055 | -0.104 | 0.098 | -0.095 | -0.193 |
| relative change | - | - | - | -16% | - | - | -16% |

This table shows simulation results for math test scores. I simulate the impact of different counter-factual teacher assignments on the distribution of test scores across classrooms. I consider three counterfactual assignments: random allocation of teachers within schools (panel II), random allocation of teachers within school districts (panel III), and random allocation of teachers within the whole state of North Carolina (panel IV). Panel I presents summary statistics of the original distribution. The simulations are based on 100 random teacher draws (without replacement), and the results are averaged across all random draws. For details on the procedure, see also Section 5.5.
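The counterfactual reassignment described in the note can be sketched as repeated permutation of teacher value-added across classrooms within a grouping level (school, district, or state); the classroom-level frame and its column names here are hypothetical.

```python
import numpy as np
import pandas as pd

def simulated_spread(classrooms, group="district", effect="teacher_va",
                     other="other_components", draws=100, seed=0):
    """Randomly permute teacher value-added across classrooms within each
    group (without replacement), recompute classroom mean scores, and average
    the 10th-90th percentile spread over the draws."""
    rng = np.random.default_rng(seed)
    spreads = []
    for _ in range(draws):
        shuffled = classrooms.groupby(group)[effect].transform(
            lambda x: rng.permutation(x.values))
        scores = classrooms[other].values + shuffled.values
        p10, p90 = np.percentile(scores, [10, 90])
        spreads.append(p90 - p10)
    return float(np.mean(spreads))
```

Broadening the permutation group from school to district to state corresponds to panels (II)-(IV): the wider the pool, the more sorting is undone and the more the cross-classroom spread shrinks.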


Table A.14: Reading test score distributions under counter-factual teacher assignment policies

| test score | (1) mean | (3) 25th pctile | (4) 75th pctile | (5) pc75-pc25 | (6) 10th pctile | (7) 90th pctile | (8) pc90-pc10 |
|---|---|---|---|---|---|---|---|
| (I) Original sorting | | | | | | | |
| original | 0.011 | -0.278 | 0.310 | 0.588 | -0.567 | 0.563 | 1.130 |
| (II) Simulation 1: Random within schools | | | | | | | |
| simulated | 0.011 | -0.266 | 0.293 | 0.559 | -0.536 | 0.541 | 1.077 |
| diff to orig. | 0.000 | 0.012 | -0.016 | -0.029 | 0.031 | -0.022 | -0.053 |
| relative change | - | - | - | -5% | - | - | -5% |
| (III) Simulation 2: Random within districts | | | | | | | |
| simulated | 0.011 | -0.255 | 0.280 | 0.535 | -0.513 | 0.518 | 1.032 |
| diff to orig. | 0.000 | 0.023 | -0.030 | -0.053 | 0.054 | -0.045 | -0.098 |
| relative change | - | - | - | -9% | - | - | -9% |
| (IV) Simulation 3: Random within state | | | | | | | |
| simulated | 0.011 | -0.247 | 0.271 | 0.518 | -0.499 | 0.503 | 1.002 |
| diff to orig. | 0.000 | 0.031 | -0.039 | -0.070 | 0.068 | -0.060 | -0.128 |
| relative change | - | - | - | -12% | - | - | -11% |

This table shows simulation results for reading test scores. I simulate the impact of different counter-factual teacher assignments on the distribution of test scores across classrooms. I consider three counterfactual assignments: random allocation of teachers within schools (panel II), random allocation of teachers within school districts (panel III), and random allocation of teachers within the whole state of North Carolina (panel IV). Panel I presents summary statistics of the original distribution. The simulations are based on 100 random teacher draws (without replacement), and the results are averaged across all random draws. For details on the procedure, see also Section 5.5.


UniCredit & Universities
Knight of Labor Ugo Foscolo Foundation
Piazza Gae Aulenti, UniCredit Tower - Torre A
20154 Milan
Italy

Giannantonio De Roni - Secretary General [email protected]

Annalisa Aleati - Scientific Director [email protected]

Rosita Ierardi [email protected]

Info at: [email protected] www.unicreditanduniversities.eu
