11 June 2013

Tema 2 Oppgavegiveren
Meeting user needs cost-effectively
Toward increased coherence in the International Trade by Enterprise Characteristics (TEC) framework
Nordic Statistical Association 2013

Jon Mortensen ([email protected]), Head of section, External Economy, Statistics Denmark
Søren Burman ([email protected]), Head of section, External Economy, Statistics Denmark

Abstract
In recent years there has been a growing demand from users for disaggregated data on the characteristics of internationally trading enterprises. In response, Eurostat's Trade by Enterprise Characteristics (TEC) database provides data on international trade at the enterprise level by linking International Trade in Goods Statistics (ITGS) with the Statistical Business Register (SBR). Despite a high degree of coherence, several methodological challenges regarding reliable and complete integration of data at the micro level result in unmatched data. Also, TEC ignores an increasingly important aspect of international trade, namely trade in services. The results discussed in this paper confirm that user demands for new types of data can be met cost-effectively by exploiting synergies between already existing statistics through micro data linking. However, the paper also shows that incoherence and methodological challenges need to be carefully addressed to secure robustness.

1. Introduction
In recent years there has been a growing demand from users for disaggregated data on the characteristics of internationally trading enterprises. In response, OECD and Eurostat have jointly launched the Trade by Enterprise Characteristics (TEC) database, which provides data on international trade at the enterprise level by linking International Trade in Goods Statistics (ITGS) with the Statistical Business Register (SBR). Despite a high degree of coherence, several methodological challenges regarding reliable and complete integration of data at the micro level result in unmatched data.

Two root causes of incoherence between the unit structure in the Danish ITGS and the unit structure in the Danish SBR make matching populations between the two problematic. The first root cause is the treatment of complex enterprises, i.e. enterprises comprising more than one legal unit. Complex enterprises often report trade data on a different legal unit than the one they use to report other data, such as number of employees, and thus compromise the quality of the match between the ITGS and the SBR. The second root cause concerns the allocation of estimated trade (for non-response and trade below threshold) to individual enterprises. This paper analyses the scope of these root causes and explores possible solutions.

Also, by focusing solely on goods, TEC ignores an increasingly important aspect of international trade, namely trade in services. One of the major challenges in adapting the existing TEC framework from the ITGS to the International Trade in Services Statistics (ITSS) is the data collection method. In Denmark, ITGS data are collected by the customs authorities and through the Intrastat survey, so almost all trade in goods is reported directly by the enterprises engaged in trade. The ITSS, by contrast, is a survey-based statistic in which only a fraction of the enterprises report their trade directly. Linking the directly reported trade is quite straightforward, since the ITSS survey and the SBR use the same unit for enterprises; the real challenge is linking the estimated trade. This paper looks into the challenge of linking the Danish ITSS data with the SBR in order to mimic the tables in the TEC framework.

2. Method
The paper presents the results of two EU-funded studies: one on the existing TEC statistics and one on a potential future TEC statistic for services (STEC). The methods used in each study are described below.

2.1 Increased coherence in the TEC statistic
To establish the scope of the issue, "complex enterprises" were identified by use of the superior number (UHO system), which is unique to the Danish ITGS, then matched with the SBR and subsequently analyzed. The UHO system provides an automated method for identifying complex enterprises and tracking them over time: legal units that are financially interrelated are linked through the allocation of a superior number (UHO-number). December 2010 was chosen as the reference month, partly because it makes the results comparable with a preceding study and partly because it avoids problems related to changes in populations over time (only one population, that of December 2010, is analyzed, instead of the 12 that an analysis of a whole year would require). This does not affect the relevance or robustness of the analysis and results.

2.2 Adapting the TEC framework to the ITSS
In Denmark the ITSS is based on an ITS survey with reports from enterprises, and on other sources that cannot be associated directly with an enterprise. This limits the linking of the ITSS with the SBR to the survey, where enterprises can be identified, which covers a little more than 80 pct. of total trade in services. The ITS survey consists of roughly 1500 PSIs and is composed of a selection of the largest service traders and a random sample of enterprises drawn from a population stratified by size and activity, so that each sampled enterprise represents a number of similar enterprises from the same stratum.

In order to link the estimated trade with other statistics at the micro level, it has to be distributed to the represented enterprises. A straightforward method is to distribute the estimated trade among the enterprises represented in each stratum using an estimator from the SBR, such as employees or turnover, making the trade proportional to that variable. One should bear in mind that the ITSS sample is optimized to give the best estimate of international trade in services (ITS) as a whole. Some strata are therefore more detailed than others, i.e. some sampled enterprises represent more enterprises than others. This is especially evident in strata with little trade in services and small enterprises, where a large number of small enterprises are represented by a few sampled enterprises.

A second issue is vintage related: the ITSS population is a snapshot of the enterprises that were active in 2007 and had a turnover above a threshold of 5 000 mill. DKK. A number of enterprises that were active in 2007 are no longer active or have crossed the threshold, and new enterprises may have entered. Many enterprises that have changed since 2007 will therefore not be represented by the ITSS population. Over time the mismatch between the enterprises in the ITSS population and the enterprises in the SBR will increase until the ITSS population is updated.
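The proportional distribution described above can be illustrated with a minimal sketch. It assumes that each represented enterprise carries a stratum code and an employee count from the SBR; the function name, record layout and figures are hypothetical and only serve to show the arithmetic.

```python
from collections import defaultdict

def distribute_stratum_trade(enterprises, stratum_trade):
    """Split each stratum's estimated trade across its enterprises,
    proportionally to the number of employees (the chosen estimator)."""
    # Total employees per stratum
    employees_by_stratum = defaultdict(float)
    for ent in enterprises:
        employees_by_stratum[ent["stratum"]] += ent["employees"]

    allocated = {}
    for ent in enterprises:
        total_emp = employees_by_stratum[ent["stratum"]]
        # Strata without any employees receive no allocation in this sketch
        share = ent["employees"] / total_emp if total_emp else 0.0
        allocated[ent["id"]] = stratum_trade.get(ent["stratum"], 0.0) * share
    return allocated

# Hypothetical data: two strata with estimated trade of 400 and 50
enterprises = [
    {"id": "E1", "stratum": "S1", "employees": 10},
    {"id": "E2", "stratum": "S1", "employees": 30},
    {"id": "E3", "stratum": "S2", "employees": 5},
]
stratum_trade = {"S1": 400.0, "S2": 50.0}
allocation = distribute_stratum_trade(enterprises, stratum_trade)
```

Turnover could be used instead of employees by swapping the variable; the allocation always sums to the stratum total as long as the stratum has a non-zero estimator total.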

3. Results
3.1 Increased coherence in the TEC statistic
As shown in Table 1 in the annex, 439 complex enterprises were identified in the ITGS. The main findings of an analysis of these cases were:
- A majority of the identified complex enterprises contain only a few legal units, and more than half (291 of 439) contain only two.
- Frequently, complex enterprises report trade on a legal unit with no or very few employees (and often other legal units within the same complex enterprise have high numbers of employees but no trade).

The 439 complex enterprises account for a substantial share of total trade: 25.7 % of exports and 22.2 % of imports.

Currently, only about 10 pct. of the total sum of estimated trade is allocated to specific UHO-numbers, i.e. the bulk of estimated trade currently cannot be allocated to the enterprise level. It must be emphasised, though, that the estimation process as such is based on individual UHO-numbers. It is the subsequent allocation of estimated trade to specific countries and goods that impedes the allocation of such trade to individual enterprises.

3.2 Adapting the TEC framework to the ITSS
When the ITSS population, which is a snapshot of the SBR for 2007, is linked with the SBR for 2010, the match percentage of the ITSS population is 86.2 pct. A total of 3710 enterprises that were active in 2007 were not in the SBR for 2010. This is probably due to the recent debt crisis, during which many enterprises have defaulted or been merged with other enterprises. The number of enterprises not represented by the ITSS population is 4501, which includes new and merged enterprises and smaller enterprises that have crossed the threshold value.

Matching ITSS population with SBR

Enterprises not found in SBR 2010                             3 710
Total number of enterprises in 2007 snapshot                 40 653
Enterprises in SBR 2010 not represented by ITSS population    4 501
Enterprises matching SBR 2010 and ITSS population            35 063
Share of total enterprises in 2007 snapshot                  86.2 %

For the 35063 enterprises that have a match in the SBR, a micro data linkage can be performed, which means that variables such as activity, employees and turnover can be matched with the trade data. In this exercise the estimated trade for the represented enterprises has been distributed using employees as estimator. If no data on employees are available in the SBR, the average number of employees for the given stratum is used instead. In Table 2 in the annex the trade for the enterprises in the ITSS population has been aggregated at NACE division level. The amount of trade posted under unknown activity reflects the trade by the 3710 enterprises not found in the SBR 2010, which is 1.3 pct. of the total import and export in the ITSS.
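The fallback for missing employee data mentioned above could be sketched as follows. This is only an illustration: it assumes that the stratum average is taken over the enterprises in the same stratum that do have an employee count, which the text does not specify, and all names and figures are hypothetical.

```python
def fill_missing_employees(enterprises):
    """Replace missing employee counts (None) with the stratum average.

    Assumption: the average is computed over enterprises in the same
    stratum for which an employee count is available.
    """
    sums, counts = {}, {}
    for ent in enterprises:
        if ent["employees"] is not None:
            s = ent["stratum"]
            sums[s] = sums.get(s, 0.0) + ent["employees"]
            counts[s] = counts.get(s, 0) + 1

    filled = []
    for ent in enterprises:
        emp = ent["employees"]
        if emp is None:
            s = ent["stratum"]
            # A stratum with no known counts at all is left as None
            emp = sums[s] / counts[s] if counts.get(s) else None
        filled.append({**ent, "employees": emp})
    return filled

# Hypothetical stratum where one enterprise lacks an employee count
sample = [
    {"id": "E1", "stratum": "S1", "employees": 10},
    {"id": "E2", "stratum": "S1", "employees": 30},
    {"id": "E3", "stratum": "S1", "employees": None},
]
filled = fill_missing_employees(sample)
```

In this example the third enterprise is assigned the stratum average of 20 employees before the estimated trade is distributed.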

4. Discussions
4.1 Increased coherence in the TEC statistic
Currently a substantial amount of trade cannot be ideally matched between the ITGS and the SBR. Thus, to ensure that trade is allocated to the correct (or best possible) enterprise in the SBR, a common unit structure between the ITGS and the SBR is needed. For all practical purposes the legal unit appears to provide the most appropriate common unit. This points to the need for trade to be robustly allocated to legal units (in addition to UHO-numbers) in the ITGS. As mentioned, two issues make this problematic:
- Treatment of complex enterprises
- Allocation of estimated trade to individual enterprises

In December 2010, 439 complex enterprises with more than one legal unit, comprising a substantial amount of total trade, could be identified (see Table 1). As also shown, the SBR's handling of complex enterprises is not fully sufficient (in terms of identified enterprises) for this purpose, while the UHO system provides an automated, although not perfect, way to identify complex enterprises in the ITGS. The high number of complex enterprises combined with a relatively low number of "very complex enterprises" (more than half of the complex enterprises comprised only two legal units) suggests that a two-string approach, combining an automated process with a manual "check-and-adjust" follow-up on the most complicated enterprises, is appropriate. The automated process in this two-string approach could follow these steps:

Step 1: Establish the link between traders and enterprises on a monthly basis.
Step 2: Use the one-to-one and many-to-one linkages to calculate the median trade M (import and export taken together) per person employed per activity group.
Step 3: For all traders in one-to-one or many-to-one linkages between traders and enterprises, allocate monthly trade to the one and only found enterprise.
Step 4: For all enterprises in one-to-many linkages between traders and enterprises, calculate an estimated trade E by multiplying M by the number of persons employed by the enterprise.
Step 5: For all traders in one-to-many linkages between traders and enterprises, allocate trade proportionally to E over all found enterprises.
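The five steps can be sketched in code as below. This is a simplified illustration under stated assumptions, not the procedure actually implemented: trade is a single monthly figure per trader (import and export taken together), M is taken as the median across enterprises of trade per person employed, enterprises without employees are left out of the median, and a trader's trade is split equally when no E can be computed. The function name and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import median

def allocate_trade(links, trade, employees, activity):
    """links: (trader, enterprise) pairs; trade: trader -> monthly trade;
    employees / activity: enterprise -> persons employed / activity group."""
    # Step 1: build the trader -> enterprises linkage for the month.
    ents_of = defaultdict(list)
    for trader, ent in links:
        ents_of[trader].append(ent)

    allocated = defaultdict(float)
    # Step 3: traders with exactly one enterprise (one-to-one or
    # many-to-one) pass all their trade to that enterprise.
    for trader, ents in ents_of.items():
        if len(ents) == 1:
            allocated[ents[0]] += trade.get(trader, 0.0)

    # Step 2: median trade per person employed (M) per activity group,
    # computed from the unambiguous linkages only.
    ratios = defaultdict(list)
    for ent, value in allocated.items():
        if employees.get(ent, 0) > 0:
            ratios[activity[ent]].append(value / employees[ent])
    M = {group: median(vals) for group, vals in ratios.items()}

    # Steps 4 and 5: for one-to-many traders, E = M * persons employed,
    # then the trader's trade is split proportionally to E.
    for trader, ents in ents_of.items():
        if len(ents) < 2:
            continue
        E = {e: M.get(activity[e], 0.0) * employees.get(e, 0) for e in ents}
        total_E = sum(E.values())
        for e in ents:
            # Assumption: equal split when no E can be computed.
            share = E[e] / total_E if total_E else 1.0 / len(ents)
            allocated[e] += trade.get(trader, 0.0) * share
    return dict(allocated)

# Hypothetical month: T1 -> A (one-to-one), T2 and T3 -> B (many-to-one),
# T4 -> C and D (one-to-many).
links = [("T1", "A"), ("T2", "B"), ("T3", "B"), ("T4", "C"), ("T4", "D")]
trade = {"T1": 100.0, "T2": 40.0, "T3": 20.0, "T4": 90.0}
employees = {"A": 10, "B": 20, "C": 1, "D": 2}
activity = {"A": "G", "B": "G", "C": "G", "D": "G"}
result = allocate_trade(links, trade, employees, activity)
```

In the example, enterprise D has twice as many persons employed as C and therefore receives twice as much of trader T4's trade, while the total allocated trade equals the total reported trade.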

On completion of the automated process, a pre-identified number of the "very complex enterprises" could be manually assessed and, if necessary, adjusted. Allocation of all estimated trade to the enterprise level should be secured through an improvement of the current system for allocating estimated trade to specific countries and goods. If all estimated trade is allocated to specific UHO-numbers, it can then be allocated further to legal units by the above-mentioned automated process.

4.2 Adapting the TEC framework to the ITSS
When adapting the TEC framework to the ITSS, two issues have to be discussed: 1. the representativeness of the population and 2. service trade not reported directly by an enterprise.

The first issue, having a snapshot of the population at a given point in time, can be defended if one accepts the assumption that entries to and exits from the total population are roughly equal. That is, if an enterprise defaults or is merged, an equivalent enterprise will enter the market to fill the gap, so that the old population represents the total trade of the new population even though it does not represent the enterprises active in the economy. This assumption rests on a free market with full competition and no entry costs. It might not be very plausible in times of above-average fluctuations in the economy, and it also presents a problem if more detailed information on the enterprises, such as geography and age, is desired.

The second issue is divided into two areas. The first is service trade that is not collected by the ITSS survey, such as the travel account and public services. It is very difficult to match with the SBR, since this trade is collected at an aggregated level or is not performed by an enterprise. The other area is the trade of the enterprises represented by the ITSS survey. In this exercise the estimated trade was distributed among the represented enterprises by number of employees, giving enterprises with many employees a larger share of the estimated trade for the given stratum. This has the obvious drawback of giving all enterprises with at least one employee some trade, which means that there are no non-traders in the population. Other surveys indicate that this is clearly not true.

Another, more serious drawback is that the estimated trade inherits the variance of the estimator, in this case the number of employees. This makes the data unsuitable for calculating, for example, export intensities. Because the correlation between employees and turnover is quite high, and because the export of services is distributed by employees, the export intensities of the represented enterprises would all be at around the same level and would not represent the actual export intensity of each enterprise. The feasibility of using estimated trade therefore depends on the correlation between the estimator and trade in services, but also on the correlation between the estimator and the other variables of interest in the TEC framework, i.e. activity, size and foreign ownership.
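The export intensity problem can be illustrated with a small numerical sketch. The figures are invented for illustration; export intensity is taken here as export divided by turnover, and turnover is chosen to be almost proportional to employees to mimic the high correlation mentioned above.

```python
# Hypothetical stratum: enterprise -> (employees, turnover)
stratum = {"E1": (10, 1000.0), "E2": (20, 2100.0), "E3": (40, 3900.0)}
stratum_export = 700.0  # estimated export of services for the stratum

total_emp = sum(emp for emp, _ in stratum.values())

intensities = {}
for ent, (emp, turnover) in stratum.items():
    # Estimated export is distributed proportionally to employees ...
    allocated_export = stratum_export * emp / total_emp
    # ... so export per employee is identical for every enterprise, and
    # the intensity only varies with turnover per employee.
    intensities[ent] = allocated_export / turnover

spread = max(intensities.values()) - min(intensities.values())
```

All three enterprises end up with an export intensity close to 0.1, regardless of how much each of them actually exports, which is why the distributed data say little about the intensity of an individual enterprise.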

5. Conclusion
Coherence in the Danish TEC statistics, and in trade micro data linking more generally, would be greatly improved if all trade in goods were robustly allocated to the level of the legal unit instead of just the UHO level. Currently, this is not the case, but a possible solution is outlined above. Adapting the TEC framework to the ITSS is not straightforward due to the methodological differences in the collection of data. However, it is possible to shape the ITSS data into some of the TEC tables if certain bold assumptions about representativeness are made and if the estimator for the distribution of the estimated trade is suitable for the table in question. The results discussed in this paper confirm that user demands for new types of data can be met cost-effectively by exploiting synergies between already existing statistics through micro data linking. However, the paper also shows that incoherence and methodological challenges need to be carefully addressed to secure robustness.


Annex 1
Table 1. "Complex enterprises" in the ITGS

Legal units: legal units in complex enterprise. Number: number of complex enterprises ("UHO-groups"). (a) Share in pct. of complex enterprises' trade. (b) Share in pct. of total trade.

Legal units   Number   Export (a)   Import (a)   Export (b)   Import (b)
2             291      16           20           4.1          4.4
3             94       16.1         18.4         4.1          4.1
4             41       4.3          16.6         1.1          3.7
5             24       33.1         9.1          8.5          2
6             12       3.3          4.9          0.8          1.1
7             9        16.3         6.1          4.2          1.4
8             5        0.4          4.9          0.1          1.1
9             6        2.3          3.4          0.6          0.8
10            4        0            0.5          0            0.1
12            3        0.1          0.2          0            0
13            1        0            0            0            0
14            3        0            0.1          0            0
15            1        0            0            0            0
17            1        0            6.6          0            1.5
18            1        0            0            0            0
20            1        0            0.1          0            0
21            1        0            0            0            0
22            2        0.6          0.1          0.2          0
30            1        0.2          0.3          0            0.1
37            2        0            0            0            0
40            1        0            0            0            0
47            1        0            0            0            0
49            1        7.2          8.7          1.9          1.9
51            1        0            0.1          0            0
52            1        0.1          0            0            0
Total: 1 406  439      100          100          25.7         22.2

Table 2. ITSS population: Trade by activity for 2010 (1 000 DKK)

NACE rev. 2 Division                                                          Import       Export
A  Agriculture, forestry and fishing                                         720 076    1 149 996
B  Mining and quarrying                                                    2 127 235    1 075 557
C  Manufacturing                                                          28 099 256   29 167 990
D  Electricity, gas, steam and air conditioning supply                     1 830 779    2 025 810
E  Water supply; sewerage, waste management and remediation activities       102 853      250 789
F  Construction                                                              679 574      373 834
G  Wholesale and retail trade; repair of motor vehicles and motorcycles   12 326 885   21 171 280
H  Transportation and storage                                            135 667 463  193 802 038
I  Accommodation and food service activities                                 523 956            C
J  Information and communication                                          15 613 411   15 137 626
K  Financial and insurance activities                                      9 668 767    6 386 485
L  Real estate activities                                                  1 073 651      518 929
M  Professional, scientific and technical activities                       8 843 855   13 190 761
N  Administrative and support service activities                           9 083 828    5 571 938
O  Public administration and defence; compulsory social security           2 739 659    1 345 873
P  Education                                                                 475 920      221 081
Q  Human health and social work activities                                    25 798      274 022
R  Arts, entertainment and recreation                                        368 794      303 333
S  Other service activities                                                  253 755      303 107
U  Activities of extraterritorial organisations and bodies                         C            C
   Unknown activity                                                        3 900 061    4 496 905