Modernising the data collection of the LFS

Author: Gregory Wells
COMMISSION OF THE EUROPEAN COMMUNITIES EUROSTAT Directorate F: Social and information society statistics Unit F-2: Labour market

6th Workshop on LFS methodology, Wiesbaden, 12-13 May 2011

Modernising the data collection of the LFS
Johan van der Valk (Eurostat)
Subject A: Data collection strategy

Summary

Eurostat has launched a review of the legal basis of the LFS. One crucial element of this process is to establish whether there is a need to modernise the LFS, including its data collection. The review will study how to improve the quality of the LFS data and how to make the data collection more efficient. A sensible strategy seems to be to identify good practices based on the experiences of the last decade and to make more use of modern methods for sampling, data collection and data processing. This presentation puts forward some ideas, with a focus on issues related to data collection.

1. Introduction

The current system of regulations has enabled the LFS to become a continuous quarterly survey in all Member States and has established it as the main source of statistical information on most labour market issues. After more than ten years of operation under the current system, the occasion should be seized to take stock and assess the experiences with the current legal basis. In addition, user needs have changed and grown over time, as demonstrated by the Europe 2020 strategy, the Commission's 'GDP and beyond' initiative, the Stiglitz-Sen-Fitoussi report and the migration statistics mainstreaming framework. Furthermore, new strategic ESS initiatives need to be taken into account. Commission Communication 404/2009 on the 'production method of EU statistics: a vision for the next decade' makes a strong case for changing the production of statistics based on new insights and inspired by new technologies.

For these reasons a thorough review of the LFS framework is necessary. The new legal basis should reflect a modernisation of the LFS. This paper presents some ingredients of the LFS of the future for discussion, with a focus on data collection issues. All ideas presented here are preliminary and still need to be thoroughly assessed in the coming period.

2. Basic strategy

Users of labour market statistics on the one hand and producers on the other arguably have somewhat different priorities for improving the LFS. Users want, first of all, more data and better quality data. For the producers, the NSIs, a major issue is to increase the efficiency of the LFS. The current EU LFS data set is already considered large, the precision requirements are seen as demanding and the costs of data collection are considered high. Because of increasing budgetary pressure, NSIs would prefer the same quantity of output at lower cost, or less EU output if this would require fewer resources. Quality of output is also of great concern for NSIs. For instance, the trend of declining response rates and the suspicion of increasingly selective response are seriously undermining trust in the LFS as the source for a sensitive indicator like the unemployment rate. This forces NSIs to take action to stop this trend.

To strike the right balance between users’ needs and producers’ concerns, an appropriate strategy has to be chosen. The challenge is to expand the quantity and increase the quality of the LFS output at lower costs of data collection and data processing. The objective of the modernisation of the LFS can therefore be formulated as follows: to increase the scope and improve the quality of the labour market data while producing the output more efficiently. This can be achieved by using effective survey designs with advanced data collection techniques.

3. Modern data collection strategies

Two basic elements of the modernisation of the LFS data collection are the use of advanced computer assisted data collection methods and effective survey designs with sample sizes that are not unnecessarily large. Inevitably this makes the survey design more complex. In order to limit the costs of managing the survey and processing the data, a third potential element is to introduce a system of modularity. These three points are presented in more detail below.

3.1. Computer assisted data collection

Advanced data collection methods are by definition computer assisted. They can be face-to-face (CAPI), by telephone (CATI) or web based (CAWI). Using paper questionnaires (PAPI) is not an option, for several reasons: the limitations of paper questionnaires, the lack of built-in quality checks, the processing costs and the loss of timeliness. Using only computer assisted data collection modes increases the quality of the output and the efficiency of both data collection and data processing. Countries that currently use PAPI as their only mode or as one of the available modes should change to computer assisted survey modes. According to the LFS quality report, 13 countries still used PAPI in 2009. These countries should be assisted in the transition to a fully computer assisted data collection system.

At the moment several countries are developing web based or Internet data collection for social surveys. There are two main reasons behind this innovation. The first is that Internet data collection is cheaper than traditional methods. The second is that many countries face decreasing response rates, and Internet data collection could help to stop this trend. A further reason, related to the second, is that the public asks for it. An increasing number of persons have no problem participating in a survey as such, but do not like an interviewer at their door or on the telephone. They prefer to fill out the questionnaire when it is convenient for them. These drivers will probably gain strength in the near future, accelerating the process.

Internet data collection is still under development. Little is known about the dos and don'ts of web questionnaire design and application. For instance, research shows that the layout on the screen, the options and the buttons all have a strong influence on the results. The web questionnaires currently in use for business surveys are often more or less a replica of the paper forms. For household surveys in particular this is probably not the optimal model. Furthermore, several tools, such as those for coding economic activity or occupation, cannot be translated directly into a web version. The potential of web questionnaires for coding and checking is nevertheless evident, but needs to be explored further.

The process of introducing web based data collection will benefit from a coordinated approach at EU level. Programming a web questionnaire is complicated and requires substantial resources to develop and test. On top of that, good performance within a secure IT environment has to be guaranteed, which is a continuous task. In addition, it is not evident how to introduce CAWI in a multi-mode design. What are the benefits and limitations of adding this mode to the current ones? What is the quality of the data? Which estimation method should be applied? All these questions have to be answered. Member States have started to develop tools, applications and procedures. It would be efficient to collaborate and to share the experiences of countries that have already started.

Eurostat will stimulate active collaboration among the Member States by launching an ESSnet project, 'Data collection for social surveys using multiple modes'. The project is envisaged to run over a three-year period starting in 2012. Its main objectives are the efficient development of new data collection tools in the ESS through the exchange and sharing of experiences, and early harmonisation by establishing best practices. The project could also indicate whether common development and maintenance of data collection tools is needed. The Labour Force Survey (LFS) will be used to analyse challenges and potential pitfalls. This survey was selected for two reasons. First, it is a large scale survey with a high level of harmonisation among the Member States, so the potential cost savings are substantial. Second, the LFS is already designed as a mixed mode survey in many countries, which allows the smooth introduction of a new data collection mode.

3.2.

Effective data collection designs

An important element of effective data collection in the case of the LFS is to use a rotating panel design with substantial overlap between quarters and years. A rotating panel design reduces the volatility of the time series, which is essential for indicators like the unemployment rate. Substantial overlap between quarters and years is necessary, firstly, to ensure the effectiveness of the panel design and, secondly, to allow the production of statistics on labour dynamics, which is one of the main user needs. Currently, most countries have such a panel design in order to accommodate the users, and the rotational patterns are already quite similar. An issue to discuss is to what extent the rotational patterns should be further harmonised. Whether less diversity is desirable has to be decided on the basis of output requirements.

At the moment the implementing LFS regulation distinguishes between annual and quarterly variables. This allows countries to use subsamples for the annual variables by including them in only one or two waves, the so-called wave approach. This is evidently efficient, since variables that are not published quarterly do not require large samples. This system should be extended. Currently only a limited number of countries use the wave approach. More, if not all, countries should seriously consider it, and the new legal basis should facilitate this more than the current one does. The list of LFS variables should be carefully reviewed with respect to the distinction between quarterly and structural (annual) variables, with a view to moving more of the variables that are not needed to assess labour force status to structural status. This should make the wave approach more attractive as a means of reducing interview costs. Furthermore, experiences of how to apply this method in practice should be shared between Member States.

A second way of making subsampling more attractive is to lower the frequency of some variables to less than annual. Many aspects of the labour market change slowly, which makes defining multi-annual variables a valid option.
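The overlap that a rotating panel produces can be made concrete with a small calculation. The sketch below is only an illustration: the two rotation patterns are chosen as examples and are not prescribed by this paper, the function name `overlap` is hypothetical, and the calculation assumes that one rotation group of equal size enters every quarter and that there is no attrition.

```python
from fractions import Fraction


def overlap(pattern: str, lag: int) -> Fraction:
    """Share of one quarter's sample that is interviewed again `lag` quarters later.

    `pattern` describes the life of one rotation group, one character per
    quarter since it entered the sample: "1" = interviewed, "0" = resting.
    Simplifying assumptions: a new group of equal size enters every quarter
    and nobody drops out.
    """
    active = [i for i, c in enumerate(pattern) if c == "1"]
    retained = [
        i for i in active
        if i + lag < len(pattern) and pattern[i + lag] == "1"
    ]
    return Fraction(len(retained), len(active))


# Illustrative patterns (assumptions, not taken from the paper).
PATTERNS = {
    "2-(2)-2": "110011",       # in for 2 quarters, out for 2, in for 2
    "5 consecutive": "11111",  # in for 5 consecutive quarters
}

for name, pattern in PATTERNS.items():
    print(
        name,
        "quarter-on-quarter:", overlap(pattern, 1),
        "year-on-year:", overlap(pattern, 4),
    )
```

Under these assumptions the first pattern gives the same overlap between consecutive quarters and between the same quarters of consecutive years, while the second trades year-on-year overlap for more quarter-on-quarter overlap. This is the kind of trade-off that the output requirements would have to settle.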


Sample sizes are determined by the precision needed for the output indicators. The current precision requirements in the LFS regulation need clarification before they can be translated into required sample sizes. Changes could go beyond clarification, however. The revision could, for instance, include requirements for subpopulations other than the total unemployed. Furthermore, the differentiation of the requirements by the size of a country's population also needs to be critically assessed.

Another, less straightforward, element of effective data collection is to reconsider the current practice of having the household as the primary unit of observation for all waves of the LFS. Collecting information on all persons in the household is efficient with face-to-face data collection modes and sampling frames of addresses. A problem with the face-to-face (CAPI) mode is that it is relatively costly. For this reason other computer assisted modes, such as telephone interviewing (CATI), are used extensively. Recently, as mentioned above, even web based data collection (CAWI) is being considered for the LFS. The latter two modes involve no travelling costs. This weakens the efficiency argument that, if substantial costs have to be incurred to reach a household, it is better to collect as much data as possible. Another difference between CAPI on the one hand and CATI or CAWI on the other is the duration of the interview. For the latter two modes, short interviews are essential for collecting high quality data. With CAPI it is not uncommon for respondents to invite interviewers into their home, a process that takes several minutes. Subsequently carrying out an interview of only a few minutes is somewhat awkward. This is a second reason why interviewing all persons in the household is appropriate with CAPI, unlike with CATI or CAWI.
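To illustrate the earlier point that sample sizes follow from precision requirements, the sketch below converts a maximum standard error for an estimated proportion into a required number of respondents. It is a textbook approximation under stated assumptions, not the rule laid down in the LFS regulation: the function name, the design effect `deff` and the example figures are all hypothetical.

```python
import math


def required_sample_size(p: float, max_se: float, deff: float = 1.0) -> int:
    """Smallest n for which sqrt(deff * p * (1 - p) / n) <= max_se.

    p      -- expected proportion (e.g. an unemployment rate), between 0 and 1
    max_se -- largest acceptable standard error, on the same scale as p
    deff   -- design effect: variance relative to simple random sampling
              (assumed value; it depends on clustering and weighting)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if max_se <= 0.0 or deff <= 0.0:
        raise ValueError("max_se and deff must be positive")
    return math.ceil(deff * p * (1.0 - p) / max_se ** 2)


# Hypothetical requirement: a rate of about 8% estimated with a standard
# error of at most 0.4 percentage points, under a design effect of 1.5.
n_quarterly = required_sample_size(p=0.08, max_se=0.004, deff=1.5)
print("respondents needed:", n_quarterly)

# Halving the acceptable standard error roughly quadruples the sample.
print("with half the standard error:", required_sample_size(0.08, 0.002, 1.5))
```

The quadratic relationship is the reason why a requirement for a small subpopulation, or a tighter requirement for a small country, can drive the total sample size.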
Not all labour market information is needed at household level. A limited set of variables is sufficient, and a full sample is not needed either. For household information, quarterly data and dynamic analysis do not seem essential, which would allow for limited sample sizes. For these reasons it could be considered to make the LFS a representative sample survey of individuals. The individual sample person in the household is called the key person. For this person all information should be collected. For the other persons in the household a limited set of variables suffices. Information on other household members can still be collected for the whole sample, but it would be more efficient to collect it for a subsample only.

If the main unit of observation of the LFS is a person rather than a household, it would make sense not to allow proxy answering for the key person. For the other persons in the household proxy answering would be allowed. This increases the quality of the measurement of several variables. In addition, it creates new opportunities for output, such as introducing more subjective variables in the LFS. A possible design within this regime could be to use the first wave to collect information on all persons in the household, using for instance a CAPI mode. During this interview a key person is selected. The subsequent waves could be carried out by CATI or CAWI, collecting information on the key person only. The argument for a sample of persons, and how this can be achieved with a sampling frame of addresses, is explained more extensively in a paper written for the 2010 workshop on LFS methodology in Paris (see footnote 1).

Footnote 1: Link to the paper: http://circa.europa.eu/Members/irc/dsis/employ/library?l=/workshops_methodology/paris_2010/papers/wlfsm-van-der-valk-paper/_EN_1.0_&a=d
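The key person design described above could be sketched as follows. This is a minimal illustration under assumptions that the paper does not specify: one eligible member is drawn with equal probability within each household, eligibility is defined by a hypothetical minimum age of 15, and the record layout (`id`, `age`) and function name are invented for the example.

```python
import random
from typing import Optional


def select_key_person(
    household: list[dict],
    rng: random.Random,
    min_age: int = 15,  # hypothetical eligibility rule
) -> Optional[dict]:
    """Draw one key person at random from the eligible household members.

    Returns the selected person's id together with a weight factor equal to
    the number of eligible members, i.e. the inverse of the within-household
    selection probability. Returns None if nobody is eligible.
    """
    eligible = [member for member in household if member["age"] >= min_age]
    if not eligible:
        return None
    key_person = rng.choice(eligible)
    return {"key_person": key_person["id"], "weight_factor": len(eligible)}


# Wave 1 (e.g. CAPI): the whole household is listed, then a key person is drawn.
household = [
    {"id": 1, "age": 44},
    {"id": 2, "age": 41},
    {"id": 3, "age": 12},  # not eligible under the assumed age rule
]
selection = select_key_person(household, random.Random(2011))
print(selection)
```

The weight factor matters for estimation: a person in a household with three eligible members has only a one in three chance of being selected, so without it persons in large households would be under-represented in results for individuals.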


Administrative data as input to the LFS has the potential to limit respondent burden and costs. For this reason its use could be facilitated more extensively. Such data can be used for (additional) data collection or for data processing, for example in quality checks or in weighting to increase the accuracy of the estimates. Using administrative sources is not without quality issues. Important aspects that need to be looked into are cross-country comparability, coverage and timeliness. The merits and pitfalls of using administrative data need to become clearer.

3.3. Modules in the LFS

The role of modules in the LFS needs to be strengthened. In order to introduce modularity (see footnote 2) in the LFS, modules need to be defined as sets of coherent variables. These modules can be treated as separate units. Modularity serves several purposes. Firstly, it simplifies communication between all parties concerned. Instead of discussing and handling separate variables, more or less homogeneous groups of variables can be dealt with. Secondly, it simplifies the management of the survey. A module of variables can be associated with a questionnaire module that is handled as a separate unit, both when implementing modules and during the fieldwork. Subsequently, it can even be handled in relative isolation when processing the data. The transcoding, imputation and quality checks could in principle be carried out for a module independently of the rest of the data set. Thirdly, modularity helps to make the design of the survey more effective. In principle it would be possible to determine per module the (sub)sample, frequency, unit of observation and data collection mode, all tailored to the output requirements. Fourthly, modules can be used to increase flexibility, since they make it possible to define multi-annual modules with various frequencies. Finally, modules can be used for harmonisation between surveys. Once a (questionnaire) module is defined for the LFS, with accompanying transcoding and processing rules, it can be applied to other surveys as well. The level of (input) harmonisation can be determined per module.

All these advantages show that modularity is an essential tool for increasing the quality and efficiency of the LFS, for both data collection and data processing. If the sample size, frequency and unit of observation are determined per module and this results in much variation, there is a risk that the LFS becomes quite complex. That should be avoided. This implies that some restrictions should be applied, without making them too tight. The optimal approach still needs to be reflected upon.

During the ongoing evaluation of the current system of LFS ad hoc modules it was proposed to transform it into a system of supplementary modules, repeated in a regular multi-annual cycle. Such a system could cover (part of) the issues currently addressed by ad hoc modules with better comparability of results over time and across countries, while reducing the costs of preparing, implementing, processing and disseminating the modules. This new approach is consistent with the modular structure of the LFS described above.
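The idea that sample, frequency, unit of observation and mode can be set per module can be pictured as a small data structure. The sketch below is purely illustrative: the class `Module`, the function `fielded`, the module names, the variable names and the base year of the multi-annual cycle are all hypothetical and do not correspond to any existing LFS specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    variables: tuple[str, ...]
    every_n_years: int         # 1 = every year, 4 = once in a four-year cycle
    quarters: tuple[int, ...]  # quarters in which the module is fielded
    waves: tuple[int, ...]     # interview waves carrying it (defines the subsample)
    unit: str                  # "key_person" or "household"
    modes: tuple[str, ...]     # permitted data collection modes


MODULES = (
    Module("core_labour_status", ("employment_status", "hours_worked"),
           1, (1, 2, 3, 4), (1, 2, 3, 4, 5), "key_person", ("CAPI", "CATI", "CAWI")),
    Module("structural_education", ("highest_education",),
           1, (1, 2, 3, 4), (1,), "key_person", ("CAPI", "CATI", "CAWI")),
    Module("household_composition", ("household_size",),
           1, (1, 2, 3, 4), (1,), "household", ("CAPI",)),
    Module("supplementary_topic", ("topic_question",),
           4, (2,), (1,), "key_person", ("CAPI", "CATI")),
)


def fielded(modules, year: int, quarter: int, wave: int, base_year: int = 2012):
    """Names of the modules asked in a given year, quarter and interview wave."""
    return [
        m.name for m in modules
        if (year - base_year) % m.every_n_years == 0
        and quarter in m.quarters
        and wave in m.waves
    ]


print(fielded(MODULES, 2012, 2, 1))  # first wave in a supplementary-module year
print(fielded(MODULES, 2013, 2, 3))  # later wave: only the quarterly core
```

Even this toy version shows where the complexity risk comes from: every additional dimension that may vary per module multiplies the number of questionnaire variants to be managed, which is why some restrictions on the permitted combinations would be needed.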

4. Some final remarks

The elements of a new LFS framework described above are inspired by current practices and recent developments in EU countries. They are therefore realistic proposals. They give more flexibility to design the LFS more efficiently. The challenge is to make sure that the survey does not become substantially more complex than under current practices. This is important for countries that have no strong need to reduce data collection costs. For small countries the data collection costs are relatively limited compared with the costs of data processing and survey management. This sets different priorities for increasing efficiency than in large countries. Allowing subsampling, for example, should be no problem for countries that do not want to change the LFS, because they are free not to apply it. Rotating panels are applied by almost all countries. Modular design of questionnaires is already applied in several countries and will not require strong adaptation of current practices.

Only two points require countries to adapt their current methods: the obligation to use only computer assisted data collection modes, and not allowing proxy answers for an assigned person in the household. As explained earlier, both of these relatively limited requirements are essential for quality reasons. Furthermore, they make the survey more effective, more suitable for producing essential output and ready for future developments.

Footnote 2: Modularity is typically defined as a continuum describing the degree to which a system’s components may be separated and recombined. It refers to both the tightness of coupling between components, and the degree to which the “rules” of the system architecture enable (or prohibit) the mixing and matching of components. Wikipedia: http://en.wikipedia.org/wiki/Modularity
