Section 3.2

THE EFFECT OF DATA COLLECTION ON EDITING AND DATA QUALITY

Foreword – Pedro Revilla, Instituto Nacional de Estadistica, Spain

This section includes five papers about the effect of data collection methods on data quality. The main focus of the papers is the impact of electronic data reporting on the design of the editing strategy.

The paper by Nichols et al. presents the United States Census Bureau's experience with data editing strategies used in business surveys. It describes the interactive editing approach currently incorporated into Computerized Self-Administered Questionnaires (CSAQs), which are delivered to the respondent as downloadable files or transmitted electronically over the Internet.

The paper by Paula Weir presents changes in electronic data reporting options and usage at the U.S. Energy Information Administration. She examines in detail one fully Web-based survey that recently implemented an editing module, in order to better understand the respondents' views and use of the edit feature. Respondents were asked how they used the edit function and how clear and useful they found the information provided for edit failures. Their responses regarding the edit function are then compared with a study of the edit log, which records information each time the edit function is invoked.

The paper by Gonzalez et al. explores the possibilities of Web questionnaires for improving editing. It discusses the possibilities and challenges that Web questionnaires offer for editing tasks, in particular the combination of built-in edits and a selective editing approach. Some practical experiences from the Spanish Monthly Turnover and New Orders Survey are presented.

The paper by Laroche focuses on the evaluation of the Internet option offered to respondents in the 2004 Census test. A satisfaction survey of respondents who used the Internet and a follow-up survey of those who did not are described. For the 2006 Census, all households will be offered Internet access. Comparisons between these two data collection methods are presented.

The last paper, by Ceccarelli and Rosati, describes the data editing method used for the Italian Labour Force Survey. It presents the data editing and imputation strategy implemented for the whole survey process. A discussion of the main outcomes of using a combination of Computer Assisted Personal Interviewing (CAPI) and Computer Assisted Telephone Interviewing (CATI) for data collection is also presented.


DESIGNING INTERACTIVE EDITS FOR U.S. ELECTRONIC ECONOMIC SURVEYS AND CENSUSES: ISSUES AND GUIDELINES

By Elizabeth M. Nichols, Elizabeth D. Murphy, Amy E. Anderson, Diane K. Willimack and Richard S. Sigman, U.S. Census Bureau, United States

Key Words: Electronic reporting, establishment surveys, usability testing, computerized self-administered questionnaires (CSAQs), data editing, electronic data collection.

1. INTRODUCTION

The purpose of this paper is to document the U.S. Census Bureau's experience with interactive data editing strategies used in collecting data from business survey respondents. Such surveys are subject to many classes of unintentional human error, such as programming mistakes, omissions, miscalculations, typing errors, and interviewer misclassifications. The contribution of each class of inadvertent error to total survey error, however, may be controlled or at least somewhat mitigated by good survey design practice, forethought and planning, and by the advances of computer technology. In the survey context, all such efforts to detect and correct respondent errors fall under the purview of data editing.

In mail surveys of establishments, data editing has typically been performed during post-data-collection processing. Computer-assisted establishment surveys, however, may perform data editing interactively, during data collection. For example, in surveys that use computer-assisted telephone interviewing (CATI) or computer-assisted personal interviewing (CAPI), data editing rules, referred to as edit checks, can be incorporated into the CATI/CAPI instrument so that the interviewer is notified of an entry that fails one or more edit checks. The interviewer can then probe the respondent for an alternative response or for verification that the flagged response is correct. Edit checks are also incorporated into computerized self-administered questionnaires (CSAQs), which are delivered to the respondent by mailing disks or CD-ROMs (downloadable) or transmitted electronically over the Internet's World Wide Web. Browser-based CSAQs are also called online Web surveys or Internet questionnaires. When collecting data using CSAQs, respondents – not interviewers – are notified when a response fails one or more edit checks. The remainder of this paper focuses on interactive edit checks in downloadable and Internet questionnaires.

For comparison purposes, Section 2 describes the interactive editing approach currently incorporated into CSAQs collecting economic data from businesses. In Section 3, we offer preliminary guidelines for incorporating edit checks into CSAQs, based on several usability studies of electronic surveys of businesses or organizations. In Section 4, we discuss our findings and propose themes that emerge from the Census Bureau's approaches to interactive data editing, and we raise some issues for future research in Section 5.

2. EXPERIENCES WITH EDIT CHECKS AND IMPLEMENTATION STRATEGIES FOR U.S. ELECTRONIC ECONOMIC SURVEYS AND CENSUSES

CSAQ edit checks prompt respondents to clarify or resolve ambiguous or discrepant responses. Fields containing information that respondents can edit, including name, address and contact information, may be subject to interactive edit checks. Thus, the kinds of edit checks incorporated into the Census Bureau's economic CSAQs cover a broader range of potential discrepancies than do those conducted during post-collection. Economic CSAQs borrowed the following kinds of edit checks from post-collection edit routines:


• Balance edit: Verifies that the sum of detail data equals the appropriate total.
• Survey-rule test/Ratio edit: Verifies that the ratio of two data values lies in the range defined by specified minimum and maximum values.
• Survey-rule test/Logical edit: Verifies that data reported for related items are logically valid.
• Required item or Missing value/Incomplete edit: Verifies that data have been reported.

The following kinds of edit checks tend to be administered only within CSAQs:

• Preventive edit: Blocks respondents from completing an action, occurring upon the first invalid keystroke.
• Alphanumeric edit: Verifies that the data meet the proper alphanumeric rules established for that field.
• Format edit: Verifies that the data have been entered in the expected form (e.g., date entered with dashes instead of slashes).
• Rounding test: Checks to see if rounding has occurred in the data. (Some post-collection balance edit checks may use rounding tests.)
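To make the categories above concrete, the sketch below expresses a balance edit, a ratio edit and a required-item edit as simple functions. This is an illustrative sketch only, not Census Bureau code; the field names, bounds and values are hypothetical.

```python
# Illustrative sketch of three of the edit-check categories above.
# Not Census Bureau code; field names, bounds and values are hypothetical.

def balance_edit(detail_values, reported_total, tolerance=0):
    """Balance edit: the sum of the detail items must equal the reported total."""
    return abs(sum(detail_values) - reported_total) <= tolerance

def ratio_edit(numerator, denominator, min_ratio, max_ratio):
    """Ratio edit: the ratio of two reported values must lie in a specified range."""
    if denominator == 0:
        return False
    return min_ratio <= numerator / denominator <= max_ratio

def required_item_edit(value):
    """Missing value/incomplete edit: the item must be reported."""
    return value is not None and str(value).strip() != ""

responses = {"q1": 120, "q2": 90, "q3": 110, "q4": 80, "total": 400,
             "payroll": 150, "sales": 500}
print(balance_edit([responses[k] for k in ("q1", "q2", "q3", "q4")], responses["total"]))  # True
print(ratio_edit(responses["payroll"], responses["sales"], 0.10, 2.00))                    # True (ratio 0.3)
print(required_item_edit(responses.get("contact_name")))                                   # False
```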

CSAQ designers have control over when and how the results of various edit checks are displayed to respondents. Immediate edit checks present edit messages instantly upon detection of discrepant data, and the system prompts the respondent for a correction or explanation. The result of an immediate edit check can be presented as either a pop-up window or an icon located near the questionable item. The results of deferred edit checks are presented to the respondent after the data have been entered and reviewed, usually in a list format. Server-side edit checks employ a type of deferred approach, since data have to be submitted to the server in order to generate the edit message.

Two questions are frequently asked in survey development: 1) How many edit checks are too many? 2) Should we allow submission of data with edit failures remaining? It is difficult to devise empirical rules to answer these two questions since each data collection situation is different. Instead we can speak to what we have done, what seems to work, and a general philosophy we have adopted.

Table 1 summarizes the editing features of six Census Bureau economic programs offering either downloadable or browser-based CSAQs: the Survey of Industrial Research and Development (R&D), the Manufacturers' Shipments, Inventories, and Orders Survey (M3), the 2002 Economic Census, the Company Organization Survey (COS), the Annual Survey of Manufactures (ASM), and the Quarterly Financial Report (QFR). Table 1 shows that economic programs at the Census Bureau have embedded numerous edit checks into each electronic survey and, in two cases, that the number of edit checks exceeds the number of items collected.

Although the number of edits could be related to respondent burden, our experience indicates that respondents are generally receptive to edit checks. In situations with numerous edit checks, respondents could be bombarded with edit failures if they happened to trigger each edit rule; fortunately, this does not typically happen. The question survey designers must address is the likelihood of a respondent triggering an edit check. If respondents are likely to trigger a large number of edit checks, perhaps the number of edits embedded in the CSAQ should be reduced, or the edits should be tuned to produce fewer "hits"; alternatively, the edit rules may be too strict, or a problem in the question phrasing or response field placement may be causing respondents' answers to trigger multiple edit checks. From the respondent's perspective, the purpose of edit checks is to help the establishment submit accurate, internally consistent data. Edit checks that do not clearly foster this result may be annoying to respondents.

Table 1 also shows that the Census Bureau's economic programs typically do not require respondents to resolve all edit failures before submission.
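The deferred, list-style review and the policy of accepting submissions with unresolved failures can be sketched roughly as follows. The rule names, fields and policy flag are hypothetical; this is not an actual CSAQ implementation.

```python
# Sketch of deferred (batch) edit checks and a "soft" submission policy.
# Rule names and fields are hypothetical; not an actual CSAQ implementation.

EDIT_RULES = {
    "payroll_nonnegative": lambda d: d.get("payroll", 0) >= 0,
    "detail_sums_to_total": lambda d: sum(d.get("detail", [])) == d.get("total", 0),
}

def run_deferred_edits(data):
    """Batch pass run at review time: return the list of failed rules for a review screen."""
    return [name for name, rule in EDIT_RULES.items() if not rule(data)]

def submit(data, require_resolution=False):
    """Soft-edit policy: report failures but accept the form unless resolution is required."""
    failures = run_deferred_edits(data)
    if failures and require_resolution:
        return {"accepted": False, "failures": failures}
    return {"accepted": True, "failures": failures}  # edit-failing data rather than no data

print(submit({"payroll": 100, "detail": [40, 50], "total": 100}))
# {'accepted': True, 'failures': ['detail_sums_to_total']}
```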


For the situations where certain fields are required to pass the edit test before the survey can be submitted (also known as "hard" edits), failing to satisfy the edit test results in unit non-response. These items are considered so critical to survey response that missing or inaccurate responses make the submitted form unusable.

The philosophy of accepting data with unresolved edit failures stems from two principles: 1) let the user be in control to the extent possible (a usability principle); and 2) obtaining some data, even edit-failing data, is better than obtaining no data at all. Since the first principle is respondent-centered and the second reflects the survey management perspective, CSAQ editing strategies become a balancing act. Generally, however, respondents want to provide accurate data; thus, if they have not changed data in a way that resolves the edit failure, we should assume their response is correct in their business context. We also suspect that the more difficult it is to submit data electronically (by difficult we mean that we do not allow submission until all edit failures are resolved), the more likely is unit non-response.

3. PRELIMINARY GUIDELINES FROM USABILITY RESEARCH ON ORGANIZATIONAL SURVEYS

A. Census Bureau Usability Research Methodology for Organizational Surveys

To learn about respondent interaction with various approaches to edit-failure notification, the Census Bureau tests candidate editing approaches with respondents. We observe test respondents interacting with edit messages during usability testing. Do respondents recognize edit-failure notifications? Do respondents read the edit messages? If they read the messages, do they understand them? What kind of action do they take regarding the message? A response might consist of ignoring the edit check, modifying data values, or writing a remark to explain why data are accurate even though they failed the edit check. The latter task is particularly characteristic of business surveys, since it is not uncommon for valid data to lie outside an expected range. Finally, how easy is it for respondents to interact with the CSAQ to respond to the edit check? For example, respondents may have to navigate to items referred to in the edit message.

Because it is virtually impossible to recruit business respondents to travel to the Census Bureau's usability lab, researchers travel with a video camera and laptop to the business locations to conduct usability testing. With the respondent's consent, videotape recording allows one or more researchers to analyze the session afterwards; the laptop is a necessary backup in case the CSAQ does not work properly on the respondent's workstation. Using a think-aloud protocol, a method often used in cognitive testing, we watch and listen as respondents interact with the CSAQ in their offices. Often we use internal staff members as supplemental subjects, since the usability literature typically recommends between five and 12 subjects (e.g., Dumas and Redish, 1999). As few as five subjects is considered sufficient to identify 80 percent of the usability issues (Virzi, 1992). Even with a small number of participants, usability methods are effective in uncovering issues in comprehension and navigation of the CSAQ.

Usability testing has its limits, however. Since usability testing uses a small number of subjects, generally from convenience samples, results cannot be tested for statistical significance.
Statistical hypothesis testing is not appropriate in a usability-testing context because usability testing is not controlled experimentation. Usability testing is intended to identify problems that actual respondents might have with the CSAQ, not to find significant differences between groups. Further, interview time devoted solely to edit checks is limited since usability testing focuses on the entire instrument. Edit behavior is often not fully functional when usability testing is conducted; such was the case during testing of the 2002 Economic Census prototype. Given these disclaimers, the best practices we recommend for designing edit behavior arise from limited usability testing and should be subjected to additional research.


B. Preliminary Guidelines

The following design guidelines summarize our interpretations of respondent reactions to interactive edits encountered during usability tests. After we state each guideline, we include evidence from our usability tests to support it. In addition to the usability tests performed on the surveys listed in Table 1, we also drew from usability reports for two institutional surveys: the Library Media Center Survey (LMC) and the Private School Survey (PSS). At the time of those tests, the LMC and PSS surveys were browser-based.

1) Minimize edit failures through good design. Good questionnaire design includes communicating to respondents which data fields require answers and which are optional. For example, instructions should inform participants to click on "none" if the answer is zero or to enter a number when an entry is required (Bzostek and Mingay, 2001). For dates or amounts, format can be built into fields automatically. Additionally, question text can include instructions on the correct format (Bzostek and Mingay, 2001; Nichols et al., 2001).

2) Perform edit checks immediately unless checking for missing data or performing an inter-item edit. Defer activating those edit checks: run them either immediately before the respondent finishes the form or after all the items involved in the inter-item edit have been entered. Study participants preferred immediate notification of edit failures, rather than receiving a list of edit messages at the end (Bzostek and Mingay, 2001; Rao and Hoffman, 1999). Participants can learn to avoid similar mistakes if they are notified immediately. However, we caution against triggering edit rules too early. This happened during usability testing of the Quarterly Financial Report (QFR). A QFR edit checking the consistency between two items was triggered as soon as data were entered for the first of the two items. Participants thought the edit check was ill-placed; it should have been invoked on the second of the two items. We recommend activating the edit check when all the relevant fields have been manipulated, no matter the order of manipulation (Nichols et al., 2000).
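A minimal sketch of the triggering behaviour recommended in guideline 2 follows: an inter-item edit that fires only after every field it references has been manipulated, regardless of entry order. The field names, rule and message are hypothetical, not drawn from any Census Bureau instrument.

```python
# Sketch of guideline 2: fire an inter-item edit only once every field it references
# has been manipulated, in whatever order. Fields, rule and message are hypothetical.

class InterItemEdit:
    def __init__(self, fields, rule, message):
        self.fields = set(fields)
        self.rule = rule
        self.message = message
        self.touched = set()

    def on_field_exit(self, field, data):
        """Call when the respondent leaves a field; returns a message only once all fields are in."""
        self.touched.add(field)
        if self.fields <= self.touched and not self.rule(data):
            return self.message
        return None

edit = InterItemEdit(
    fields=["e_shipments", "total_shipments"],
    rule=lambda d: d["e_shipments"] <= d["total_shipments"],
    message="E-commerce shipments exceed total shipments; please verify both items.",
)

data = {"e_shipments": 500}
print(edit.on_field_exit("e_shipments", data))        # None: the second item is not entered yet
data["total_shipments"] = 300
print(edit.on_field_exit("total_shipments", data))    # the edit fires now, on the second item
```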
3) Research edit checks before implementing an edit that might be too strict. Not all individual records will conform to broad-based edit rules, but they may still be correct. Some participants during the Annual Survey of Manufactures (ASM) usability testing did not think the edit took in all the relevant factors when calculating ranges for a ratio edit (Saner et al., 2000). In usability testing of the Library Media Center Survey (LMC), some participants correctly entered a number that exceeded a range edit check; these participants then changed the value to reflect the upper bound of the range. Based on the number of participants who triggered this edit check during usability testing, we determined this range edit was too strict (Nichols et al., 1998). Some LMC respondents began ignoring edit messages when they received numerous edit failures with which they disagreed (Hoffman et al., 1999).

4) Avoid violating user expectations for single-purpose functions. For example, do not mix editing and submitting. Problems arose in the Private School Survey (PSS) because the server-side editing process was invoked when the respondent pressed the "Finished" button, thinking that the form would be submitted. When they saw that an edit check was run and edit messages appeared, they changed their understanding of the "Finished" button. They then believed that clicking on "Finished" again would iteratively check for edits until their forms were correct. This was not the case: edit checks were only run the first time the "Finished" button was pressed. Although this design was most likely created to ensure respondents invoked the edit checks, the design violated respondents' understanding twice. Initially it violated their understanding of the word "Finished"; it then violated their expectation of being able to iteratively check for edits (Bzostek and Mingay, 2001). During usability testing of the Company Organization Survey (COS), respondents were also surprised that the edit report was rerun when they tried to submit (Nichols, 1998). Some Census Bureau CSAQs continue to be designed with editing and submitting as a final verification check. Research is needed to find a less confusing mechanism to run final editing.


5) Allow edit failure reports to be run iteratively, as expected by respondents. In some CSAQs, edit failures for the entire CSAQ can be run as a batch (usually at the end of the questionnaire) and presented together as a list of failures. The batch process of presenting edit messages is not a problem in itself. The problem arises if the CSAQ does not allow this batch processing to be rerun and updated. For example, the PSS was designed so that all the edit messages appeared together, at the top of the scrollable form, once the form was submitted. Most likely the designers thought respondents would make their way through the list, correcting each failure in turn. During usability testing, however, some participants wanted to recheck their form after correcting only one edit, hoping that the list would reappear without the edit they had just corrected. They could not do this with the original design of the PSS CSAQ (Bzostek and Mingay, 2001). In the ASM, we also observed respondents wanting to return to the review screen after correcting a failure. In this case, each time the review screen was invoked, the edit checks were rerun, generating an updated list (Saner et al., 2000). This design met respondent expectations.

6) Allow for easy navigation between an edit failure list and the associated items. In both the COS and ASM, respondents easily navigated from the list of edit failures on the review screen to an item by clicking on the hyperlinked edit-failure text. Once at an item, however, returning to the review screen was confusing (Saner et al., 2000). PSS users wanted to be able to navigate easily back to the list of edit failures once they were at an item. When they discovered the list was at the top of the form, they complained about having to scroll up to see it (Bzostek and Mingay, 2001). In the 2000 M3 CSAQ design, server-side edit checks were run, and the edit messages appeared on a separate page. Users had to exit that page to return to the form and correct their responses. They could not easily navigate between the two pages. Usability experts working on the M3 recommended placing the messages directly on the form, eliminating the need to navigate between two windows (Rao and Hoffman, 1999). The 2002 Economic Census attempted to alleviate this problem by using a separate pane for edit messages, with hyperlinks between the edit pane and the item pane providing the navigation. In theory this design solution satisfies this guideline, but for confirmation purposes we recommend future usability testing of this approach.
7) Clearly identify which edit failure goes with which item. In the PSS, clicking on the edit-failure message at the top of the scrollable form reset the page to display the data-entry field that had failed the edit check. The page was reset so that the line containing the data-entry field was at the top, but the question text for this field was above the fold. To see the question text, respondents had to scroll up; thus they had to perform two tasks to find the item associated with the edit failure (Bzostek and Mingay, 2001). In the LMC, the pop-up edit messages were invoked when the respondent's cursor gained focus in (i.e., went to) another data field. If respondents scrolled down this browser-based scrollable form, the item with the failed edit could be off the screen when the pop-up message appeared (Nichols et al., 1998).

8) Include a location, a description of the problem, and an action to take in the edit message. Respondents were always trying to decipher where the edit failure was located and what they needed to do to fix it. Every participant in the 2002 Economic Census prototype testing commented that many of the edit messages would have been easier to work with had the item number been available. Participants also wanted to know what action they needed to take to resolve an edit failure. The easiest messages to understand were those that said, "Please complete item x" (Nichols et al., 2001).
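A minimal sketch of an edit message assembled along the lines of guideline 8, with a location, a problem description and a suggested action; the item number and wording are illustrative only, not taken from any Census Bureau instrument.

```python
# Sketch of guideline 8: an edit message names the item, describes the problem,
# and states the action to take. Item number and wording are illustrative only.

def build_edit_message(item_number, problem, action):
    return f"Item {item_number}: {problem} {action}"

print(build_edit_message(
    item_number="7a",
    problem="The detail lines do not add to the total reported.",
    action="Please correct the detail lines or the total, or add a remark explaining the difference.",
))
```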


9) Avoid jargon, be polite, use good grammar, be brief, use active voice, and use words that match the terminology used in the question. Problems arose when words used in the edit message did not match the terminology used in the item: respondents were not sure they were on the right question (Nichols et al., 2001). Unclear edit messages were also a problem during usability testing for the LMC field test (Tedesco et al., 1999).

10) Prior to implementation, cognitively test any edit messages offering a solution. Offering one of many possible solutions seems to "lock" the respondent into that solution, which may not be the appropriate one. Problems also arose when solutions such as "Check for a typing mistake" were in an edit message; sometimes these solutions led respondents astray (Nichols et al., 2001). We noticed participants changing their answer to fit the upper bound of a range edit check in the LMC when the range edit provided the bounds (Nichols et al., 1998).

11) Do not automatically erase data that fail an edit check. In the LMC testing, we tested messages containing an "OK" and a "Cancel" button to close the pop-up edit message window. The edit messages warned participants that their data would be erased by clicking on the "OK" button, but that clicking "Cancel" would retain their entries. At least four participants did not understand the difference between the two buttons. We found that when participants' entries were erased by clicking the "OK" button, some were reluctant to re-enter data (Tedesco et al., 1999).

12) Inform respondents about the policy for submitting data with unresolved edit failures. Respondents in both the ASM and QFR testing were not sure whether they could send data with edit failures remaining, although this was permissible (Saner et al., 2000; Nichols et al., 2000).

13) Give the respondent as much control as possible. One design for communicating edit-failure messages is to include them in a pop-up window: when a respondent finishes entering data, if the data fail the edit, a pop-up window containing the edit-failure message automatically appears. Unsolicited pop-up windows containing edit messages caused problems for respondents in usability testing for the LMC field test. Several respondents did not read the message but automatically clicked a button in the window to make it disappear. When probed at the end of a testing session, one respondent did not even remember a pop-up window; others thought it was a computer bug (Nichols et al., 1998; Tedesco et al., 1999). The QFR also used pop-up windows to display the edit message, but the participant needed to click on an icon next to the field to invoke the pop-up. Participants used this icon successfully and could choose when to open and read the edit message (Nichols et al., 2000). In the 2002 Economic Census, the respondent could run edits at any time, which is another example of giving the respondent control. We recommend further usability research on this concept of respondent-initiated editing.

14) Use newly created icons with caution since they do not have universal meanings; use standard icons only for expected purposes. Both the QFR and the ASM use icons to immediately notify the respondent of an edit failure. A red circle with a white "X" icon was used successfully by the QFR respondents: when the respondent clicked on the icon, an edit message was displayed. The yellow "!" warning messages were rarely clicked on in the QFR testing, and a few ASM respondents were unaware of their clickable property. The white bubble "i" icon was only used in the QFR.
When probed, respondents thought the "i" bubble icon meant additional information and were surprised to find that the message reminded them to make a remark (Nichols et al., 2000). The standard use for an "i" bubble is to convey information, not to suggest a user action. Users will be confused by standard icons used to mean something different and by ambiguous icons. A flag icon was used in the 2002 Economic Census to indicate an edit failure; there is no current standard icon for edit failures. We recommend usability testing of icons until a standard develops. Standard usability guidelines say to use a text label for any icon. Another option is to use text instead of icons to denote editing functionality or edit failures.


The LMC contained a button labeled "Check your work." If selected, this button would list all the edit failures. Respondents, however, assumed the button simply allowed them to review their work; they did not expect to receive a list of edit failures when they selected the button (Hoffman et al., 1999).

4. DISCUSSION AND EMERGING THEMES

We summarize by discussing several themes that emerge from Census Bureau survey practices and research on incorporating interactive editing into CSAQs and Web instruments.

The use of edit checks has increased for several reasons over the years. Historically, early CSAQs incorporated only a few basic edit checks because of a grave concern for additional respondent burden, which might result in unit non-response. In addition, early software could support only a few simple edit checks. Over time, more edit checks have been added to existing CSAQs and to newly created CSAQs. Indeed, the ratios of the number of edit checks to the number of questionnaire items presented in Table 1 seem high: a recent Web instrument developed by the Census Bureau, the M3, averages more than two edit checks per item on the questionnaire. This growth has occurred, in part, because of enhancements to the software, enabling the creation of edit checks that were not previously possible. Moreover, the number of edit checks has increased as survey staffs' experience and confidence have grown over multiple survey iterations.

A reasonable number of embedded CSAQ edits will not necessarily increase respondent burden or lead to unit non-response. Even though the number of edit checks has increased, it appears that embedded CSAQ edits do not necessarily lead to unit non-response. This is corroborated by usability research suggesting that respondents seem to appreciate edit checks, wanting to be informed of discrepancies in their data so they can make corrections. Thus, respondents do not necessarily consider edit checks to be "bad", and they do not appear to abandon the response task just because they received edit messages. In our experience, computer-literate respondents actually expect some automated checking of their data entries, along with a capability to make interactive changes to their responses, and they are surprised if these features are not built into an electronic survey.

Only some post-collection edit checks can be embedded in CSAQs. The source of edit checks added to CSAQs is the set of edits typically applied during post-collection processing. For various reasons, however, not all post-collection edit checks can be moved into the interactive CSAQ environment. Programming or technical issues may constrain development of embedded edit checks. For example, not all the edit checks typically performed at headquarters were incorporated into the 2002 Economic Census because the system's design inhibited implementation of some edit checks. The utility of some edit checks is also limited by "one-size-fits-all" approaches to the design of electronic instruments. Because the correctness of many establishment survey data items depends on the industry, editing parameters vary by industry. CSAQs are not currently tailored by industry or size of business, limiting the value of certain kinds of edit checks. Other edit checks may be too complex to communicate to respondents and too cognitively challenging for respondents to interpret during the course of completing an electronic survey.
Moreover, macro-level edits that look at summary results across all respondents can only be done post-collection, and thus cannot be moved into the CSAQ.

Mission criticality, typical levels of data quality, and certain respondent characteristics guide the inclusion of edit checks in CSAQs. Because of various constraints, survey managers at the Census Bureau must prioritize the edit checks incorporated into electronic surveys. Priorities are placed on items deemed mission-critical. Subject-area knowledge of respondents' abilities to report particular data items, and typical levels of response accuracy, also guide the definition and selection of edit checks for CSAQs.

Respondent acceptance of edit checks depends on several factors, including perceived usefulness. Respondent reaction remains a valid concern.


Research shows that, to a great degree, instrument control needs to remain with respondents. Usability research suggests a number of guidelines for user-centered design and implementation of CSAQ edit checks to improve the usability of electronic surveys. Operational experience suggests that respondents easily accept edit checks ensuring that the data they enter meet required formats, and these kinds of edits are effective. In addition, different levels of edits – information, warning, and edit failure – provide respondents with information about severity and let respondents choose how to deal with edit messages.

Acceptance of electronic forms containing edit-failing data reflects a greater willingness to deal with measurement error than to absorb non-response error. Usability research suggests that the issue of respondent control over resolving edit failures is perhaps most critical at the data-submission stage. Many current Census Bureau CSAQs allow respondents to submit completed electronic survey forms with data that have failed the embedded edits. The main reason for this strategy is to avoid encouraging survey non-response due to unresolved edit failures. All survey programs prefer edit-failing data to no data (unit non-response), and they continue to rely on post-collection editing and imputation to cleanse reported data. Thus, it appears that survey managers are more willing to accept measurement error than non-response error in the collected data.

5. FUTURE DIRECTIONS AND RESEARCH ISSUES

In general, the Census Bureau's incorporation of interactive edit checks into electronic data collection for economic surveys embodies a conservative philosophy. At a minimum, the Census Bureau receives data from cooperative respondents; those data may or may not pass basic edit checks. Research is needed to support a more ambitious philosophy, allowing the inclusion of additional post-collection edit checks in electronic instruments in order to reduce costs and increase data quality, while maintaining respondent cooperation.

Survey practitioners would very much like to have "generally accepted practices" or "rules of thumb" for resolving electronic survey-design issues, including the open issues in data editing. However, we expect this to be virtually impossible given the variety of surveys, administrations, and trade-offs related to data quality and response. Instead, we think it would be more appropriate to develop a set of research-based guidelines to aid decisions related to editing. Derived from goals and principles, and supported by research, these guidelines should be revisited periodically to ensure their relevance as technology changes. Research is needed to determine whether a core set of best practices and heuristics could always be implemented.

Issues concerning data quality and resource allocation can arise when large mail surveys offer automated data collection. Large mail surveys have high variable costs (with respect to the number of respondents and the number of survey cycles) associated with data editing, because clerks and subject matter experts review edit failures produced by post-data-collection edit checks. On the other hand, editing at the time of data collection, by respondents reviewing edit messages generated by automated edit checks, can have high fixed costs for programming and questionnaire testing; but the corresponding variable costs associated with interactive data editing should be much lower than those for traditional post-data-collection editing.
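The fixed-versus-variable cost argument can be made concrete with a small break-even calculation. All of the figures below are hypothetical and are not taken from the paper; they only illustrate the reasoning.

```python
# Hypothetical break-even sketch for the fixed vs. variable cost argument above.
# None of these figures come from the paper; they only illustrate the reasoning.

fixed_cost_interactive = 150_000       # one-time programming and testing of embedded edits
variable_cost_interactive = 2          # per response per cycle (residual analyst follow-up)
variable_cost_post_collection = 10     # per response per cycle (clerical/analyst edit review)

def break_even_responses():
    """Total responses (summed over survey cycles) at which interactive editing pays off."""
    saving_per_response = variable_cost_post_collection - variable_cost_interactive
    return fixed_cost_interactive / saving_per_response

print(break_even_responses())  # 18750.0 responses across all cycles
```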
Such a paradigm shift would require modifications to survey organizational cultures, structures, and resource allocation. Survey managers' preference for receiving edit-failing data from respondents, as opposed to no data, raises the question of whether "submission with unresolved edit failures" is a satisfactory, cost-effective, "optimum" strategy in terms of data quality, which is affected by both non-response error and measurement error. Investigations into data quality suggest that the potential benefits of CSAQ edit checks are realized, resulting in fewer data items failing post-collection edits (Sweet and Ramos, 1995) and fewer items being changed during analyst review (Evans, 2003). Further research is needed to corroborate this encouraging conclusion, and to evaluate the trade-offs between measurement error and non-response error related to interactive edit checks in electronic data collection.


6. ACKNOWLEDGEMENTS

This report is released by the U.S. Census Bureau to inform interested parties of research and to encourage discussion of work in progress. The views expressed on methodological issues are those of the authors and not necessarily those of the Census Bureau. The authors wish to thank the following Census Bureau staff for helping us gather information for this paper: Patrick Kent, Joyce Kiessling, Yvette Moore, John Nogle, Yolando St. George, and Rita Williamson. Material in this paper was previously presented to the Federal Economic Statistics Advisory Committee, March 2003, and to the Annual Conference of the American Association for Public Opinion Research (AAPOR), May 2004.

References

[1] Anderson, A., Cohen, S., Murphy, E., Nichols, E., Sigman, R., and Willimack, D. 2003. Changes to Editing Strategies when Establishment Survey Data Collection Moves to the Web. Presented to the Federal Economic Statistics Advisory Committee, Washington, D.C., U.S. Bureau of Labor Statistics.

[2] Bzostek, J. and Mingay, D. 2001. Report on First Round of Usability Testing of the Private School Survey on the Web. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #42.

[3] Dumas, J. and Redish, J. 1999. A Practical Guide to Usability Testing. Portland, OR: Intellect.

[4] Economic Electronic Style Guide Team. 2001. "Style Guide for the 2002 Economic Census Electronic Forms." U.S. Census Bureau, Economic Planning and Coordination Division.

[5] Evans, I. 2003. "QFR CSAQ Evaluation." Internal Memorandum. U.S. Census Bureau, Company Statistics Division.

[6] Hoffman, R., Moore, L., and Perez, M. 1999. Customer Report for LMCQ Usability Testing. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #23.

[7] Nichols, E. 1998. Results from Usability Testing of the 1998 Report of Organization CSAQ. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #19.

[8] Nichols, E., Murphy, E., and Anderson, A. 2001a. Report from Cognitive and Usability Testing of Edit Messages for the 2002 Economic Census (First Round). U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #39.

[9] Nichols, E., Murphy, E., and Anderson, A. 2001b. Usability Testing Results of the 2002 Economic Census Prototype RT-44401. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #49.

[10] Nichols, E., Saner, L., and Anderson, A. 2000. Usability Testing of the May 23, 2000 QFR-CSAQ (Quarterly Financial Report Computerized Self-Administered Questionnaire). U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #33.


[11] Nichols, E., Tedesco, H., King, R., Zukerberg, A., and Cooper, C. 1998. Results from Usability Testing of Possible Electronic Questionnaires for the 1998 Library Media Center Public School Questionnaire Field Test. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #20.

[12] Rao, G. and Hoffman, R. 1999. Report on Usability Testing of Census Bureau's M3 Web-Based Survey. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #26.

[13] Saner, L., Marquis, K., and Murphy, B. 2000. Annual Survey of Manufactures Usability Testing of Computerized Self-Administered Questionnaire: Findings and Recommendations, Final Report. U.S. Census Bureau, Statistical Research Division, Usability Lab, Human-Computer Interaction Memorandum Series #30.

[14] Sweet, E. and Ramos, M. 1995. "Evaluation Results from a Pilot Test of a Computerized Self-Administered Questionnaire (CSAQ) for the 1994 Industrial Research and Development (R&D) Survey." Internal Memorandum. U.S. Census Bureau, Economic Statistical Methods and Programming Division, #ESM-9503.

[15] Tedesco, H., Zukerberg, A., and Nichols, E. 1999. "Designing Surveys for the Next Millennium: Web-based Questionnaire Design Issues," Proceedings of the Third ASC International Conference, The University of Edinburgh, Scotland, UK, September 1999, pp. 103-112.

[16] Virzi, R. 1992. "Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough?," Human Factors, 34:457-468.

Table 1. Summary of Interactive Editing Features in U.S. Census Bureau Computerized Self-Administered Questionnaires (CSAQs)

Survey Program (1) | Type of CSAQ | Ratio of edit checks to data-entry fields | Kinds of edit checks (2) | Timing of the edit-check messages | Display of edit-check messages | Resolution required to submit?
R&D | Downloadable | 67/205 = 0.33 | P, R, M | Immediate, Deferred | Review panel | N
M3 | Browser-based | 103/58 = 2.43 | P, B, R, A, M | Deferred | Highlighted text | N
2002 Economic Census | Downloadable | 66/95 = 0.69 | B, P, L, M, F | Immediate, Deferred | Icon next to item, review panel | Y (for failure on one edit check)
COS | Downloadable | 36/23 = 1.57 | P, R, L, M, F, RT | Immediate, Deferred | Pop-up messages, review panel | Y (for a few key edit-check failures)
ASM | Downloadable | Unavailable/88 | Unavailable | Immediate, Deferred | Pop-up messages, review panel | N
QFR | Downloadable | 29/94 = 0.31 | B, P, L, M | Immediate | Icon next to item | N

(1) R&D = Survey of Industrial Research and Development; M3 = Manufacturers' Shipments, Inventories, and Orders Survey; COS = Company Organization Survey; ASM = Annual Survey of Manufactures; QFR = Quarterly Financial Report.
(2) B = Balance, P = Preventive, R = Ratio, L = Logical, A = Alphanumeric, M = Missing value/Incomplete, F = Format, RT = Rounding Test.


EDR AND THE IMPACT ON EDITING – A SUMMARY AND A CASE STUDY

By Paula Weir, Energy Information Administration, Department of Energy, United States

1. INTRODUCTION

It has been well documented that data editing is one of the most resource-intensive aspects of the survey process, if not the most. Much of the survey literature and research is dedicated to methodologies, approaches and algorithms focused on defining edit rules and building editing systems with the goal of identifying and correcting actual or potential response errors. Unfortunately, much less has been written documenting the actual survey performance of the varied editing approaches and the net effect on data quality. Concerns have frequently been raised that surveys often suffer from over-editing, which results in increased respondent burden and frustration from edit-failure resolution, introduction of error and bias as edits are resolved, and increased survey processing time and cost. While recognition has been given to the prevention of errors through better survey form design, respondent instructions, interviewer training, etc., these efforts have been limited in their effect.

Data editing as a traditional post-collection process has been challenged to some extent by CATI and CAPI surveys, but now electronic reporting and Internet data collection/Web surveys have expanded that challenge to self-administered surveys by providing the potential for editing at data capture by the respondent. The presence and the extent of edits in electronic data reporting through computerized self-administered questionnaires (CSAQs) via Web surveys, downloadable software, and e-mail attachments, implemented at initial data entry and capture rather than in the traditional data editing stage, depend on: the amount of development resources dedicated; the sophistication of the electronic option selected; the security of the transmission that is required; the quality of the data that is required; the amount of respondent burden that is acceptable; and the related concern for increased non-response.

This paper examines the change in electronic data reporting options and usage from 2003 to 2004 for one statistical agency. One fully Web-based survey that recently implemented an editing module is then examined in more detail to better understand the respondents' views and use of the edit feature. In particular, for this survey, respondents were asked how they used the edit function, as well as about the clarity and usefulness of the information provided for edit failures. The responses regarding the edit function for this survey are further compared to a study of the edit log, which records information each time the edit function is invoked.

2. RECENT PROGRESS IN ELECTRONIC DATA COLLECTION

A review of surveys conducted by the U.S. Energy Information Administration (EIA) revealed that electronic reporting on 65 surveys had increased dramatically from 2003 to 2004. The U.S. Government Paperwork Elimination Act of 1998 was an encouragement for Federal statistical surveys to move to more electronic modes, but little progress had actually been made in the first years following the Act. However, the discovery of anthrax, which shut down the main post office for government mail, provided the impetus for change for historically mail surveys. The short-run solution of posting surveys in PDF format on EIA's web site, along with facsimile return, kept mail surveys operating, but this crisis-based approach did not represent the most efficient electronic collection method.
In the wake of this perceived threat, respondents were also ready to accept more electronic modes of data collection, especially methods with which they had already developed a certain comfort level. After the immediate surge in survey responses via unformatted e-mails and facsimiles, an alternative method was implemented fairly quickly, making use of formatted Word or Excel files in the survey form image. To encourage secured reporting, a link was placed directly on the electronic survey form that directed the respondent to secured transmission.


The implementation of the formatted files on EIA's website was successful because respondents felt comfortable with this option. From the respondents' viewpoint, this method was convenient, simple and safe. From EIA's viewpoint, data were received more quickly and forms were more readable. Total or net values were calculated as the respondent entered data on the spreadsheet, so some potential errors were avoided, but very little editing at collection was attempted beyond integrity checks to ensure loading of the data to a database. These mostly included automatic totals, checks on field lengths, numeric vs. character checks, and valid codes (state, month, etc.). Despite these limitations, the ease of implementing this option resulted in a 17% increase from 2003 to 2004 in the number of surveys offering it.

Surprisingly, of the surveys offering both secured and unsecured transfer options, 86% had more respondents choosing unsecured transfer in 2003. But the number of respondents choosing secured transfer has grown as the number of surveys offering it has increased, by approximately 52% in 2004 (from 27 to 41 surveys), as shown in Table 1. Yet, for those surveys that had previously offered secured transfer, only a few experienced a large increase in usage of secured transfers, ranging from 18 to 54%, while roughly half had more modest increases, and two surveys experienced decreases (ranging from 10 to 20%) in secured transmission usage. This finding is interesting in view of the frequent reference to security as respondents' primary concern about reporting via the Web.

While this electronic option of formatted files on the web has been appealing to respondents, the benefits have been limited by the complexity of data capture. Although some of the surveys utilize a Visual Basic conversion and SQL*Loader to load an Oracle database, many surveys continued to print the electronic responses and re-key the data into the respective survey processing systems/databases, potentially introducing new errors. This, along with limited editing capability, has restricted the benefits of these electronic forms to the agency.

Internet data collection (IDC), using a browser-based approach, has become the alternative, requiring more resources to develop, test, and implement. Usage of this reporting option has been growing steadily as more surveys have provided it. Most of this growth occurred in the last year, with 20 surveys offering IDC in 2004, compared to 12 in 2003. More importantly, the percentage of respondents choosing the IDC option for those surveys has increased significantly. One survey offering this option for the first time realized 51% usage by respondents, while the other surveys with the IDC option showed an average increase in usage of approximately 40%. One particular series of surveys made a concerted effort to increase electronic reporting, achieving usage rates greater than 50% across their IDC option surveys.

The IDC surveys have successfully incorporated editing by respondents using server-side information, such as the respondent's previous period's data. Fatal edits are clearly the most commonly implemented, driven by database requirements. Edit rules that depend on fixed values or on values reported within the form are the next most commonly implemented.
Edit rules that depend on external values, such as the previous period's report, require that those data be accessible to the respondent at data capture, or be quickly returned to the respondent in an interactive mode. These edits are therefore more resource-intensive to implement and require that security concerns be addressed for confidential data. The surveys with an IDC option vary in their approach to the respondent's requirement for edit-failure resolution: some require data correction or an edit override with comment (hard edits), while others require no response or action from the respondent (soft edits).
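A minimal sketch of an edit against the previous period's value, with the hard/soft distinction handled at submission, follows. The 50% threshold and the example values are hypothetical, not EIA edit rules.

```python
# Sketch of a soft vs. hard edit against the previous period's report.
# The 50% threshold and the example values are hypothetical, not EIA edit rules.

def period_change_edit(current, previous, max_change=0.50):
    """Flag a report whose relative change from the previous period exceeds max_change."""
    if previous in (None, 0):
        return None                     # nothing to compare against
    change = abs(current - previous) / previous
    if change <= max_change:
        return None
    return f"Value changed {change:.0%} from the previous period; please verify."

def accept_submission(current, previous, comment="", hard=False):
    """Hard edits require a correction or an override comment; soft edits accept the data anyway."""
    message = period_change_edit(current, previous)
    if message is None:
        return True
    return (not hard) or bool(comment)

print(period_change_edit(180, 100))                                             # flags an 80% change
print(accept_submission(180, 100, hard=True))                                   # False: hard edit blocks it
print(accept_submission(180, 100, comment="verified price spike", hard=True))   # True: override with comment
```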


Table 1. Electronic Usage 2003 and 2004

Electronic Method | Number of surveys using method and range of % respondents using method (2003) | Number of surveys using method and range of % respondents using method (2004) | Change in surveys using electronic method from 2003 to 2004 | Data capture? | Editing within collection?
Unformatted e-mail | 5 (10-90%) | 11 (.35-100%) | 120.00% | no | no
Unsecured transfer, Word or Excel file | 39 (1-100%) | 36 (.17-80.8%) | -7.69% | no | only totals
Secured transfer, Word or Excel file | 27 (1-70%) | 41 (.16-55%) | 51.85% | some | simple
Diskette/CD software (e-mail, fax or mail back) | 4 (3-57%) | 9 (1-100%) | 125.00% | only if diskette is mailed back | yes
PEDRO (mail CD, install and electronic submission) and download software | 23 (1-27%) | 23 (2.5-51.4%) | 0.00% | yes | simple
Internet | 12 (2-99%) | 20 (.1-100%) | 66.67% | yes | yes

Despite the increased use of editing at the data reporting phase, editing is still performed in the traditional data processing stage, not only for non-IDC respondents, but also across respondents from all reporting modes for edits requiring integration of responses, and for IDC respondents who by-pass the (soft) edit failures. It is important in the survey process that the edits performed are consistent across collection modes, and that data from all collection modes are integrated and higher-level edits performed across respondents, or across surveys as appropriate. This is necessary to optimize the editing process: to prevent error in the most effective manner by exploiting the respondents' knowledge at data capture, while continuing to draw on the more comprehensive information available for validation at post-collection. Some balance of the two phases of editing is viewed as optimal for improving efficiency and data accuracy without negative side effects on response rates, particularly for mixed-mode surveys.

3. CASE STUDY

Editing in Internet surveys has brought about a new set of issues and concerns. In addition to the traditional problem of determining the most effective and efficient edit rules, the Internet survey edit process has to address how and when to invoke the edits to maximize data quality and minimize respondent break-off. Should the edits be performed after each data item is entered or after all items have been entered? Should hard edits be used, requiring the respondent to take action, or soft edits that alert the respondent but require no action? How should the edit failures be presented: in a separate window or directly on the data entry/survey instrument screen? Should the edit failures be presented one at a time or all together? Edit messages take on a different role than in traditional editing, communicating directly with the respondent and taking on the role or "social presence" of the interviewer in resolving an edit failure. These messages need to be written in simple, non-confrontational language, and convey meaning to which the respondent can relate and on which the respondent can take the appropriate action. How much information should be conveyed with the edit failures?


One fully Web-based survey that had recently implemented an editing module was examined in terms of how the respondents used the edit feature in reporting. In this survey, State officials are the respondents; they report prices charged by the sampled businesses operating in their State. The overall reaction to the new edit, which identified businesses whose price change since the previous report was out of range, was very positive. The edit module was intended to be invoked by the respondent after all the data for that period had been entered. The system required the edit to be run prior to submission, but did not require the respondent to make any changes to the flagged data before submitting. Respondents were encouraged to provide comments on the edit failures but were not required to do so. The system, however, actually allowed the respondents to "Review Prices Before Submit", thereby running the edit, at any point in their data entry. This was an area of concern because the edit rule was based on the mean change of all the prices that had been entered, and different expectations for editing would result if the edit were run on partial data.

After respondents had used the edit feature in the IDC for four reporting periods, they were sent a questionnaire asking five basic questions regarding the new function, exploring:

1) when they invoke the price review/edit;
2) the process they use once the review screen is displayed (ignore, recall companies, etc.);
3) their understanding of the information provided on the review screen;
4) the navigation between the review screen and the main survey screen for error correction;
5) other comments or suggestions regarding the edit function and the review screen.

Figure 1. Main Screen


Figure 2. Review Screen


In order to better understand the process flow, two screen prints are provided. The first screen (Figure 1) is the data entry screen, the main screen. The respondent selects a company from the dialogue box on the left, and the company's information appears on the right side of the screen along with the boxes to enter the company's price and category of sale. After entering a price, the respondent selects the next company from the left dialogue box, enters its price, and so on. Once all the companies' prices have been entered, the respondent clicks on the "Review Prices Before Submit" button located at the bottom center of the screen.

When the respondent clicks on this review button, the edit failures are displayed in the second screen (Figure 2). On the left side of the screen, the companies whose prices failed the edit are displayed. As the respondent clicks on a company, information about that company is displayed to the right. This information includes the mean price change for all companies in that State; the previous and current period's price and the price change since the last period for the company selected; and the price interval the company was expected to fall within for the current period. To make a change to the data, the respondent clicks on the "Back to Main Form" button shown at the bottom right of the figure to return to the data entry screen. If no data changes are needed, the respondent clicks on the "Submit to EIA" button at the bottom left to send the data to EIA.

Of the respondents who returned the questionnaire, most (86.7%) indicated that they ran the edit after all prices had been entered, just prior to submission of the data, as shown in Table 2. However, a few respondents (12.5%) indicated that they invoked the edit when an individual company's data seemed anomalous. Similarly, a few respondents indicated they used their own method outside the system to review the data, frequently making use of a longer historical series than just the previous period, or comparing price changes to a fixed amount they had set. The respondents also varied as to their process for reviewing the prices flagged by the IDC edit.


Table 2. Edit Function Questionnaire Results

Q1. When invoked (can check more than one):
    After each entry                                                          0%
    After some but not all entries                                         12.5%
    After all entries                                                      86.7%
    Other                                                                  20.0%
Q2. Process for review flagged entries (can check more than one):
    Ignore                                                                 25.0%
    Review one-by-one, calling and correcting                              25.0%
    Review by noting outside system, call all, correct all,
        review again, then submit                                          25.0%
    Review by noting outside system, call all, correct all,
        then submit (no final review)                                       6.7%
    Other                                                                  25.0%
Q3. Understand:
    Why failed                                                             80.0%
    Mean change                                                            80.0%
    Mean includes only entered data                                        73.3%
    Flagged company information                                            86.7%
    Expect company value                                                   86.7%
Q4. Is navigation to/from review screen a problem?
    No                                                                      100%

Finding One: Approximately 25% of the respondents reported that they ignored the information provided regarding the failed prices. Another 25% reviewed the information for each failed company one at a time, verifying the information and correcting as necessary by returning to the main screen before proceeding to the next flagged company. On the other hand, about 31% of the respondents made a note (outside of the system) of the edit failure information provided and then followed up. Of particular concern is the finding that 25% of the respondents ignored the edit, and 7% performed no second review of edit failures after corrections were made in the main file, prior to submitting the data, despite the fact that the re-edit information is presented to them on the submit (review) screen.

Finding Two: Also of interest is the finding that 87% of the respondents understood the information regarding the particular company's price that failed, in particular the company's reported price for this and the last period, and also understood the expected interval for the company's price for this period. Yet only 80% understood why the price failed the edit and understood the information on the mean price change of all the companies they report for. Furthermore, only 80% understood that the mean price change represents only the prices that had been entered at the point at which the review button was hit. These last two findings can be used to further interpret the 87%: 87% understand that the price is expected to fall within the specified interval, but a few of those respondents do not understand why the price should be within the interval and therefore do not understand why the price failed.

Finding Three: All the respondents reported that the navigation between the review screen and the main form (in order to correct a price) was not a problem for them. This finding was curious in view of the first finding that 20% of the respondents indicated that they flip back and forth between the main screen and the review screen for each company (after having entered all or most of the data), and that 33% of the respondents record information from the review screen to another location outside the system to further evaluate the data.

Each time the respondent selects the “Review Prices Before Submit” button, the edit failures that are displayed are also written to an error log. For each edit failure, this log records the individual company's data, the reference period, the time the edit was invoked, and whether the price failed as too high or too low.


This log was analyzed for the first twelve reference periods. A summary of the edit failures written to the log is shown in Table 3. As mentioned previously, each time the respondent clicks the “Review Prices Before Submit” button, each of the edit failures is written to the error log, regardless of whether it had been written before, as long as the price still fails the edit rule. Therefore, the number of failed records shown in Table 3 reflects the number of times the button was hit multiplied by the number of failures at that time. Even though this created a large number of virtually duplicate records, that characteristic of the log made it useful for tracking and measuring the respondents' process flow. The number of first-time failures shown in Table 3 was derived from the log to measure the set of unique edit failures. If the same set of failures occurs each time the review button is clicked, then the ratio of the number of failed records to unique records indicates the number of times respondents went from the review screen to the main screen and returned to the review screen. Across both products, respondents clicked on the review button an average of 5.7 times per reference period, but the rate by respondent and product varied from a low of 1.5 to a high of 11.8 review clicks, averaged across reference periods. These respondent rates for changing screens were further compared to the respondents' average rates of edit failures to shed light on the respondents' process for invoking and resolving the edit.

Table 3. Summary of Edit Failure Log

Product       Price      Data                        Total    Avg/wk.   Avg. #     Screen    Avg # Screen
              HI or LO                                                  Failures   Changes   Changes/Failure
Heating Oil   H          # Failed records             3478     289.8      13.2       6.0          2.7
                         # Changed and fail again        1       0.1       0.0
                         # First time failures         584      48.7       2.2
Heating Oil   L          # Failed records             4410     367.5      16.7       6.1          2.2
                         # Changed and fail again        5       0.4       0.0
                         # First time failures         728      60.7       2.8
Heating Oil              # Failed records             7888     657.3      29.9       6.0          1.2
                         # Changed and fail again        6       0.5       0.0
                         # First time failures        1312     109.3       5.0
Propane       H          # Failed records             3841     320.1      13.3       5.4          2.2
                         # Changed and fail again       19       1.6       0.1
                         # First time failures         705      58.8       2.4
Propane       L          # Failed records             2722     226.8       9.5       5.4          3.1
                         # Changed and fail again       13       1.1       0.0
                         # First time failures         504      42.0       1.8
Propane                  # Failed records             6563     546.9      22.8       5.4          1.3
                         # Changed and fail again       32       2.7       0.1
                         # First time failures        1209     100.8       4.2
Total                    # Failed records            14451    1204.3      50.2       5.7          1.3
                         # Changed and fail again       38       3.2       0.1
                         # First time failures        2521     210.1       4.4
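As a rough illustration of how the Table 3 counts and ratios could be derived from such an error log, the following Python sketch tallies failed records and unique (first-time) failures per product; the log field names are assumptions, and only the ratio definition follows the text.

    # Sketch of deriving Table 3 summary quantities from an error log.
    # Field names ('product', 'company', 'reference_period') are assumptions.
    from collections import defaultdict


    def summarize_log(records):
        """records: iterable of dicts, one per logged edit failure."""
        failed = defaultdict(int)       # every logged failure, duplicates included
        first_time = defaultdict(set)   # unique (company, period) failures

        for r in records:
            key = r["product"]
            failed[key] += 1
            first_time[key].add((r["company"], r["reference_period"]))

        summary = {}
        for key, total in failed.items():
            unique = len(first_time[key])
            # Under the assumption made in the text (the same set of failures
            # recurs at each click), this ratio indicates how many times the
            # review button was clicked, i.e. the number of screen changes.
            summary[key] = {
                "failed_records": total,
                "first_time_failures": unique,
                "screen_changes": total / unique if unique else 0.0,
            }
        return summary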

The average respondent failure rate across both products was 4.4, compared to the 5.7 screen change rate, but again, the failure rate varied by product and respondent from a low of 1.4 failures to a high of 10 failures averaged across reference periods. The average number of screen changes per edit failure is displayed in the last column of Table 3. This shows a rate of 1.3 screen changes per failure overall, indicating that, on average, respondents return to the main screen one or more times per failure. At first, this finding would lead one to think that the respondents are viewing the edit failures one at a time and returning to the main screen to apply a correction. However, in general, this is not the case.


When the respondent returns to the review screen, for the most part the same edit failures appear with no data changes. What the log of failures cannot show, however, is the possibility that respondents return to the main screen and enter new data that neither fail nor affect the previously failed data enough to change their edit failure status. The screen change rates per edit failure were also compared to the findings from the questionnaire. Comparing the screen-change-to-failure rates to the responses to question 2, we find that the respondents who answered that they ignored the information provided on the review screen had rates, as determined from the log, that were lower than most, ranging from 0.2 to 1.3 across the products. Clearly, these respondents do not go back and forth from the review screen to the main screen for each edit failure, but they use the review button more than would be expected given that they said they ignored the information. Similarly, we can compare information for the other extreme of respondents, those who reported on the questionnaire that they reviewed the information for each failed company one at a time, verifying the information and correcting as necessary by returning to the main screen before proceeding to the next flagged company. This would imply screen-change-to-failure rates of 1.0 or greater, and these respondents would be expected to have the largest rates. The log showed that one of these respondents had a lower rate than expected, only 0.6, but the remaining respondents ranged from 1.2 to 3.0 screen changes per edit failure across the two products. The respondents who reported making a note (outside of the system) and then following up on the edit failures had, in general, the highest rates, ranging from 1.1 to 5.1 screen changes per failure across the products. The concern regarding the finding that only 80% of the respondents understood that the mean price change represents only the prices entered at the point at which the review button was hit was also investigated. Review of the individual respondent logs for the 20% who did not understand showed that most of these respondents (75%) had very high screen changes per edit failure and showed evidence of editing data based on partial information, thereby affecting the edit rule and the resulting failures. The logs for these respondents did not show the same set of edit failures each time the review button was clicked. As expected, this was particularly true of the subset of respondents who had also answered on the questionnaire that they invoked the edit after each or some entries were made, but not all entries. Further study of the edit failure log by reference period and create date revealed that a substantial number of the records were created after preliminary estimation by the survey, and some even after final estimation (one week after the preliminary estimate). The separation of failures by preliminary and final estimate dates is now highlighted as an area of future study. Future research will also include an examination of the logs to determine the efficacy of the edit rule, and an examination by company to highlight repeat-failure companies in order to discover whether their data are repeatedly reported erroneously or the edit rule is not appropriate for them.
4. CONCLUSIONS

While substantial progress has been made in EDR, both in the number of surveys providing the option and in the increased usage of the option by respondents, significant work remains in developing an EDR editing strategy. The strategy must recognize the balance between the possibly conflicting quality goals of maximizing the use of the EDR option by respondents and minimizing the errors in the submitted data. The EDR strategy must take into account when and whether to use hard or soft edits, when to invoke the edit in the data entry process, how to present messages regarding the edit failures, and how to navigate efficiently to correct errors and/or submit the data. Cognitive testing and respondent interviews or questionnaires are useful in designing an EDR editing strategy or improving an EDR editing approach already in use.


EDR expands the “self-administered” role of the respondents to include their interaction with the edit process. As a result, new indicators of the performance of the edit process must be constructed and analyzed. As demonstrated in this case study, logs generated by respondent actions are useful not only for measuring the performance of the edit rules or validating information obtained through cognitive testing or questionnaires, but, just as importantly, as hard evidence of how respondents actually use the edit process. This case study demonstrated that the respondents' process, how and when the edit is invoked and how and when the failures are resolved, may impact the resulting edit failures, which in turn affect the quality of the data.


EDR IMPACTS ON EDITING By Ignacio Arbues, Manuel Gonzalez, Margarita Gonzalez, José Quesada and Pedro Revilla, National Statistics Institute, Spain

1. INTRODUCTION

Electronic Data Reporting (EDR) and, in particular, Web surveys offer the opportunity for new editing strategies. Moving editing closer to the respondent can contribute significantly to improving editing effectiveness, and we can go a step further by integrating the respondents into the editing processes. Electronic questionnaires offer new possibilities for moving editing closer to the respondent. Built-in edits allow respondents to avoid errors as they are made, and the elimination of data keying at the statistical agency removes a common source of error. Hence, some of the traditional editing tasks could be reduced. Many statistical institutes offer electronic questionnaires as a voluntary option in order to improve the efficiency of statistical processes and to reduce respondent burden. Hence, a mixed mode of data collection (partly paper, partly electronic) is used. Global strategies should be designed, because data editing strategies may differ between paper and electronic questionnaires. Some crucial questions arise: What kind of edits should be implemented on the electronic forms? How many? Only fatal edits, or fatal edits and query edits? What kind of edits should be mandatory?

Like many other statistical institutes, the National Statistical Institute of Spain (INE) has a significant interest in Web-based data reporting. An example of this was the possibility offered to all citizens to fill in the Population Census 2001 using the Internet. The INE is working on a general project of giving reporting enterprises the option of submitting their responses to statistical surveys using the Internet. A major target of this project is to offer the reporting enterprises another option for filling in the questionnaires, in the hope of reducing respondent burden or, at least, improving our relationship with them. While there are many expectations about the role of Web questionnaires in the years to come, their use is often lower than expected. More research is needed into the reasons why the rate of use of electronic questionnaires is quite low while the technical conditions are available to many of the respondents. Probably, the electronic forms do not have as many advantages for the respondents as for the statistical offices. For this reason, encouraging the use of Web questionnaires by respondents is a key issue. Several methods can be used, such as explaining the benefits to the respondents or considering statistical Web questionnaires in the wider context of all administrative duties and all EDR (e-commerce, e-administration, etc.). Giving incentives (temporary access to information, free deliveries of tailored data) is another method to increase the take-up of Web questionnaires.

This paper explores the possibilities of Web questionnaires for reducing editing tasks. The combination of built-in edits and a selective editing approach appears very promising. Our target for the future is that, after implementing correct Web edits, no traditional microediting will be needed. Some practical experiences in the Spanish Monthly Turnover and New Orders Survey are presented. For this survey, we offer tailored data from the Web: when an enterprise sends a valid form, it immediately receives tailored data from the server. Taking this advantage into account, we expect more enterprises to use the Web survey.
In the following section, the challenges and opportunities of electronic questionnaires are discussed. In section 3, some practical experiences are presented. The paper concludes with some final remarks.


2. CHALLENGES AND OPPORTUNITIES OF ELECTRONIC QUESTIONNAIRES

Web surveys offer new opportunities for moving editing closer to respondents. Whereas Computer Assisted Interviewing (CAI) integrates into one step previously distinct phases such as interviewing, data capture and editing, Web surveys go a step further by shifting such activities to the respondent. Hence, Web surveys offer the opportunity to re-engineer editing processes in a way that lets reporting enterprises play a more active role in data editing. Many statistical offices are experimenting with different EDR options in data collection. Web surveys offer some advantages over other, more complex EDR methods. The Web is a mature technology for EDR because of its widespread acceptance in enterprises and institutions (and, increasingly, also in households). The prerequisites are only a PC, access to the Internet and a browser; there is no need, in principle, to install other software at the reporting enterprises. The Web makes it simple to put electronic forms at the disposal of almost every enterprise, whatever its size.

Several advantages could be expected from using Web surveys, including improved accuracy and timeliness and reduced survey costs and enterprise burden. Improved accuracy results from built-in edits, which allow the reporting enterprises to avoid errors as they are made. The elimination of data keying at the statistical agency removes a common source of error and also reduces the processing time of the survey. Other factors can contribute to improved timeliness as well: data transfer over the Web can be done much faster than using the postal system, and some electronic devices (automatic data fills and calculations, automatic skipping of non-applicable questions, etc.) could help the respondent to fill in the questionnaire faster. The cost for statistical offices of carrying out a survey using the Web could decrease: savings could be achieved by reducing storage, packing and postal charges and by eliminating data keying and keying verification. Some of the editing tasks could also be reduced thanks to built-in edits.

Nevertheless, reaching the target of reducing enterprise burden using Web surveys is not so straightforward. The reduction in enterprise burden is not always obvious. The respondents' benefits depend largely on the way metadata support the respondent in filling in the questionnaire (help texts, auto-fill rules, pre-filled data, etc.). In any case, the respondents' benefits need to be clearly explained to convince them to use the Web questionnaire. An important element in improving the acceptance of Web surveys among reporting enterprises is to consider Web questionnaires in the wider context of all their administrative duties and of all electronic data reporting. It is unlikely that reporting enterprises are willing to adapt their systems only for statistical purposes. Hence, statistical offices should be aware of the habits of respondents and try to adapt electronic questionnaires to these trends (for example, e-commerce, e-administration, etc.).

There are many expectations about the role of Web surveys in the years to come. Nevertheless, the implementation of Web surveys and other EDR methods in enterprise surveys (and, even more, in household surveys) has often been lower than expected. The take-up of electronic data reporting for statistical data by business providers is generally less than 10%, and often less than 5% (Branson 2002).
Other studies also find low rates of response via the Internet. For example, Grandjean (2002) finds a rate of 18% for a survey used to construct the Index of Industrial Production in France. Different rates are found in this study by enterprise size (higher for large enterprises than for small and medium ones) and by sector (for example, electronic and electrical industries above the average, furniture industries below the average). In another study, Mayda (2002) finds rates between 5% and 25% in two quarterly surveys on business and agriculture in Canada. Even though the usage of the electronic option by respondents has increased lately (for example, Weir, 2005), it still leaves room for improvement. More research is needed into the reasons why, up to now, the rate of using EDR is quite low, while the technical requirements are available to many of the respondents.


Probably, electronic forms do not have the same advantages for the reporting enterprises as for the statistical offices. For many of the questionnaires, the most time-consuming tasks are looking for the required data and computing the answers; there is little time difference between keying data on a screen and filling in a questionnaire on paper. The advantages for the reporting enterprises would probably be bigger if the information could be extracted straight from their files, but this procedure may be expensive for both reporting enterprises and statistical agencies, because an initial investment is needed. In any case, for most surveys it is clear that, at the moment, EDR cannot be the only way of data collection. Paper data collection and the associated procedures (like scanning) are probably going to stay with us for some years. Hence, a mixed mode of data collection (partly paper, partly electronic) should be used. Global strategies should be designed, because data editing strategies differ between paper and electronic questionnaires. There are two contradictory targets: on the one hand, to implement a single point of entry for all agency surveys, with a uniform security model and a common look across the entire site, and, on the other hand, to allow decentralized applications to cope with the singularities of individual surveys. One aspect where the differences among surveys have to be taken into account is data editing. Combining the two targets (i.e. integrating a centralized platform with decentralized applications) is a non-trivial task.

Some crucial questions arise: What kind of edits should be implemented on the Web? How many? Only fatal edits, or fatal edits and query edits? What kind of edits should be mandatory? When should the edits be performed: after each data item, or after the whole form has been sent to the server? On the one hand, we need to include some edits. If we do not, then the information collected by a Web survey will have to go through the editing procedures in exactly the same way as information collected on paper, and we would lose an essential advantage of Web surveys: there is no need to edit the information again if a suitable set of edits is implemented in the Web application. On the other hand, we need to be extremely careful with the set of edits to be implemented in the Web survey: if we implement a large set, then respondents will give up and prefer the freedom they have on paper. Too many edits could even irritate the reporting enterprises and increase the burden. In that case we would lose all the advantages of Web surveys, as users would prefer the easy way (paper).

How can we cope with the too-few/too-many edits dilemma? If we are trying to implement a Web questionnaire in an existing survey, one way is to analyse the current set of edits in order to determine the efficient set of edits to be used in the Web implementation. Hence, the implementation of new procedures obliges us to revise and redesign the current procedures of the survey. But we should make that revision from the user's point of view; otherwise, it would be impossible to find out whether the users are going to get fed up with the task of filling in a Web form. It must be stressed that this sort of analysis is strictly necessary in order to implement a suitable set of edits that will not discourage users and that will make it possible not to edit the Web information in the traditional paper way. In order to achieve this target, an analysis similar to that of Martin and Poirier (2002) should be carried out.
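As an illustration of the fatal/query distinction discussed above, the following Python sketch separates edits that block submission from edits that merely raise a warning; the edit definitions and the ranges shown are assumptions, not the INE implementation.

    # Illustrative sketch of fatal (blocking) versus query (warning) edits.
    def run_edits(form, edits):
        """edits: list of (name, kind, check) where kind is 'fatal' or 'query'
        and check is a function returning True when the form passes."""
        fatal_failures, query_warnings = [], []
        for name, kind, check in edits:
            if not check(form):
                (fatal_failures if kind == "fatal" else query_warnings).append(name)
        # A form can be submitted only when every fatal edit passes;
        # query warnings are shown but never block the respondent.
        return {"can_submit": not fatal_failures,
                "fatal": fatal_failures,
                "query": query_warnings}


    # Example with one fatal edit (non-negative values) and one query edit
    # (turnover change within a broad, illustrative range); assumes the
    # previous turnover value is non-zero.
    example_edits = [
        ("non_negative", "fatal", lambda f: all(v >= 0 for v in f.values())),
        ("turnover_change", "query",
         lambda f: -0.99 <= (f["turnover"] / f["turnover_prev"] - 1) <= 98.0),
    ]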
It is important to have procedures allowing access to versions of the data and to additional processing metadata that describe how the data were transformed from collection to dissemination.

3. SPANISH EXPERIENCES ON WEB SURVEYS: THE TURNOVER AND NEW ORDERS SURVEY

Like many other statistical agencies, the INE has a significant interest in Web-based data reporting. An example of this was the possibility offered to all citizens to fill in the Population Census 2001 using the Internet. The INE is working on a general project of giving respondents the option of submitting their responses to statistical surveys using the Internet.


A major target of this project is to offer the respondents another option for filling in the questionnaires, in the hope of reducing respondent burden or, at least, improving our relationship with them. Moreover, an ad hoc prototype Web system to collect establishment data for the Turnover and New Orders Survey is also being implemented. This monthly survey uses a very simple form. Many problems in using the Internet might be due to the various configurations and products installed on the respondents' machines: each respondent's computer can have different components, different versions of operating systems and browsers, and different modem speeds. For this reason, and because the form is very simple, all the programs are run on the server, without the need to install any software on the respondents' computers. The Web form is being offered to the sample of reporting enterprises as a voluntary option for responding to the survey. We think that, probably, many enterprises will not change to the Web form. For this reason, we offer tailored data to them. When an enterprise sends a valid form (i.e. one passing the mandatory edits), it immediately receives tailored data from the server. These tailored data consist of tables and graphs showing the enterprise's trend and its position in relation to its sector. Offering these data through the Web has some advantages (speed, the possibility to edit the file) over sending the same data on paper by mail. Taking these advantages into account, we expect more enterprises to use the Web survey.

A. Editing in the New Orders/Turnover Industrial Survey

A monthly survey provides data to publish the Short Term Indicators of Industrial New Orders and Turnover. The sample size is about 13,200 local units, for which 14 variables are requested:

• Orders: total new orders, domestic market, eurozone non-domestic, European Union non-eurozone, non-EU, stock at the beginning of the month, cancelled orders, orders invoiced and stock at the end of the month.

• Turnover: total turnover, domestic market, eurozone non-domestic, European Union non-eurozone, non-EU.

Hereafter, the variables Total Turnover and Total New Orders will be referred to as ‘main variables’. Since these variables are used for the computation of the indices, they receive a different, more exacting editing process. When new indices are calculated in the future, e.g. for the different markets, more variables will receive the same treatment. The data are collected in three different ways: a little more than 90% of the sample is collected by the local offices of the INE, about 8% is obtained directly by the national headquarters, and since January 2005 a Web reporting system (WRS) has been working experimentally.


Figure 1. Microediting scheme: first editing of paper questionnaires at the INE local offices (1.1) and at INE headquarters (1.2), and first editing in the Web reporting system (1.3); selective editing at INE headquarters (2); manual editing (3.1) and confirmation (3.2) at INE headquarters, leading to the clean file (3.3).

B. First editing of paper questionnaires

The processes 1.1 and 1.2 are very similar, since the edits implemented are the same. The main differences are due to the closeness of the 1.2 editing team to the people in charge of the calculation of the indices and the fact that the same team also performs phase 3. Thus, the result of 1.2 is intended to be more reliable, since the team receives more direct feedback from the computations and analyses done with the edited microdata. Moreover, the team in 1.2 is more specialized (working only on this survey) and more qualified (university degree). The interest of having this data collection channel at INE headquarters is to guarantee the quality of the data from some enterprises of great importance for the computation of the indices. Two kinds of edits are distinguished in phases 1.1 and 1.2: type I and type II. Type I edits are mandatory, checking that (a sketch of these checks follows the list):

• There are no non-numeric characters, negative values are not allowed, and the values are within the range [0, 9999999999]; the upper value is due to the format of the files used for transmitting the data.

• All the variables are reported.

• The market disaggregation sums up to the main variables.

• The stock at the end of the month equals the stock at the beginning, plus new orders received, minus orders cancelled, minus orders invoiced.
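The following Python sketch of the mandatory checks is illustrative only: the variable names are assumptions, while the checks themselves follow the four bullets above.

    # Sketch of the type I (mandatory) edits; variable names are assumed.
    ORDER_MARKETS = ["orders_domestic", "orders_eurozone_non_domestic",
                     "orders_eu_non_eurozone", "orders_non_eu"]
    TURNOVER_MARKETS = ["turnover_domestic", "turnover_eurozone_non_domestic",
                        "turnover_eu_non_eurozone", "turnover_non_eu"]
    REQUIRED = (["orders_total"] + ORDER_MARKETS +
                ["stock_begin", "orders_cancelled", "orders_invoiced",
                 "stock_end", "turnover_total"] + TURNOVER_MARKETS)


    def type_i_edits(q):
        """q: dict of reported values; returns a list of failure messages."""
        failures = []
        # All variables reported, numeric, non-negative and within range.
        for name in REQUIRED:
            value = q.get(name)
            if not isinstance(value, (int, float)) or not 0 <= value <= 9_999_999_999:
                failures.append(f"{name}: missing, non-numeric or out of range")
                return failures  # the balance checks below need valid values
        # Market disaggregation must sum to the main variables.
        if sum(q[m] for m in ORDER_MARKETS) != q["orders_total"]:
            failures.append("order markets do not sum to total new orders")
        if sum(q[m] for m in TURNOVER_MARKETS) != q["turnover_total"]:
            failures.append("turnover markets do not sum to total turnover")
        # Stock identity: end = beginning + new orders - cancelled - invoiced.
        if q["stock_end"] != (q["stock_begin"] + q["orders_total"]
                              - q["orders_cancelled"] - q["orders_invoiced"]):
            failures.append("stock at end of month violates the stock identity")
        return failures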

Type II edits are not mandatory; their only effect is that the person in charge of recording the questionnaire is forced to type a remark. These edits check that:

• The stock of orders at the end of the month equals the stock at the beginning of the next month. This edit was made non-mandatory because some enterprises argued that, due to price changes, the value of the stock is recalculated when passing from one month to the next.

• The change rate of any variable is within a defined range. The rates are computed by comparing the current value with those of the previous month and of the same month of the previous year. The intervals are described in Table 1. It should be noted that they are very narrow, so there is an important risk that the frequency of errors, most of them unnecessary, leads the staff keying the questionnaires to type standard remarks which are of no use to the subsequent process.


                           Previous month        Previous year
Variable                   Min       Max         Min       Max
Total new orders           -30%      +30%        -30%      +30%
Domestic market            -50%      +50%        -50%      +50%
Eurozone non-domestic      -50%      +50%        -50%      +50%
EU non-eurozone            -50%      +50%        -50%      +50%
Non-EU                     -50%      +50%        -50%      +50%
Stock at the beginning     -30%      +30%        -30%      +30%
Orders cancelled           -30%      +30%        -30%      +30%
Orders invoiced            -30%      +30%        -30%      +30%
Stock at the end           -30%      +30%        -30%      +30%
Total Turnover             -30%      +30%        -30%      +30%
Domestic market            -50%      +500%       -50%      +500%
Eurozone non-domestic      -50%      +500%       -50%      +500%
EU non-eurozone            -50%      +500%       -50%      +500%
Non-EU                     -50%      +500%       -50%      +500%

Table 1
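The following Python sketch illustrates how such query edits could be applied, using two of the ranges from Table 1; the variable names and the function itself are illustrative, not the actual keying application.

    # Sketch of the type II (query) rate-of-change edits; two ranges shown.
    RANGES = {  # variable: ((month min, month max), (year min, year max))
        "orders_total":    ((-0.30, 0.30), (-0.30, 0.30)),
        "turnover_non_eu": ((-0.50, 5.00), (-0.50, 5.00)),
    }


    def type_ii_edits(current, previous_month, previous_year):
        """Return the variables whose change rate falls outside the ranges;
        a remark must then be typed for each flagged variable."""
        remarks_needed = []
        for var, (month_range, year_range) in RANGES.items():
            for base, (low, high) in ((previous_month.get(var), month_range),
                                      (previous_year.get(var), year_range)):
                if not base:        # no (or zero) comparison value available
                    continue
                rate = current[var] / base - 1
                if not low <= rate <= high:
                    remarks_needed.append(var)
                    break
        return remarks_needed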

C. First editing in the Web reporting system

The process in the Web reporting system (1.3) is somewhat different. The mandatory edits are the same, except that some variables in the market disaggregation are allowed to remain blank; in this case, the system assumes that their real values are zero (which implies that the remaining variables sum up to the total value reported). There is a strongly reduced list of informative errors, since the change rates are checked only for the main variables. The ranges also differ between the annual and monthly rates and are much broader; the larger width of the monthly intervals is due to the seasonality of some series. A further difference is that, in this case, the remarks are optional. The ranges are described in Table 2.

                       Previous month        Previous year
Variable               Min       Max         Min       Max
Total New Orders       -99%      +9800%      -90%      +4000%
Total Turnover         -99%      +9800%      -90%      +4000%

Table 2

D. Selective editing

The questionnaires from 1.1, 1.2 and 1.3 proceed to phase 2. Some questionnaires are selected for manual editing according to the following criteria:

• Strong variation from the previous values.

• Influence on the value of the branch indices, i.e. the value of the index is very different depending on whether the microdata is used in the calculation or not.

E. Variation

At this stage, the microdata files are augmented with two flags indicating whether the variation rates of the main variables exceed some limits. There is one flag for each main variable, which is activated if any of the ranges is exceeded. The limits are the ones in Table 2. In fact, these ranges were applied to the Web reporting system after being used for selective editing.


One aim of the experiment is to know whether the fact that the reporter is warned about values exceeding the limits allows more relaxed later editing. When the flag is activated, the microdata is not used in the computation of the index unless it is validated in phase 3 of the editing scheme. Otherwise, the data is considered validated, so it is used in the computations unless its validity is revoked in phase 3.

F. Influence

The influence is computed in a different way depending on whether the variation flag is activated or not:

• In the first case, the aim of computing the influence is to know the effect of validating the microdata. Thus, we compare the monthly rate of the branch with and without the microdata. By comparing the variation rate instead of the index, we remove the effect of the index level, which can be very different between activities.

• In the second case, the influence has a further function, namely to detect anomalous values. Thus, due to the seasonality of many series, it proved more useful to compare the annual rate of the branch with the one obtained by removing the value under analysis from the current total (month t) and removing the value of the previous year from the corresponding total (month t-12).

The first influence is easy to calculate. We compute the variation rates:

$$ V^M = \frac{\sum_j \breve{x}^{\,t}_{i,j}}{\sum_j \hat{x}^{\,t-1}_{i,j}}, \qquad V^M_k = \frac{x^{\,t}_{i,k} + \sum_j \breve{x}^{\,t}_{i,j}}{x^{\,t-1}_{i,k} + \sum_j \hat{x}^{\,t-1}_{i,j}} \qquad \text{(EQ 1)} $$

where:

• $V^M$ is the monthly rate of variation with all data and $V^M_k$ is the rate obtained by adding the value under analysis;

• $\breve{x}^{\,t}_{i,j}$ is the value of unit $j$ in activity $i$ at time $t$, among the validated units common to time $t-1$;

• $\hat{x}^{\,t}_{i,j}$ is the value of unit $j$ in activity $i$ at time $t$, among the validated units common to time $t+1$;

• $x^{\,t}_{i,k}$ is the value of the (non-validated) unit under analysis $k$ in activity $i$ at time $t$.

The influence is then computed as:

$$ I^M_k = 100 \times \left( V^M_k - V^M \right) \qquad \text{(EQ 2)} $$
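A small Python sketch of the monthly influence computation (EQ 1 and EQ 2) is given below; the data structures are assumptions, while the arithmetic follows the formulas as reconstructed above.

    # Sketch of the monthly influence of a non-validated unit k.
    def monthly_influence(x_t, x_t1, k):
        """x_t, x_t1: dicts of validated values at t and t-1; only units
        present in both periods enter the sums. k: (value_t, value_t1) of
        the non-validated unit under analysis, not included in the dicts."""
        common = [j for j in x_t if j in x_t1]
        num = sum(x_t[j] for j in common)
        den = sum(x_t1[j] for j in common)

        v_m = num / den                          # EQ 1, all validated data
        v_m_k = (k[0] + num) / (k[1] + den)      # EQ 1, adding unit k
        return 100 * (v_m_k - v_m)               # EQ 2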

The influence measure used for the validated microdata involves more complex calculations. The following formulae are used:

$$ V^A = \prod_{s=1}^{12} \frac{\sum_j \breve{x}^{\,t-s+1}_{i,j}}{\sum_j \hat{x}^{\,t-s}_{i,j}}, \qquad V^A_k = \frac{\sum_{j \neq k} \breve{x}^{\,t}_{i,j}}{\sum_j \hat{x}^{\,t-1}_{i,j}} \left( \prod_{s=2}^{11} \frac{\sum_j \breve{x}^{\,t-s+1}_{i,j}}{\sum_j \hat{x}^{\,t-s}_{i,j}} \right) \frac{\sum_j \breve{x}^{\,t-11}_{i,j}}{\sum_{j \neq k} \hat{x}^{\,t-12}_{i,j}} \qquad \text{(EQ 3)} $$

where:

• $V^A$ is the yearly rate of variation with all data and $V^A_k$ is the rate obtained by removing values as explained above;

• $\breve{x}^{\,t}_{i,j}$ is the value of unit $j$ in activity $i$ at time $t$, among the validated units common to time $t-1$;

• $\hat{x}^{\,t}_{i,j}$ is the value of unit $j$ in activity $i$ at time $t$, among the validated units common to time $t+1$.

Thus, the influence can be computed as:

$$ I^A_k = 100 \times \left( V^A - V^A_k \right) \qquad \text{(EQ 4)} $$

G. Interpretation of the influence

We can express the monthly influence in a different form:

$$ I^M_k = 100 \times \left( \frac{x^{\,t}_{i,k} + \sum_j \breve{x}^{\,t}_{i,j}}{x^{\,t-1}_{i,k} + \sum_j \hat{x}^{\,t-1}_{i,j}} - \frac{\sum_j \breve{x}^{\,t}_{i,j}}{\sum_j \hat{x}^{\,t-1}_{i,j}} \right) = 100 \times \frac{\left( x^{\,t}_{i,k} + \sum_j \breve{x}^{\,t}_{i,j} \right) \sum_j \hat{x}^{\,t-1}_{i,j} - \left( x^{\,t-1}_{i,k} + \sum_j \hat{x}^{\,t-1}_{i,j} \right) \sum_j \breve{x}^{\,t}_{i,j}}{\left( x^{\,t-1}_{i,k} + \sum_j \hat{x}^{\,t-1}_{i,j} \right) \sum_j \hat{x}^{\,t-1}_{i,j}} $$

If the value of $x^{\,t-1}_{i,k}$ is small compared with the total $\sum_j \hat{x}^{\,t-1}_{i,j}$, we can make the approximation:

$$ I^M_k \cong 100 \times \frac{x^{\,t}_{i,k} \sum_j \hat{x}^{\,t-1}_{i,j} - x^{\,t-1}_{i,k} \sum_j \breve{x}^{\,t}_{i,j}}{\left( \sum_j \hat{x}^{\,t-1}_{i,j} \right)^2} = 100 \times \frac{x^{\,t-1}_{i,k}}{\sum_j \hat{x}^{\,t-1}_{i,j}} \left( \frac{x^{\,t}_{i,k}}{x^{\,t-1}_{i,k}} - \frac{\sum_j \breve{x}^{\,t}_{i,j}}{\sum_j \hat{x}^{\,t-1}_{i,j}} \right) \qquad \text{(EQ 5)} $$

Thus, the influence is approximately equal to the weight of the unit $k$ in $t-1$, multiplied by the difference between the rates of variation of the unit and of the whole branch. In a similar way, for the annual influence we can compute:

$$ I^A_k \cong 100 \times A \, \frac{x^{\,t-12}_{i,k}}{\sum_j \hat{x}^{\,t-12}_{i,j}} \left( \frac{x^{\,t}_{i,k}}{x^{\,t-12}_{i,k}} - \frac{\sum_j \breve{x}^{\,t}_{i,j}}{\sum_j \hat{x}^{\,t-12}_{i,j}} \right) $$

where:

$$ A = \left( \sum_j \hat{x}^{\,t-1}_{i,j} \right)^{-1} \left( \prod_{s=2}^{11} \frac{\sum_j \breve{x}^{\,t-s+1}_{i,j}}{\sum_j \hat{x}^{\,t-s}_{i,j}} \right) \sum_j \breve{x}^{\,t-11}_{i,j} $$

The factor A is common to all units k, so it has no effect on the comparison between the units of a branch.

H. Selection

The questionnaires are then distributed according to Table 3.

CASE                                                    EDITING
Validated                                               The microdata with greater influence are selected for manual
                                                        editing. The threshold is decided by the editing team depending
                                                        on several criteria, such as the weight of the branch and
                                                        results from the macro editing.
Non-Validated and Monthly influence in [-1%, 1%]
Non-Validated and Monthly influence out of [-1%, 1%]    The microdata are edited manually one by one. Confirmation with
                                                        the reporting enterprise is requested.

Table 3
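The following Python sketch illustrates the selection logic of Table 3 under the assignment of actions to cases as reconstructed above (which is partly uncertain in the source); the record layout and names are assumptions.

    # Sketch of the selection step; the 1% band comes from Table 3, the
    # influence threshold is set by the editing team.
    def select_for_manual_editing(records, influence_threshold):
        """records: iterable of dicts with 'validated' (bool) and
        'influence' (percentage points, as in EQ 2 / EQ 4)."""
        selected = []
        for r in records:
            if r["validated"]:
                # Validated data: only the most influential values are
                # reviewed, using the team's threshold.
                if abs(r["influence"]) > influence_threshold:
                    selected.append(r)
            elif abs(r["influence"]) > 1.0:
                # Non-validated data outside [-1%, 1%]: edited one by one,
                # with confirmation requested from the enterprise.
                selected.append(r)
        return selected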

4. FINAL REMARKS

A prerequisite of any editing strategy is obtaining high-quality incoming data. The problem of how to obtain high-quality incoming data can be approached from the perspective of Total Quality Management (TQM). Under a TQM approach, the suppliers (i.e. the reporting enterprises) are considered part of the production system. Hence, the statistical process begins with the production and transmission of microdata by the reporting enterprises. Data collection has to be adapted to the respondents' conditions and possibilities. Also, statistical agencies should implement strategies to encourage respondents to fill in questionnaires with confidence and care. A key success factor in achieving high-quality incoming data is improving our relationship with reporting enterprises. One of the ways this is being achieved is by offering the enterprises, free of charge, data tailored to their needs. According to our experience, this practice (Gonzalez and Revilla, 2002) has led to an increase in interest on the part of the enterprises, which are consequently filling in questionnaires more carefully. Moreover, an “auditing system” carried out by reporting enterprises follows from that practice (Revilla, 2005). The combination of the TQM approach, the built-in edits used in EDR and the selective editing strategy appears very promising. Our target for the future is that, after implementing correct Web edits, no traditional microediting will be needed. A selective editing approach based on statistical modelling will be used so that the most influential suspicious values can be detected. Hence, all fatal errors and the most important query errors could be corrected before the survey is disseminated.


References

[1] Branson, M. (2002), “Using XBRL for data reporting”, Australian Bureau of Statistics. UNECE/Eurostat Work Session on Electronic Data Reporting, Working Paper No. 20, February 2002.

[2] Gonzalez, M. and Revilla, P. (2002), “Encouraging respondents in Spain”, The Statistics Newsletter, OECD, Issue No. 12.

[3] Grandjean, J.P. (2002), “Electronic data collection in Official Statistics in France”, French National Institute of Statistics (INSEE). UN/ECE/Eurostat Work Session on Electronic Data Reporting, Working Paper No. 7, February 2002.

[4] Martin, C. and Poirier, C. (2002), “Analysis of data slices and metadata to improve survey processing”, Statistics Canada. UN/ECE Work Session on Statistical Data Editing, Working Paper No. 2, Helsinki, May 2002.

[5] Mayda, J. (2002), “Experiences with implementation of EDR into existing survey programs”, Statistics Canada. UN/ECE/Eurostat Work Session on Electronic Data Reporting, Working Paper No. 23, February 2002.

[6] Revilla, P. (2005), “Auditing by reporting enterprises”, 55th Session of the International Statistical Institute, Sydney, April 2005.

[7] Weir, P. (2005), “The Movement to Electronic Reporting and the Impact on Editing”, 55th Session of the International Statistical Institute, Sydney, April 2005.


EVALUATION REPORT ON INTERNET OPTION OF 2004 CENSUS TEST: CHARACTERISTICS OF ELECTRONIC QUESTIONNAIRES, NON-RESPONSE RATES, FOLLOW-UP RATES AND QUALITATIVE STUDIES
By Danielle Laroche, Statistics Canada

1. INTRODUCTION

By law, Statistics Canada (STC) is required to conduct a census every five years, and all households are required to complete a census questionnaire. The last census took place in 2001, while the next is scheduled for May 16, 2006. The de jure method is followed, meaning that people are counted at their usual residence in Canada regardless of their location on Census Day. In 2001, Canada had nearly 12 million households and a population of more than 30 million. Approximately 98% of households were counted using the self-enumeration method, and the remaining 2% were counted via interview. Major changes are being implemented for the 2006 Census in relation to data collection and processing methods. These changes are primarily as follows: questionnaires will be mailed out to two-thirds of dwellings, with enumerators delivering questionnaires to the remaining dwellings as in the past; completed questionnaires will be returned to a single processing centre rather than to enumerators; questionnaires returned by mail will be scanned and their data captured automatically; telephone follow-up for incomplete questionnaires will be conducted from the Census Help Line sites using a Computer Assisted Telephone Interview (CATI) application; and, last but not least, all households in private dwellings will have the option to complete and submit their questionnaire via the Internet.

Use of the Internet in data collection is not new to Statistics Canada; several business surveys have been conducted using this collection method. However, the Internet option for household surveys is relatively recent. That is, it has only been utilized in the Census, first with a major test during the 2001 Census, then as part of the Census Test in May of 2004, when Statistics Canada carried out the testing of almost all systems and operations to be used during the 2006 Census. As in the Census, two types of questionnaires were used to collect the majority of the Census Test data. The short-form questionnaire, or form 2A, was distributed to four households out of every five. The long-form questionnaire, referred to as form 2B, was distributed to one of every five households; it contains all of the questions appearing on form 2A plus questions on a range of topics. The 2004 Census Test was conducted within a limited number of test regions in Nova Scotia, Quebec, Manitoba and Saskatchewan. These sectors were selected based on their socioeconomic characteristics, the proportion of francophone and anglophone households and the availability of farms, with a view to establishing a sample of both mail-out and list/leave areas.

This report describes the characteristics of the electronic questionnaires, presents the main results of the qualitative tests and summarizes the analyses of partial non-response and failed-edit follow-up rates for the 2004 Census Test questionnaires. Assessment of the preliminary quality of the data collected consisted of analysis of partial non-response rates and of the percentage of questionnaires sent to failed-edit follow-up. These two indicators are key factors in determining the initial data quality and can reveal problems related to specific questions.

2. CHARACTERISTICS OF ELECTRONIC QUESTIONNAIRES

In general, the electronic questionnaire is identical to the paper questionnaire in terms of question wording, instructions and response options.
In terms of functionality, respondents have the option to save an Internet 2B questionnaire and complete it over multiple sessions from different computers.


Respondents can also switch the questionnaire language, i.e. French or English. Additionally, the Internet application was made to be as consistent as possible with standards and guidelines for the presentation of federal government Web sites. All Web pages of the Government of Canada strive to have similar characteristics, and the electronic questionnaires follow the same straightforward, efficient format. The Government of Canada places great emphasis on the uniform presentation of its Web sites, which provides the benefit of ensuring users a consistent experience visit after visit. The use of strict standards also enhances the satisfaction level of respondents using government sites.

The first screen is a welcome page giving users the opportunity to verify their computer requirements and settings. If respondents have an adequately recent and properly configured browser, then the next screen is displayed. If not, then respondents can click links to view troubleshooting information to help them modify their browser settings or download a more recent version of the required browser or Java Virtual Machine (JVM).

The second screen is the access code page. Access codes are printed on the front page of the paper questionnaires. They are unique, randomly generated, 15-digit numbers segmented into five groups of three digits to make them more user-friendly. After entering their unique access code, respondents click the Start button at the bottom of the page to validate their code. If the code is valid, then the application automatically selects the appropriate questionnaire type in the appropriate language and displays the first page of the questionnaire.

In the first part of the census questionnaire, respondents are asked to provide their telephone number and household address. If the household received its questionnaire via the mail, then the access code is associated with the address in the Master Control System (i.e. the master list of dwellings, which includes the addresses used for mail-out), and the address is displayed for the respondent to confirm. The paper and electronic questionnaires are identical in this regard. If the dwelling is in a list/leave area where the questionnaires are delivered by enumerators, the respondent is required to provide the household address. The Province field has a drop-down list containing the possible response options; respondents simply click this field to view a list of provinces and territories. In the electronic questionnaire, all fields for province selection use this drop-down list format. The month and day of the date of birth question follow the same format.

A bar representing respondents' completion status or progress in the questionnaire appears in the left-hand column on all screens. This feature gives respondents an indication of how much of the survey they have completed and how much remains. A help function appears in the form of a link in the left-hand column on the screen under the completion status bar. Respondents can click the help link for assistance with the current question. The help function contains instructions and examples to help respondents ensure that their responses to all questions are as accurate as possible. In addition to the help link, the left-hand column is configured with explanations for respondents as to why they are being asked each question and how the information they provide will be used. This information has been found to enrich the user experience when completing the online census.
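As an illustration only (the actual application is not documented here), the following Python sketch normalizes and checks a 15-digit access code and renders it in five groups of three digits.

    # Sketch of access-code handling; purely illustrative.
    import re


    def normalize_access_code(text):
        """Strip separators and verify the code is exactly 15 digits."""
        digits = re.sub(r"[\s-]", "", text)
        return digits if re.fullmatch(r"\d{15}", digits) else None


    def display_format(code):
        """Render a 15-digit code as five groups of three digits."""
        return " ".join(code[i:i + 3] for i in range(0, 15, 3))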
Questions and response options appear in a box at the centre of the screen to make them stand out. Each question is displayed on a background colour corresponding to the colours used in the paper questionnaire. Internet standards are followed, including check boxes to indicate that multiple responses are possible and circles, or radio buttons, to indicate that only one response is possible. When a radio button associated with a write-in response is selected, the cursor moves automatically to the field in which respondents are expected to type their answers.


If respondents simply start typing a response in a write-in field, then the corresponding radio button is selected automatically.

Respondents navigate within the questionnaire using control buttons located at the bottom of each screen. When respondents click the Next button, the data are encrypted and sent to Statistics Canada’s secure server, where they are then decrypted and verified. If no problem is detected with the answers, the server sends the next appropriate screen to the respondent’s browser. If there is a problem with the respondent’s answers, the server encrypts the current screen’s data and sends it back to the respondent’s browser, where it is decrypted and displayed with a validation message. This process is called two-way encryption. That is, data entered by respondents are encrypted by their browser and sent to the server where they are decrypted; data coming from the server are encrypted and sent to the respondent’s browser where they are decrypted.

Respondents may also return to the previous screen by clicking the Go Back button. The Stop and Finish Later button gives respondents the option to save a partially completed questionnaire and fill in the remainder later. After clicking this button, respondents are prompted to choose a password or to let the application assign one. When they return to finish their questionnaire, they are prompted to enter their original access code and then given five attempts to enter their password correctly. For security purposes, if respondents are unsuccessful or if they do not log back in within the prescribed period, then their partially completed questionnaire is submitted on their behalf. Finally, respondents can click the Cancel button to simply exit a session; no data are saved if they exit in this manner.

Four types of validation messages are possible. Non-response messages appear when respondents have not answered a question. Partial response messages appear when respondents provide only a partial response to a question, for example, if they omit the city name from their address. Invalid response messages appear for numerical responses when respondents enter a number outside of the range established for a question. Finally, soft edit messages appear only for questions relating to money amounts, whenever the amount in a response appears unusual. This type of message asks respondents to verify that they have entered the correct amount, for example, “Please verify the amount you entered for part (f), if correct leave as is”.

All of these messages follow the same approach. When respondents click the Next button, the information on the current page is validated and, if necessary, the application displays the same screen again, noting any problems at the top of the page in red text, for example, "Please answer question 5 about John Doe." The question and field requiring attention appear in red, and a red arrow highlights the missing response to assist the respondent, who can then either fill in the missing information or continue to the next screen. If the respondent chooses to move on without making any changes, then the next screen is presented. If the respondent adds or changes any information, then the responses are validated again. This approach is consistent with the Common Look and Feel guidelines prescribed for Canadian government Web sites, in that pop-up windows should not be used within pages to convey information to respondents.
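The following Python sketch illustrates how the four message types could be produced for a single answer; the field description and thresholds are assumptions, and only the message categories and the soft-edit wording come from the text.

    # Sketch of generating the four validation message types.
    def validate_answer(field, value):
        """field: dict describing a question (label, kind, range, parts,
        soft_limit); returns (message_type, text) or None."""
        if value in (None, "", []):
            return ("non_response", f"Please answer question {field['label']}.")
        if field.get("parts") and any(value.get(p) in (None, "") for p in field["parts"]):
            return ("partial_response", f"Please complete question {field['label']}.")
        if field.get("kind") == "numeric":
            low, high = field["range"]
            if not low <= value <= high:
                return ("invalid_response", f"Please verify question {field['label']}.")
        if field.get("kind") == "amount" and value > field.get("soft_limit", float("inf")):
            # Soft edit: ask the respondent to verify, but accept the value.
            return ("soft_edit", "Please verify the amount you entered; "
                                 "if correct leave as is.")
        return None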
The electronic questionnaire follows primarily the matrix format but also, in places, the sequential format. With the matrix format, each question appears only once, and response options are repeated under the name of each person in the household. Usability tests have demonstrated that this format reduces the response burden, since respondents have to read each question only once and can then respond for all members of the household. Another advantage of the matrix approach is that it reduces the number of screens and, as a result, requirements with regard to system infrastructure. With the sequential format, questions are asked about one person at a time. As a result, questions are repeated as many times as there are persons in the household. The sequential format supports increased customization of questionnaires. For one, it allows a respondent's name to be directly incorporated into each question. The sequential format is used in two places on the electronic version of form 2B: questions 40 to 46 concerning labour market activities and question 52 on income. Usability tests have indicated that it is easier for respondents to focus on one person at a time in responding to these particular questions.


The electronic version of form 2B has two types of automated skip patterns. The first relates to all members of the household. With this skip type, any questions deemed non-applicable are not displayed. However, since the questions are numbered, a message appears at the top of the screen for the next applicable question to indicate to the respondent that one or more non-applicable questions were skipped. In this situation, a skip message like this would be presented: “Based on your response to Question 11, Question 12 is not applicable. Therefore, proceed with Question 13”. The second type of automated skip relates to questions applicable to one or more persons but not to all persons in the household. With this skip type, the question must be displayed, since it applies to some members of the household. In this event, a message appears under the names of persons as appropriate to indicate that the question does not apply to them. By skipping any questions deemed non-applicable, we hope to reduce the response burden, thereby making the user experience more pleasant and less frustrating in comparison to the paper questionnaire. For example, with paper questionnaires respondents sometimes do not follow skip instructions, which results in a considerable increase in response burden.

The electronic questionnaire has two mandatory questions, one on the number of persons staying at the address on Census Day (Step B1), the other on the names of the members of the household (Step B2). The second screen depends on the first; for example, if a respondent indicates "3" for the number of persons in the household, then the application generates three lines to type the names of these persons. The respondent must provide a first or a last name for each person. These names are subsequently used to customize responses and selected questions. These two questions are the only mandatory questions in the Internet questionnaire.

One of the main differences between paper questionnaires and their electronic versions relates to the population coverage questions (the Steps). In the electronic questionnaire, respondents are required to enter the names of all persons staying at their address on Census Day, and three additional questions are then used to eliminate any temporary residents (persons whose usual residence is at another address in Canada) or foreign residents (visitors or government representatives from another country). On the paper questionnaire, respondents list only the names of usual residents at their address; the three questions of the electronic questionnaire are presented in the form of instructions on the paper questionnaire. In either response mode, if all residents at an address are temporary or foreign, then respondents (i.e. the persons filling out the questionnaire) list their name and usual telephone number, and there is no need to respond to any further questions. The application can accommodate up to 36 persons, although it is possible to indicate more persons in the corresponding question.

Age is particularly important in the long-form questionnaire (2B), as persons less than 15 years of age are subject to different validation for the questions on marital or common-law status and are not required to respond to a portion of the questionnaire, including questions on mobility status, education, activities in the labour market and income. Age confirmation provides respondents the opportunity to verify the ages calculated based on the responses given to the question on date of birth.
This confirmation occurs only in the electronic questionnaire. If a date of birth is not indicated, the message "Date of birth not indicated" appears next to the appropriate person's name; if it is invalid, the message "Impossible to calculate age" appears. Respondents can go back to the previous screen to modify their response to the date of birth question or go on to the next screen. If they opt to move on or not to provide a date of birth, the person in question is presumed to be more than 15 years of age, and all applicable questions on form 2B are asked.

On the last page of the electronic questionnaire, the completion status bar indicates that the questionnaire is complete. Respondents then have the option to document any suggestions or comments in the designated space. The Submit button is located at the bottom of the page. When respondents click this button, the application submits the questionnaire and displays an acknowledgment page containing a confirmation number. Respondents can click a button at the bottom of this page to print the number, to retain as evidence that they have submitted their questionnaire in case an enumerator telephones or knocks at their door.
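
As an illustration of the validation and routing behaviour just described, the following minimal sketch (not the actual Census application; the Census Day date, function names and section labels are hypothetical) shows how a date of birth might be checked and how the age-dependent parts of form 2B could be routed:

    from datetime import date

    CENSUS_DAY = date(2004, 5, 11)  # hypothetical Census Day for the 2004 test

    def age_or_message(date_of_birth):
        """Return (age, None) or (None, message), mirroring the messages described above."""
        if date_of_birth is None:
            return None, "Date of birth not indicated"
        if date_of_birth > CENSUS_DAY:
            return None, "Impossible to calculate age"
        age = CENSUS_DAY.year - date_of_birth.year
        if (CENSUS_DAY.month, CENSUS_DAY.day) < (date_of_birth.month, date_of_birth.day):
            age -= 1  # birthday not yet reached in the Census year
        return age, None

    def sections_for(age):
        """Persons under 15 skip mobility, education, labour market and income;
        a missing or unconfirmed age is treated as 15 or over, as on the electronic 2B form."""
        if age is not None and age < 15:
            return ["demographics"]
        return ["demographics", "mobility", "education", "labour_market", "income"]

    age, message = age_or_message(date(1992, 7, 1))
    print(message or sections_for(age))  # person aged 11 in 2004: ['demographics']
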

3. RESULTS OF QUALITATIVE STUDIES (INTERNET RESPONDENTS)

Qualitative studies of the electronic version of form 2B were carried out in respondents' homes in May and June of 2004, in parallel with the 2004 Census test. Approximately 50 interviews were conducted, in both official languages. Most respondents stated that they enjoyed using the application and that they found their experience to be positive and pleasant. They found the electronic questionnaire easy to complete, efficient and user-friendly. Some respondents stated that they appreciated not having to fill out the form manually, while others said it was easier to correct mistakes in the electronic copy. The majority of respondents liked the interactive nature of the application and its "smart" features (questions and answers personalized with the respondents' names, age confirmation, automated skips, etc.). Moreover, the collection method saved respondents the trouble of going to the post office or finding a mailbox.

One of the problems found was not related to the questionnaire itself but rather to the process for gaining access to it: respondents who had to modify their computer settings were generally unable to do so without assistance from an interviewer. The majority of respondents did not know what type or version of browser they had or how to find this information. This could mean lower response rates at census time as a result of unsuccessful attempts to gain access to the electronic questionnaire.

During the qualitative tests, it seemed to us that the 2B Internet questionnaire did not offer respondents significant time savings compared with paper, and may in fact have taken longer in some cases. However, data collected by the Internet application, that is, from the time each respondent logged in to the time the respondent submitted the form, did not confirm this: the electronic questionnaires in fact took less time to complete than the paper version, except in cases where Internet respondents did not use a high-speed connection. In addition, most Internet respondents, even those with "dial-up" connections, reported that they thought the electronic version was faster to use and much more efficient than the paper version. This perception was likely due in part to the interactive aspects of the application and to the fact that, after completing the electronic questionnaire, respondents can submit it instantly, as opposed to the paper version, which must be mailed.

Validation messages, particularly those for non-response or partial response, were effective in that they helped to ensure that respondents provided responses to questions they had inadvertently skipped. In spite of this, respondents found the messages to have a negative connotation, in that they only appeared when respondents did something "wrong". In addition, the general conceptions Internet respondents hold about how forms on the Internet should be completed can lead them to believe that they must provide a response to every single question, without exception, before moving on. For the Census, this false impression could result in undesirable behaviours that might affect data quality; for example, some respondents might invent responses or pick them at random in order to move on. An issue of this nature may be difficult to detect.
4. COMPARISON OF PARTIAL NON-RESPONSE RATES FOR INTERNET AND PAPER QUESTIONNAIRES (UNWEIGHTED)

The rate of partial non-response to questions can provide an indication of the difficulties encountered by respondents. It is useful in assessing the overall level of understanding of the questions and provides a preliminary indication of data quality. A question is generally deemed to be missing a response when a response is required but none has been provided. However, the situation is not always that clear-cut. For example, if question 11 (landed immigrant) has no response, it is impossible to know whether the respondent was or was not supposed to respond to question 12 (year of immigration). In this event, it is presumed that the respondent was not supposed to respond to the question.
As such, the partial non-response rates studied in this section correspond to actual rates, in that only questions for which we were certain there should have been a response were taken into consideration. For some questions, the rates also include incomplete and invalid responses. These rates are unweighted and were calculated on the basis of all non-blank questionnaires returned.

Wherever possible, the Internet questionnaire mirrored the paper form in terms of question wording, instructions provided and response options. However, in comparing non-response rates, certain characteristics of the electronic questionnaires need to be taken into account. Firstly, the use of radio buttons in the response options for certain questions makes it impossible to select more than one response to these questions. Secondly, when respondents click "Continue" at the bottom of each page, any appropriate non-response, partial-response or invalid-response messages (for numerical responses) and soft-edit messages (for amounts) are displayed automatically; respondents can then choose to enter or correct a response or simply to move on. In addition, all skips are automated.

Form 2A is divided into two parts: coverage steps and person data. Form 2B has three parts: coverage steps, person data and household data. For the coverage steps, we did not evaluate non-response but rather only the situations requiring confirmation from a household; these results appear in the section on follow-up rates. The following chart illustrates the non-response rates calculated for both collection methods by questionnaire type (i.e. 2A or 2B) and data type (i.e. person data or household data).

Chart 1: Partial Non-Response Rates by Questionnaire Type and Data Type

                   Internet      Paper
2A Persons           0.01%       2.54%
2B Persons           1.78%       6.91%
2B Household         6.19%      14.04%
2B Total             2.15%       8.30%

As the chart indicates, the non-response rate was almost nil for the Internet version of form 2A (0.01%), while the same rate was 2.54% for the paper questionnaire. For form 2B, the non-response rate for person data was almost four times as high for the paper questionnaire (6.91%) as for the Internet version (1.78%). With regard to household data, the non-response rate for the paper version of form 2B was more than twice as high (14.04%) as for the Internet version (6.19%). When all questions are considered together, the partial non-response rate for 2B was nearly four times as high for the paper questionnaire (8.30%) as for the Internet version (2.15%).

It appears, therefore, that the validation messages used for the Internet version are effective in reducing non-response, and the qualitative studies appear to confirm this hypothesis. However, these studies have also revealed that many Internet respondents believe that they must provide a response to every single question before moving on. For some respondents, the non-response messages are interpreted as a confirmation of this.
Nevertheless, it would appear that the non-response messages help inattentive respondents to correct the majority of unintentional errors or omissions.

Analysis of the partial non-response rates for each question indicates that some types of questions are more likely to have non-responses than others, whatever the collection method. These are generally questions that respondents deem non-applicable (for example, question 5 on common-law status for married respondents or young children, question 8B on activity limitations at work or at school for respondents not working or attending school, question 30 on major field of study for respondents without a certificate or with only a secondary certificate, question 33 on the number of hours spent caring for children for respondents without children, or question 39 on the last date worked for respondents who are not working or have never worked). Other questions are more difficult to answer as a proxy respondent, i.e. on behalf of another person (for example, question 17 on ethnic origin, question 43 on the main activities performed at a person's work or question 46 on place of work). Finally, still other questions are simply more difficult to answer because they depend on memory recall (for example, question 24 on mobility status over the previous five years or question H5 on period of construction of the dwelling), because they require retrieval of documentation (for example, question 52 on income or the questions in section H6 on amounts paid for electricity, oil, gas, wood, water or other municipal services), or because respondents do not know or refuse to provide the answer (for example, question 52 on income or on income tax paid). In most cases, the types of questions listed above tend to have higher non-response rates for both collection methods; however, these rates are consistently about twice as high for paper as for electronic questionnaires.

According to this analysis, non-response rates also increase as respondents progress through the questionnaire, regardless of collection method. The Internet figures include both questionnaires submitted by respondents and those submitted automatically by the system (that is, when a respondent saves a questionnaire and then does not return within the prescribed time, the system submits the questionnaire on the respondent's behalf). In addition, respondents who did not respond to the question on date of birth were required to respond to all questions on form 2B (although this was negligible, at 15 cases out of 7,526). This explains in part why non-response rates tend to increase as we progress through an Internet questionnaire. Additional factors that might explain this increase include increasing question difficulty, increasing effort required and respondent fatigue.

5. COMPARISON OF FOLLOW-UP RATES FOR INTERNET AND PAPER QUESTIONNAIRES (UNWEIGHTED)

After data entry, questionnaires are transmitted to data processing, where automated edits are performed. These edits identify any questions requiring follow-up and compile a score for each household. Follow-up requirements are calculated based on these scores and on weights assigned to each question: the data for households with a score exceeding a predefined value are forwarded to follow-up. These households are grouped to classify follow-ups by priority; those with the highest scores are given higher priority in terms of number of contact attempts. In other words, the higher a household's score, the higher its priority for follow-up.

In terms of coverage steps, we do not evaluate non-response but rather the situations requiring verification with households. The first situation concerns households identifying themselves as temporary or foreign resident households (Step B): it is necessary to confirm that these households actually do include only temporary or foreign residents. The second situation relates to doubts concerning the exclusion of a person from the questionnaire (Step C): in the event of doubt about whether to exclude a person, respondents are supposed to indicate the person's name, the reason for exclusion and the person's relationship to Person 1, and it might be necessary to contact the household to confirm that the person is, in fact, not a usual resident at the address. Should one of these situations arise and not be resolvable through analysis of the questionnaire, verification must take place with the household.
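
The score-based prioritization described above can be illustrated with a minimal sketch (the edit identifiers, weights and threshold below are hypothetical, not those used in Census processing):

    # Hypothetical weights attached to failed edits for a household's questionnaire.
    EDIT_WEIGHTS = {"missing_age": 5, "missing_marital_status": 2, "coverage_step_c_doubt": 8}
    FOLLOW_UP_THRESHOLD = 10  # households scoring above this value are sent to follow-up

    def household_score(failed_edits):
        """Sum the weights of all edits that failed for this household."""
        return sum(EDIT_WEIGHTS.get(edit, 0) for edit in failed_edits)

    def prioritise(households):
        """Return the households needing follow-up, highest scores (priorities) first."""
        flagged = [(hh_id, household_score(edits)) for hh_id, edits in households.items()
                   if household_score(edits) > FOLLOW_UP_THRESHOLD]
        return sorted(flagged, key=lambda item: item[1], reverse=True)

    # Example: two households, only the second exceeds the threshold.
    sample = {"HH001": ["missing_marital_status"],
              "HH002": ["missing_age", "coverage_step_c_doubt"]}
    print(prioritise(sample))  # [('HH002', 13)]
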

The following chart illustrates the follow-up rates by reason for forms 2A.

Chart 2: Follow-Up Rates by Reason – Forms 2A

                   Internet      Paper
2A Coverage          2.17%       2.06%
2A Content           0.04%       3.82%
2A Total             2.21%       5.88%

In terms of coverage steps, the Internet and paper versions of form 2A are more or less equivalent (2.17% and 2.06%, respectively), and for either collection method nearly three-quarters of all coverage issues relate to Step C. In terms of content, on the other hand, follow-up rates are negligible for the electronic version of form 2A (0.04%) in comparison with the paper version (3.82%). Overall, the paper version of form 2A requires follow-up almost three times as often as the Internet version (5.88% versus 2.21%, respectively). The following chart illustrates the follow-up rates by reason for forms 2B.

Chart 3: Follow-Up Rates by Reason – Forms 2B

                   Internet      Paper
2B Coverage          2.25%       2.01%
2B Content           7.12%      34.96%
2B Total             9.36%      36.99%

With regard to coverage steps, follow-up rates for form 2B are similar for either collection method (2.25% for electronic versions, 2.01% for paper). These rates are also similar to those for form 2A, with the majority of coverage issues relating to Step C for either collection method. With regard to content, the follow-up rate was nearly five times as high for the paper questionnaire (34.96%) as for the Internet version (7.12%). Overall, the follow-up rate was nearly four times as high for the paper questionnaire (36.99%) as for the electronic version (9.36%). The following chart illustrates the follow-up rates by reason for all questionnaires combined.

Chart 4: Follow-Up Rates by Reason – Forms 2A and 2B Combined

                   Internet      Paper
2A + 2B Coverage     2.18%       2.05%
2A + 2B Content      1.47%       9.37%
2A + 2B Total        3.65%      11.42%

In terms of overall coverage steps, follow-up rates were similar for the electronic and paper versions (2.18% and 2.05%, respectively). Follow-up rates relating to content were more than six times as high for paper questionnaires as for their electronic versions, at 9.37% and 1.47% respectively. Overall, the follow-up rate was about three times as high for the paper questionnaire (11.42%) as for the electronic version (3.65%). The analysis of non-response and follow-up rates leads to the conclusion that the electronic questionnaires are more complete than the paper questionnaires and consequently less expensive in terms of follow-up.

6. CONCLUSION

Data collection via the Internet offers a range of new possibilities as well as new challenges. As demonstrated in this report, partial non-response rates and follow-up rates (i.e. rates of failed-edit questionnaires) are much lower for the electronic versions of the questionnaires than for their paper versions. With regard to the validation messages, it is possible to conclude that, in general, they are effective in obtaining answers to questions respondents might otherwise have overlooked and in having them correct errors they committed inadvertently. Along with the automated skips of non-applicable questions, these messages result in a general perception among respondents that the electronic questionnaire is "smart". This responds in part to the high expectations the general public has with regard to Internet questionnaires.

The results provided in this report are thus highly encouraging. They demonstrate that data collected via the Internet are more complete, and consequently less expensive from a failed-edit and follow-up perspective, than data collected via paper questionnaires. Moreover, electronic questionnaires are ready for processing upon submission, since they are already in electronic form. A number of countries have invested significant financial and other resources in this new collection method, as the new technology offers hope of reducing the cost of future censuses if a high take-up rate is obtained. The infrastructure for collecting data via paper will always be required, at least for a Census, so with an online option organizations must invest additional funds in a second infrastructure. The savings can only be realized if the Internet take-up rate is high enough to reduce the paper-processing infrastructure and eliminate operational costs related to processing paper questionnaires.

However, it is also appropriate to explore more fully the reasons for the differences in non-response and follow-up rates between these two data collection methods. Do the differences in format between electronic and paper questionnaires influence the quality of the responses collected? Are data collected from households via the Internet and via paper questionnaires strictly comparable?
Is the quality of responses equivalent from one collection method to the next and do we need to implement additional processing steps to control for the differences? More detailed studies are necessary and are currently underway with a view to finding the answers to these questions.

DATA EDITING FOR THE ITALIAN LABOUR FORCE SURVEY By Claudio Ceccarelli and Simona Rosati, National Institute of Statistics, Italy

1. INTRODUCTION

The European Council has approved a new Regulation (n. 577/98) that aims to produce comparable information about the labour market in the European Member States. The Council Regulation requires that Labour Force Surveys be carried out in compliance with both the contents and the methodologies set at Community level. In order to comply with this Regulation, the Italian Labour Force Survey (LFS) has been completely revised. The rotating panel design, the frequency of the survey and the complexity of the questionnaire are the main reasons that led to the adoption of a different data collection strategy, namely a combination of Computer Assisted Interviewing (CAI) techniques: the CAPI (Computer Assisted Personal Interviewing) technique is used for the first interview, while the CATI (Computer Assisted Telephone Interviewing) technique is used for the interviews following the first. To conduct the interviews with CAI techniques, a network of professional interviewers has been set up.

As is well known, one of the major advantages of using CAI is the improvement of data quality, especially in the case of complex questionnaires. This is mainly because CAI allows a large number of potential errors due to the interviewer or the respondent to be avoided. For example, routing errors are eliminated because the script automatically routes to the next question; in addition, CAI makes it possible to check ranges and data consistency during the interview. As a consequence, the problem of imputing missing or inconsistent data at the post-data-capture stage is significantly reduced, even if not completely solved. A certain number of records may still present a few internal inconsistencies, since some edit rules may be left unresolved during the interview. Moreover, the choice to introduce only a limited number of edits, in order not to compromise the regular flow of the interview, increases the number of records with unresolved logical inconsistencies.

Nevertheless, most errors do not have an influential effect on publication figures or aggregate data, which are what statistical offices publish. Furthermore, if the data have been obtained from a sample of the population, an error in the results due to incorrect data is acceptable as long as it is small in comparison to the sampling error. The question we pose is: is it necessary to correct all data in every detail when CAI is used as the data collection method? From one point of view, data collected by computer-assisted methods would not need to be corrected further. On the other hand, if some edits are not resolved during the interview, some records will remain internally inconsistent and may lead to problems when publication figures are calculated. Data editing helps to reduce the errors and to make the records internally consistent, but a caution against the overuse of query edits must be heeded.

It is well recognized that the ideal edit strategy is a combination of selective editing and automatic editing, although macro-editing cannot be omitted from the final step of the editing process. For our purposes, automatic editing is especially suited to identifying all incorrect records, since performing this activity is inexpensive and not time-consuming. After that phase, the records are split into two streams: critical records (which may have an influential effect on aggregate data) and non-critical records (which do not significantly affect aggregate data).
The critical records that are incorrect because of systematic errors are imputed by a deterministic algorithm, while those that are incorrect because of probabilistic errors are automatically imputed according to the Fellegi and Holt methodology (1976). The remaining non-critical records, which may be incorrect for different types of errors, could either be left unedited or be imputed automatically. If the latter solution is adopted, it is to be expected that software systems implementing the Fellegi and Holt method will not be able to function properly; this is also the case for the LFS. In this paper we do not give a solution to this problem, but we are confident that useful results will be provided in the future.

Suggestions, which we intend to verify, are presented in the last section. The paper is organised as follows: the characteristics of the LFS sample are presented in Section 2; Section 3 analyses the survey process as a whole; the edit strategy for the LFS is described in Section 4, together with some results on imputation. The paper concludes with a brief discussion of the main outcomes and of the issues concerning data editing in CAI surveys. For brevity, all the results presented in the paper refer to the fourth quarter of 2004.

2. SAMPLE DESIGN

The most relevant feature of the new LFS is the time continuity of the reference weeks, which are spread uniformly throughout the whole year. All the solutions adopted to carry out the survey are conditioned by this characteristic. The sample design takes into account the constraints of Council Regulation 577/98 on the frequency of the survey and the representativeness of the sample.

The sample has two stages of selection. The primary units are the Municipalities, stratified into 1,246 groups at the Province level; the stratification is based on the demographic size of the Municipality, and the sample scheme provides for one primary unit for each stratum. The secondary units are the households, selected from the Municipality population Register. In each quarter 76,918 households are interviewed (a sampling rate of about 0.4%). The LFS sample is designed to guarantee annual estimates of the principal labour market indicators at the Province level (NUTS III), quarterly estimates at the Regional level (NUTS II) and monthly estimates at the national level. Council Regulation 577/98 indicates the quarter as the reference period for the estimates of the principal labour market indicators, so the sample of households to be interviewed is split into 13 sub-groups of homogeneous size.

As in the old version of the LFS, the sampled households follow the 2-2-2 rotation scheme. For example, a household interviewed in the third week of the first quarter of 2005 is interviewed again in the third week of the second quarter of 2005, leaves the sample in the third and fourth quarters of 2005 and, finally, is interviewed again in the third week of the first and second quarters of 2006.

The weighting factors are calculated taking into account the probability of selection and auxiliary information on the distribution of the population being surveyed, by sex, age (five-year age groups) and region (NUTS II level), where the external data come from Register statistics. The weighting strategy adopted by the LFS consists of three steps. In the first step, initial weights equal to the inverse of the probability of selection are calculated. In the second step, non-response factors, by household size and characteristics of the reference person, are calculated to correct the initial weights for non-response effects. Finally, the final weights are calculated using the calibration estimator methodology, so as to reproduce the distribution of the population being surveyed by sex, age (five-year age groups) and region.
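
In symbols, the three weighting steps just described can be sketched as follows (a schematic formulation in our own notation, not the authors'):

    \[
    d_i = \frac{1}{\pi_i}, \qquad
    w_i = \frac{d_i}{\hat{r}_{c(i)}}, \qquad
    w_i^{*} = g_i\, w_i \quad \text{subject to} \quad \sum_{i \in s} w_i^{*}\,\mathbf{x}_i = \mathbf{t}_{\mathbf{x}},
    \]

where \(\pi_i\) is the inclusion probability of household \(i\), \(\hat{r}_{c(i)}\) is the estimated response rate in the non-response adjustment class of \(i\) (defined by household size and the characteristics of the reference person), and the calibration factors \(g_i\) are chosen so that the final weights \(w_i^{*}\) reproduce the known population totals \(\mathbf{t}_{\mathbf{x}}\) of the auxiliary variables (sex by five-year age group by NUTS II region).
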
3. SURVEY PROCESS

Quality in statistics is not a precisely defined concept. Each survey process is based on multiple elementary operations, all of which influence the final outcome; consequently, quality has progressively become a central issue in the design of official surveys. Starting from these considerations, the new LFS has been planned so that the process can be analysed in terms of elementary activities. In this way it is possible to associate each activity with a specific type of error and to define a comprehensive quality monitoring system for all steps of the survey.

In order to control all the elementary actions of the LFS process, a quality monitoring system has been introduced. It can be divided into three main phases: the "preventive checks" phase, which concerns preventive actions implemented before data collection in order to prevent errors; the "checks during the survey" phase, which concerns actions for detecting errors that may occur during the process; and the "a posteriori checks" phase, which evaluates non-sampling errors using quantitative indicators linked to each elementary operation. The dual interviewing strategy (the first interview with CAPI and the other three with CATI) calls for a high level of automation in the management, transmission and execution of the interviews, which has to be strictly observed. The monitoring system makes it possible to keep the new LFS process under control and to adjust it in "real time" in case of error (Giuliani et al., 2004).

The mixed-mode design is due to the need to interview the households belonging to the sample for four waves: the first interview is generally carried out with the CAPI technique, while the three subsequent interviews are carried out with the CATI technique. In particular cases, for example when the household does not have a fixed telephone line, this scheme can be changed. To carry out the CAPI interviews, ISTAT decided to set up a network of 311 interviewers, one for each specific sub-regional area, directly trained and managed by ISTAT experts. Each ISTAT Regional Office manages the interviewers' weekly workload. Through a web connection, the interviewers receive the electronic questionnaire and any related changes, together with the information about the weekly sample of households.

ISTAT has developed a complex automated system to manage all the information regarding the core of the new LFS. In particular, the system manages the flows of data between the sampled Municipalities and ISTAT, allocates the interviews according to technique, governs the flows of information in the data treatment phase, and so on. The information system guarantees that all activities, carried out with reliable hardware and software tools, respect the requirements of a secure information system (data privacy, data availability and data integrity) and, at the same time, provides a monitoring system based on a series of indicators defined to plan and control all the elementary activities (Bergamasco et al., 2004).

The sampled Municipalities select the list of households that form the theoretical sample to be interviewed. This sample is composed of a "base household" and its three "replacement households". This information is collected and managed by a software system called SIGIF (SIstema Gestione Indagini Famiglie – system for the management of household surveys). SIGIF distributes the sampled households according to the data collection technique (CAPI or CATI). Using SIGIF, the ISTAT Regional Offices allocate the weekly sample of households to the CAPI interviewers and, in order to handle unplanned events, can re-assign work among interviewers. Each interviewer tries to interview the "base household" and, if this is not possible, can replace it with a "replacement" household chosen automatically by the CAPI system. The interviewers have about four weeks to carry out the interviews: the first week is dedicated to fixing appointments, the second to the interviews, and the following two weeks to completing pending interviews.
As soon as possible, the interviewers send the results of their weekly work – divided into interview data and field-work data – to SIGIF, using a web connection to a toll-free number (Bergamasco et al., 2004). Using SIGIF, the list of households is sent to the private service company responsible for carrying out the telephone interviews. The service company sends the interview results to ISTAT weekly and a set of field-work indicators daily. Table 1 reports the sample size of the LFS and the response rates by survey technique.

Table 1. Sample size and response rate by survey technique

Households – Universe                21,810,676
Households – Sample                      76,872
Sampling rate (%)                           0.4
Respondent households – CAPI             26,985
Respondent households – CATI             42,567
Respondent households – Total            69,552
Response rate (%) – CAPI                   90.2
Response rate (%) – CATI                   92.1
Response rate (%) – Total                  90.5

The European Council Regulations n. 1575/2000 and n. 1897/2000 introduce other innovations, such as the typology of questions, the type of variables (continuous or categorical) and the acceptance ranges of variables, and they explore in detail the concept of "unemployed person" and give further methodological indications on data collection. In order to comply with these Regulations and to simplify the interview, an electronic questionnaire has been developed. Features such as automatic branching, online help, interactive codification of open items using a search engine for certain key variables (such as economic activity and profession), routing and coherence rules, and confirmation items for waves following the first significantly improve data quality.

The questionnaire is composed of a section on general personal information about household members and an individual questionnaire divided into nine sections, administered to each member of working age (over 15 years); the themes treated are: labour status during the reference week, main job, second job, previous work experience (only for persons not employed), search for employment, registration with public and private employment offices, education and training, main labour status, and residence. The interview closes with two sections called "other information on the household" and "general trend of the interview". A last section deals with pending codification and is used when the interviewer does not code some particular variables, such as economic activity and profession, during the interview.

The 2-2-2 rotation scheme of the sampled households makes it possible to build a panel of information. Traditionally, ISTAT produces a "three-month transition matrix" and a "one-year transition matrix" based on the longitudinal information collected for all sampled persons. To make it possible to reconstruct the working history of respondents for more than one year, the longitudinal information collected on labour status has been improved. European Council Regulation n. 1897/2000 allows the interviews following the first to be conducted by asking the sampled person to confirm the information given in the previous interview; in this way, when the labour status and/or other characteristics of the respondent have not substantially changed, the interview time can be reduced. The CAI technique makes it possible to implement a "confirmation questionnaire", in which the "confirmation questions" are the fundamental nodes of the interview flow: if the status of the respondent has not changed, all the information in the sub-flow of the questionnaire that depends on a given "confirmation question" is registered automatically, as sketched below.

The LFS process and the new interviewer network represent the core of a crucial transformation of ISTAT's surveys and have allowed, for the first time in ISTAT's history, the implementation of an innovative survey management process. The attention given to data quality has moved from the final product to the methodologies and tools used to keep the whole production process under control. With this new system, the evaluation of the quality of the LFS, in terms of correctness and transparency, has been put in place.
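
A minimal sketch of this dependent-interviewing logic, with hypothetical item and status names rather than those of the ISTAT questionnaire:

    def administer_wave(previous_answers, ask):
        """Confirmation-question logic: if the respondent confirms that the labour
        status recorded in the previous wave is unchanged, the dependent sub-flow
        is copied forward instead of being asked again."""
        answers = {}
        unchanged = ask(f"Last time your labour status was recorded as "
                        f"'{previous_answers['labour_status']}'. Is this still the case? (y/n)")
        if unchanged.lower().startswith("y"):
            # Register the dependent sub-flow automatically from the previous wave.
            for item in ("labour_status", "main_job", "second_job"):
                answers[item] = previous_answers[item]
        else:
            answers["labour_status"] = ask("What is your current labour status?")
            answers["main_job"] = ask("Describe your main job:")
            answers["second_job"] = ask("Do you have a second job? Describe it:")
        return answers

    # Example with scripted responses instead of a live interview.
    previous = {"labour_status": "employed", "main_job": "teacher", "second_job": "none"}
    scripted = iter(["y"])
    print(administer_wave(previous, lambda q: next(scripted)))
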

4. DATA EDITING

4.1 Error detection

After data capture, contradictory information and item non-responses are detected principally by an automatic system based on the Fellegi and Holt model of editing. It is worth noting that this operation can also be performed every week, in order to flag in good time possible errors due to the electronic questionnaire. In the editing process, edits (rules) are specified by experts with knowledge of the subject matter of the survey. An edit expresses the judgment of such experts that certain combinations of values, corresponding to the different questions on the questionnaire, are unacceptable. The set of such edit rules is referred to as the explicit edits. A record that fails any of the edits needs to be corrected; conversely, a record that passes all the stated edits does not need to be corrected.

The explicit edits relating to categorical variables (i.e. variables not subject to a meaningful metric) are implemented in SCIA (System for editing and automatic imputation), developed by ISTAT according to the Fellegi and Holt methodology (Barcaroli and Venturi, 1997). More exactly, the version of SCIA included in CONCORD (CONtrol and Data CORrection), a system designed for the Windows environment, is used (Riccini Margarucci and Floris, 2000). The remaining edits, relating to continuous variables, are translated into a SAS program consisting of if-then-else rules (SAS Institute Inc., 1999); an illustration of this kind of rule is sketched after Table 3. Two different systems are used because software that handles mixed data, i.e. a mix of categorical and continuous data, is currently unavailable. Developing such a system is generally considered very hard, although in 1979 Sande showed how to do it: he first showed how to convert discrete data to continuous data in a way that would allow the error-localization problem to be solved, and then showed how to put a combination of discrete and continuous data into a form to which Chernikova's algorithm could be applied (1964, 1965). To our knowledge, only recently have two researchers at Statistics Netherlands proposed an algorithm based on the Fellegi and Holt methods for the automatic editing of mixed data (de Waal and Quere, 2003).

For a single variable we also distinguish two types of errors: "random errors" and "systematic errors". Although the distinction is not clear-cut, random errors can be assimilated to normal measurement errors: they have equal probability of occurring in different variables and, for a generic variable, they are not correlated with errors in other variables. Systematic errors may be defined as non-probabilistic errors; they are generally due to defects in the survey instrument (e.g. a badly specified question).

Table 2 shows the number of explicit edits for the LFS. With regard to the SCIA system, the large number of edit rules and the complicated skip patterns of the questionnaire pose such computational problems that the entire set of edits has to be partitioned into two subsets. The problems arise when the Fellegi and Holt system is run to check the logical consistency of the entire edit system; we suppose that they could be overcome by modifying the software system or by using a more powerful processor. Table 3 reports the proportion of incorrect records per type of error resulting from the error detection stage.

Table 2. Number of explicit edits

Edit                    Number of edits
SCIA system                       2,358
  - Set 1                           961
  - Set 2                         1,397
SAS system                          142

Table 3. Erroneous records per type of error

Error                   Percentage of records
Systematic                                2.5
Probabilistic                             2.8
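
As an illustration of the kind of if-then-else consistency rule mentioned in Section 4.1 (the variables and rules below are hypothetical, not the actual LFS edit set), an edit program might flag records such as the following:

    def check_record(record):
        """Return the list of explicit edits failed by a single LFS-style record.
        The variable names and rules are illustrative only."""
        failures = []
        # Consistency edit between a categorical and a continuous variable.
        if record.get("labour_status") == "not employed" and record.get("hours_worked", 0) > 0:
            failures.append("E01: hours worked reported for a person who is not employed")
        # Range edit on a continuous variable.
        if not (0 <= record.get("hours_worked", 0) <= 98):
            failures.append("E02: hours worked outside the admissible range 0-98")
        # Edit involving age and an age-dependent item.
        if record.get("age", 99) < 15 and record.get("labour_status") is not None:
            failures.append("E03: labour status recorded for a person under 15")
        return failures

    print(check_record({"age": 34, "labour_status": "not employed", "hours_worked": 12}))
    # ['E01: hours worked reported for a person who is not employed']
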

4.2 Imputation

The imputation of item non-responses depends on the nature of the errors that generated them, although practical considerations and other issues must also be taken into account when implementing imputation procedures. When missing data are imputed, it has been shown that deterministic imputation methods distort distributions and attenuate variances, whereas probabilistic methods yield approximately unbiased estimates of distributions and element variances (Kalton and Kasprzyk, 1982). Nevertheless, we can reasonably assume that deterministic imputation is more suitable for correcting systematic errors, while probabilistic methods are more appropriate for errors generated by random error models. Deterministic imputation assigns a single value, determined a priori, on the basis of other values considered "true" by experts. Probabilistic imputation, on the contrary, assigns a value according to a stochastic model (e.g. a regression model) or using a donor unit that is similar to the record to be imputed; a distance function is generally used to define similarity.

The Fellegi and Holt methodology, which can be counted among the probabilistic methods, is concerned primarily with the stages of data editing and imputation. The data editing step localizes the errors in a record by identifying a minimal set of variables (the smallest set) whose values can be changed in such a way that the resulting record passes all the edits; imputation then provides suitable values for those variables. One of the major objectives of the Fellegi and Holt methodology is to retain the structure of the data, i.e. to ensure that the univariate and multivariate distributions of the survey data reflect as closely as possible the distributions in the population. This goal is achieved by the use of hot-deck imputation. Hot-deck imputation consists of imputing, for a variable of the current record, the value recorded in the same field of some other record that passed all the edits and is similar in all respects to the current record; a minimal sketch of this donor-based approach is given below. Imputation can be done one variable at a time (sequential imputation) or for all variables at once (joint imputation): the former aims to preserve the univariate distributions of the variables, whereas the latter preserves the multivariate distributions.

To solve the error-localization problem, Fellegi and Holt showed that both explicit and implicit edits are needed. Implicit edits are those that can be logically derived (or generated) from a set of explicit edits: if an implicit edit fails, then necessarily at least one of the explicit edits fails. For the LFS, the current algorithm in the SCIA system cannot generate the full set of implicit edits, because the amount of computation needed for generation grows at a very high exponential rate in the number of edits. The same limitations were found when the Fellegi and Holt system was run to check the logical consistency of the explicit edits. On the other hand, it does not seem convenient to divide the two subsets of explicit edits further: we would obtain more than three subsets, not independent of one another, with additional restrictions on the imputation method. Moreover, if some records are incorrect because of systematic errors, it is better to correct them by deterministic imputation. For these reasons we adopt a procedure that is a combination of selective editing and automatic editing.
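
A minimal sketch of the donor-based (hot-deck) imputation idea described above, with hypothetical matching variables and a simple distance function (the SCIA/CONCORD implementation is, of course, far more elaborate):

    def distance(recipient, donor, match_vars):
        """Count the matching variables on which donor and recipient disagree."""
        return sum(recipient[v] != donor[v] for v in match_vars)

    def hot_deck_impute(recipient, donors, target_var, match_vars):
        """Copy the missing value from the most similar donor record that passed
        all edits (one variable at a time, i.e. a sequential-style hot deck)."""
        best = min(donors, key=lambda d: distance(recipient, d, match_vars))
        imputed = dict(recipient)
        imputed[target_var] = best[target_var]
        return imputed

    # Hypothetical records: impute 'labour_status' for a record that failed edits.
    donors = [
        {"sex": "F", "age_class": "30-34", "region": "Lazio", "labour_status": "employed"},
        {"sex": "M", "age_class": "30-34", "region": "Lazio", "labour_status": "unemployed"},
    ]
    recipient = {"sex": "F", "age_class": "30-34", "region": "Lazio", "labour_status": None}
    print(hot_deck_impute(recipient, donors, "labour_status", ["sex", "age_class", "region"]))
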
After error detection, the incorrect records are split into two streams: critical records (which have an influential effect on aggregate data) and non-critical records (which do not significantly affect aggregate data).

The critical records that are incorrect because of systematic errors are imputed by choosing a single value determined a priori, while those that are incorrect because of probabilistic errors are automatically imputed according to the Fellegi and Holt methodology. Currently, only probabilistic logical inconsistencies that do not depend on the routing rules of the questionnaire are automatically imputed. To this end, 137 explicit edits are used to detect probabilistic errors, and 213 edits form a complete set of edits for solving the error-localization problem, i.e. for deciding which variables to change in an erroneous record to ensure that the record passes all of the edits.

The remaining non-critical records, which may be incorrect for different types of errors, could be left unresolved without reducing data quality; otherwise they too could be imputed automatically. In fact, the small number of errors and the variety of failing edits make the use of probabilistic imputation acceptable. At present, these remaining incorrect records are not imputed automatically, since the amount of computation needed to generate the full set of implicit edits is prohibitive. They are imputed by a deterministic algorithm whose implementation is not only time-consuming but also needs to be modified every quarter. In order to verify that the deterministic actions are correct, all the records are checked through the edit system a second time: if they pass, nothing else is done; if they still fail some edits, it means either that some if-then-else rules need to be adjusted or that other rules must be added to the algorithm.

Table 4 reports some findings on the records corrected by sequential or joint imputation, and Table 5 shows the percentage of records per number of imputations at the end of the editing process. Note that the percentage of records corrected for four or more variables appears as zero; in fact this percentage is equal to 0.05, which means that at least four variables were imputed in 100 records. The maximum number of imputations is 16.

Table 4. Erroneous records per type of imputation

Imputation              Percentage of records
Joint                                    41.1
Sequential                               58.9

Table 5. Records per number of imputations

Number of imputations   Percentage of records
0                                        95.5
1                                         3.5
2                                         0.7
3                                         0.3
4 or more                                 0.0

5. CONCLUDING REMARKS

The advantage, and also the aim, of imputation is to complete a data set with "full" information for all observed individuals, which reduces bias in survey estimates and – from the point of view of a data user – also simplifies the analysis. Nevertheless, when CAI is used, the increase in data quality due to the editing process is usually negligible. In addition, for a large and complex data set, implementing an automatic system for imputing contradictory information may be too costly in terms of timeliness and resources.

Therefore it seems unnecessary to correct the data in every detail when CAI is used, while the information from error detection can play an important role in improving the electronic questionnaire and the data collection. The LFS data can be checked frequently, since they are captured continuously week by week; fatal errors can thus be identified just in time to be eliminated. This process provides immediate feedback and monitoring on how interviewers are filling in the electronic questionnaire.

Although the impact of editing is considerably reduced by the new technologies, we found that some errors may remain in the data and may have an influential effect on aggregate data. A combination of selective and automatic editing may be used to correct these errors. Nevertheless, we have also argued that it is hard to implement a system based on the Fellegi and Holt methods when numerous edits are specified. For the future, we suggest applying the Fellegi and Holt method step by step. Initially the method is applied only to the edits that fail, drawn from the set of explicit edits. The imputed records are then checked through the edit system a second time. If they pass, nothing else is done: all the records are correct. If some records still fail edits, the failing edits are added to the subset of edits previously selected in order to impute the incorrect records. The algorithm is reiterated until the records pass all the edits; a rough sketch of this iteration is given below.
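
A rough sketch of the suggested step-by-step application of the Fellegi and Holt method; the functions failing_edits, generate_implicit_edits and fh_impute stand for components of a Fellegi-Holt system and are purely illustrative:

    def iterative_fh_edit(record, explicit_edits, failing_edits,
                          generate_implicit_edits, fh_impute):
        """Start from the explicit edits that actually fail, impute, re-check,
        and enlarge the working edit set with any newly failing edits until
        the record passes all explicit edits."""
        working_edits = set(failing_edits(record, explicit_edits))
        while working_edits:
            # Error localization and imputation restricted to the current subset
            # (implicit edits are generated for this subset only, keeping it tractable).
            subset = working_edits | set(generate_implicit_edits(working_edits))
            record = fh_impute(record, subset)
            still_failing = set(failing_edits(record, explicit_edits))
            if not still_failing:
                break  # the record now passes all explicit edits
            working_edits |= still_failing
        return record
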

References

[1] AMENDOLA A., CAROLEO F.E., COPPOLA G. (1997) Differenziali territoriali nel mercato del lavoro e sviluppo in Italia, Celpe, Università di Salerno, Discussion Paper n. 36.
[2] APPEL M.V., NICHOLLS II W.L., NICHOLLS W. (1993) New CASIC Technology at the U.S. Census Bureau, ASA, U.S. Census Bureau, vol. 2, pp. 1079-1084.
[3] BAFFIGI A. (1999) I differenziali territoriali nella struttura dell'occupazione e della disoccupazione: un'analisi con dati a livello provinciale (1981-1995). In: Struttura della contrattazione: differenziali salariali e occupazione in ambiti regionali, a cura di M. Biagioli, F.E. Caroleo e S. Destefanis.
[4] BAKER R.P., BRADBURN N.M., JOHNSON R.A. (1995) Computer-assisted personal interviewing: an experimental evaluation of data quality and cost, Journal of Official Statistics, vol. 11, n. 4, pp. 413-431.
[5] BARCAROLI G. (1993) Un approccio logico formale al problema del controllo e della correzione dei dati statistici, Quaderni di Ricerca, n. 9, ISTAT.
[6] BARCAROLI G., LUZI O., MANZARI A. (1996) Metodi e software per il controllo e la correzione dei dati, ISTAT, Documento interno.
[7] BARCAROLI G., VENTURI M. (1997) DAISY (Design, Analysis and Imputation System): Structure, Methodology, and First Applications. In: J. Kovar and L. Granquist (eds.), Statistical Data Editing, Volume II, U.N. Economic Commission for Europe, 40-51.
[8] BERGAMASCO S., GAZZELLONI S., QUATTROCIOCCHI L., RANALDI R., TOMA A., TRIOLO V. (2004) New strategies to improve quality of ISTAT new CAPI/CATI Labour Force Survey, European Conference on Quality and Methodology in Official Statistics, Mainz, Germany, 24-26 May.

[9] CHERNIKOVA N.V. (1964) Algorithm for Finding a General Formula for the Non-negative Solutions of a System of Linear Equations, USSR Computational Mathematics and Mathematical Physics, 4, 151-158.
[10] CHERNIKOVA N.V. (1965) Algorithm for Finding a General Formula for the Non-negative Solutions of a System of Linear Inequalities, USSR Computational Mathematics and Mathematical Physics, 5.
[11] COUPER M.P. (1996) Changes in Interview Setting Under CAPI, Journal of Official Statistics, Statistics Sweden, vol. 12, n. 3, pp. 301-316.
[12] DE WAAL T., QUERE R. (2003) A Fast and Simple Algorithm for Automatic Editing of Mixed Data, Journal of Official Statistics, Statistics Sweden, vol. 19, n. 4, 383-402.
[13] DECRESSIN J., FATÀS A. (1995) Regional Labor Market Dynamics in Europe, European Economic Review, n. 3, 1627-1655.
[14] FABBRIS L. (1991) Abbinamenti tra fonti d'errore nella formazione dei dati e misure dell'effetto degli errori sulle stime, Bollettino SIS, n. 22.
[15] FABBRIS L., BASSI F. (1997) On-line likelihood controls in Computer-Assisted Interviewing, Book 1, 51st Session ISI, Istanbul.
[16] FELLEGI I.P., HOLT D. (1976) A Systematic Approach to Automatic Edit and Imputation, Journal of the American Statistical Association, 71, 17-35.
[17] FILIPPUCCI C. (1998) La rilevazione dei dati assistita da computer: acquisizioni e tendenze della metodologia statistica e informatica, Sorrento, XXXIX Riunione Scientifica della S.I.S.
[18] FUTTERMAN M. (1988) CATI Instrument logical structures: an analysis with applications, Journal of Official Statistics, Statistics Sweden, vol. 4, n. 4, pp. 333-348.
[19] GIULIANI G., GRASSIA M.G., QUATTROCIOCCHI L., RANALDI R. (2004) New methods for measuring quality indicators of ISTAT's new CAPI/CATI Labour Force Survey, European Conference on Quality and Methodology in Official Statistics, Mainz, Germany, 24-26 May.
[20] GRASSIA M.G., PINTALDI F., QUATTROCIOCCHI L. (2004) The electronic questionnaire in ISTAT's new CAPI/CATI Labour Force Survey, European Conference on Quality and Methodology in Official Statistics, Mainz, Germany, 24-26 May.
[21] GROVES R.M. et al. (1988) Telephone Survey Methodology, New York, John Wiley.
[22] KALTON G., KASPRZYK D. (1982) Imputing for Missing Survey Responses, Proceedings of the Section on Survey Research Methods, American Statistical Association, 22-31.
[23] KELLER W.J. (1995) Changes in Statistical Technology, Journal of Official Statistics, Statistics Sweden, vol. 11.
[24] MASSELLI M. (1989) Manuale di tecniche di indagine: il sistema di controllo della qualità dei dati, Note e Relazioni n. 1, ISTAT.

[25] NICHOLLS W.L., BAKER R.P., MARTIN J. (1997) The effects of new data collection technologies on survey data quality. In: Lyberg L. et al. (eds), Survey Measurement and Process Quality, J. Wiley & Sons, New York.
[26] RICCINI MARGARUCCI E., FLORIS P. (2000) Controllo e correzione dati - Manuale utente, ISTAT, Roma.
[27] SANDE G. (1979) Numerical Edit and Imputation, Proceedings of the 42nd Session of the International Statistical Institute, Manila, Philippines.
[28] SARIS W.E. (1991) Computer-Assisted Interviewing, Newbury Park.
[29] SAS INSTITUTE INC. (1999) SAS OnlineDoc®, Version 8, Cary, NC: SAS Institute Inc.