Organic and inorganic parameters in contaminated sites. Dangerous goods and hazardous substances

Author: Ralf Rich
Copyright © by Stochastikon GmbH (http://encyclopedia.stochastikon.com)


Round Robin Test

Name and Name Modifications: Proficiency Test (PT)

Round robin tests, or proficiency tests, are an essential element of quality assurance for laboratories. The objective of proficiency testing is to ensure the quality and comparability of the measurement results produced by laboratories. Comparability is no longer guaranteed if laboratories obtain very different results when analysing identical samples. To prevent such a situation, laboratory proficiency tests are performed periodically to verify that laboratories are able to provide sufficiently accurate results.

A round robin test is based on identical samples that are sent to the participating laboratories, which use agreed methods of analysis. The samples come from the institution that conducts the trial and invites the laboratories to participate. This institution is also responsible for the evaluation. For the various types of laboratories there are various professional organizations that are responsible for carrying out the proficiency tests required by laws and regulations. Proficiency testing itself is carried out in accordance with international and national standards.

For example, the Federal Institute for Materials Research and Testing (BAM) offers round robin tests in the following areas [1]:
• Organic and inorganic parameters in contaminated sites.
• Dangerous goods and hazardous substances.
• Chemical emissions from materials and products in the air.
• Porous and disperse materials.
• Particle sizes of fine powders.

The BAM also provides the English-language proficiency test information system EPTIS, which offers detailed information about round robin test programs worldwide.

[1] See http://www.bam.de/de/fachthemen/ringversuche/index.htm, February 2013.

Statutory Basis


Accredited laboratories must participate periodically in proficiency testing to demonstrate their quality. The tests to be performed are legally defined, e.g. in the area of contaminated sites in Germany by the Federal Soil Protection and Contaminated Sites Ordinance (BBodSchV) of 12 July 1999 and the “Requirements for sampling, sample preparation and chemical investigation methods on federal property” (updated version of October 2008). Laboratories that are required to participate must have their measurement methods reviewed within the scope (parameters) of their existing or applied-for accreditation.

Proficiency Tests as External Quality Assurance

Proficiency tests are considered a tool of external quality assurance. Depending on the objective, round robin tests are used to standardize procedures, to prove the capability of laboratories, to determine certain characteristics of reference materials, or to test the practical application of analytical methods. The anonymous comparison of results is considered a major advantage for the participating laboratories, because it shows each laboratory clearly where it stands relative to the others. Proficiency tests are used in addition to individual proficiency testing and laboratory comparisons to determine the performance of a laboratory by comparison tests. A round robin test provides an independent assessment, unlike individual proficiency testing, in which no comparison with other laboratories is possible, or laboratory comparisons, which may not be neutral.

Conduct and Interpretation of Laboratory Data as Part of a Round Robin Test

There are no standard guidelines for data evaluation in a round robin test. The relevant standards suggest many apparently conflicting procedures. Therefore, the implementation of a round robin test and the analysis of the data are illustrated here using the example of a proficiency test organized by AQS Baden-Württemberg for the operational analysis of wastewater treatment plants. Such round robin tests are carried out on behalf of the Ministry of Environment, Nature Conservation and Transport of Baden-Württemberg [2].

[2] See http://www.iswa.uni-stuttgart.de/ch/aqs/rv/rv_allgemein.html, February 2013.


• Organization: The participating laboratories receive three identical samples in which the various components are contained in different concentrations. AQS BW determines the necessary stabilization measures, container materials and concentration levels, possibly after preliminary investigations and in consultation with the other proficiency testing agencies. The lowest concentration is chosen so that it can still be detected reliably with at least one of the methods described in the relevant standards. The concentration range is selected so that the monitoring limit value lies in its lower region and the highest concentrations correspond to what is found in real control samples. The round robin test is then announced to all registered laboratories, stating the methods, several weeks before the scheduled shipment date. A fixed deadline is set for registration.

At the Institut für Siedlungswasserbau, control stock solutions for all parameters and all concentration levels are produced with considerable effort. The samples are prepared by means of a fixed system of dilutions of these stock solutions. The sample amount is sufficient for multiple determinations, but small enough that the laboratories cannot perform excessively large test series. The samples are packaged with polyurethane moldings in appropriate boxes and are usually dispatched with an express parcel service. They therefore usually arrive at the laboratory the next day, so that the investigation can begin in the same week. In some cases, if continuous cooling of the samples must be ensured, the samples are brought by refrigerated vehicles to remote distribution points and handed over there directly to the laboratories.

The packages contain a cover letter accompanied by a leaflet that draws attention to common mistakes. This leaflet should serve as a checklist before the results are delivered. The results may be delivered electronically via the Internet or by completing and sending the enclosed result sheets. During data entry, via the Internet or on the result sheets, the address of the laboratory (if not already registered), the methods used, the sample numbers and the resulting values in the required unit must be entered. Specifying the method used makes a method-specific evaluation possible. The result sheets or the printed Internet reports must be signed and returned on time to the Institut für Siedlungswasserbau.

For the organization of such a round robin test it is very important that all participants meet all deadlines. This is particularly necessary so that samples can be provided in sufficient numbers. For organizational reasons, samples are sometimes left over for individual latecomers. However, for obvious reasons this cannot be guaranteed. Moreover, the time available for the evaluation is very short, so AQS BW must insist on a timely submission of results. Following the evaluation, the organizer sends the evaluation sheets created for each laboratory together with the evaluation documentation. These evaluation sheets include the values submitted by the laboratory together with the target values. Values that lie within the tolerance limits are marked as successful with a “+”.

• Evaluation: The proficiency tests of AQS Baden-Württemberg are evaluated according to DIN 38402 - A45. The evaluation is based on the deviation from a “conventional correct value” using the so-called zU-scores [3]. As a first step, the standard deviation sR is calculated for each concentration level. This is done using the Q-method [4], a method of so-called “robust statistics”. This value is required, among other things, for the calculation of the “robust mean” using the Hampel estimator [5]. This “robust mean” is usually taken as the “conventional true value”, i.e. as the target value. In justified cases, the test investigator may select the “conventional true value” differently, for example as the sample weight value or another reference value.

For each measured value, a z-score is first calculated according to the following formula:

\[ z\text{-score} = \frac{\text{measured value} - \text{target value}}{\text{target standard deviation}} \]

The target standard deviation can differ from the calculated comparison standard deviation in the following cases:
a) There are fixed upper and lower limits for the comparison standard deviation. If the comparison standard deviation exceeds the upper limit or falls below the lower limit, the target standard deviation is set to the respective limit, which is fixed in advance.
b) The evaluation is carried out across concentration levels by means of the variance function described in DIN 38402 - A45. In this case the target standard deviation is the value of the variance function at the corresponding concentration.
A combination of the two cases may also occur.

To compensate for the unfairness caused by the skewed distribution of the data at low concentrations, the zU-scores are calculated from the z-scores as described in DIN 38402 - A45. Each measured value is assessed by its zU-score. Values between -2 and +2 are by definition considered acceptable. Values that deviate more are evaluated as “false”. In the round robin tests of the PT-WFD network, values in the range 2 < |z-score| < 3 are considered “questionable”. The assessment of each individual value is marked on the evaluation sheet with a “+”, or otherwise with a “-”. Further parameter-specific and overall ratings in round robin tests for wastewater and drinking water are regulated differently, and specific guidance is therefore given for each of the relevant round robin tests.

[3] Transformed values are often called scores. The transformation that yields the aforementioned zU-scores aims to take into account a possible skewness of the distribution function.

[4] The name coincides with the Q-method (a type of factor analysis) developed by the British psychologist William Stephenson (1902-1989), which is used in psychology and the social sciences to study the “subjectivity” of test persons from their own perspective. That method is unrelated to the determination of a standard deviation. In proficiency testing, the Q-method denotes a robust estimator of the standard deviation that is based on the pairwise differences between laboratory results.

[5] The Hampel estimator is recommended when the sample is expected to contain so-called outliers. The Hampel estimator belongs to the scale-invariant M-estimators. It is determined by three constants, which define the weighting of the sample elements. The estimate can be calculated only by an iterative algorithm.
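The evaluation steps described above (robust mean, target standard deviation, z-scores, marking) can be sketched for a single concentration level as follows. This is a minimal illustration under stated assumptions and not the procedure of DIN 38402 - A45: the Hampel constants 1.5, 3.0 and 4.5 are assumed tuning values, the robust mean is approximated by simple iterative reweighting, the Q-method is replaced by the scaled median absolute deviation as a stand-in, the transformation from z-scores to zU-scores is omitted, and the laboratory results and the limits for the standard deviation are invented.

```python
from statistics import median


def hampel_weight(q, a=1.5, b=3.0, c=4.5):
    """Weight psi(q)/q of a three-part redescending Hampel function.

    The constants a, b, c are assumed tuning values, not taken from the text.
    """
    q = abs(q)
    if q <= a:
        return 1.0
    if q <= b:
        return a / q
    if q <= c:
        return a * (c - q) / ((c - b) * q)
    return 0.0


def robust_mean(values, scale, tol=1e-9, max_iter=100):
    """Iteratively reweighted mean, started at the median."""
    m = median(values)
    for _ in range(max_iter):
        weights = [hampel_weight((x - m) / scale) for x in values]
        total = sum(weights)
        if total == 0.0:  # every value rejected: keep the current estimate
            break
        new_m = sum(w * x for w, x in zip(weights, values)) / total
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


def target_sd(s_r, lower, upper):
    """Case a): limit the comparison standard deviation to fixed bounds."""
    return min(max(s_r, lower), upper)


def z_score(measured, target, sd):
    """(measured value - target value) / target standard deviation."""
    return (measured - target) / sd


def mark(z):
    """'+' for |z| <= 2, otherwise '-' (boundary treated as acceptable)."""
    return "+" if abs(z) <= 2.0 else "-"


# Hypothetical results of six laboratories for one concentration level.
results = [10.0, 10.2, 9.8, 10.1, 9.9, 25.0]

# Stand-in for the Q-method (assumption): scaled median absolute deviation.
med = median(results)
s_r = 1.4826 * median(abs(x - med) for x in results)

target = robust_mean(results, s_r)
sd = target_sd(s_r, lower=0.1, upper=1.0)  # invented limits
marks = [mark(z_score(x, target, sd)) for x in results]
print(round(target, 3), round(sd, 3), marks)
```

In this invented data set the value 25.0 receives weight zero in the robust mean and is the only result marked with “-”.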


Fixing the Proficiency Test Parameters

The standard DIN 38402 part 41 gives the following recommendations on the choice of the number M of concentration levels, the number L of participating laboratories and the number K of parallel determinations:
• The number L of laboratories should depend in a certain way on the number M of levels. It is recommended that the number of laboratories be not less than L = 8; a larger number of laboratories (for example, L = 15 or more) is preferable if only a single level is of interest.
• K = 4 parallel determinations are recommended, unless it is usual to perform a larger number of parallel determinations. When the spread of analytical results between laboratories is very high, a smaller number of parallel determinations can sometimes also be appropriate, but never fewer than 2. It must be ensured that the product of the number L of laboratories and the number K of parallel determinations is not less than 24.

Note: The standard gives no reasons for the recommended numbers. The main aim of a round robin test is to evaluate the measurement capability of laboratories. Why the number L of participating laboratories should depend on the number M of prescribed levels remains unclear. The recommendation to reduce the number K of parallel determinations in the case of greater variability seems strange, to say the least. Finally, the requirement L · K ≥ 24 is very mysterious.

The Conventional True Value

The so-called “conventional true value” is a particular problem for many proficiency tests. The standard DIN 38402 part 41 indicates three ways to set this value:
• For chemically well-defined substances in synthetic samples, the conventionally correct value is identical to the “true” content of the substance in the sample.
• If a theoretical, substance-related definition of the “true” value is not possible, or the “true” value is not known because, exceptionally, no synthetic sample was used in the round robin test, the conventionally correct value is often determined by a recognized reference method. The value determined by this method is then agreed to be the conventionally correct value.
• If this way is also not possible, for example because the round robin test is used to test a specific method for its usefulness as a reference method, or because the production of synthetic samples with “true” values fails, or because there is no other way to obtain a reference value, then, as a last resort, the common mean of all values after removal of so-called outliers defines the conventionally true value. This conventionally true value is not necessarily identical with the “true” value.

Note: First of all, it should be clear that the “true” value can never be known, regardless of whether the sample is synthetic or not. While the first two ways of determining a “conventional true value” describe possibilities for obtaining a useful approximation of the true value, the third option resembles a lottery. In the first two cases, the aim of the round robin test is to verify the ability of laboratories, while in the third case the goal may also be to test a so-called reference measurement method. The two goals are fundamentally different and therefore cannot be achieved using the same method (round robin test).

Critique

The evaluation of the results of a round robin test consists essentially of determining so-called tolerance intervals for the different measurement methods used in the laboratories. This is accomplished by determining the comparison standard deviation and the target standard deviation or, in other cases, the measurement uncertainty. In general the evaluation is based on the normal distribution, and only in special cases is a possible skewness (asymmetry) of the distribution considered. Whether a measurement method or a laboratory provides useful measurements depends only on the uncertainty of the measurement process. Each measurement method is based on a relationship between the quantity of interest and an observed quantity (the result of the measurement process).
Measurement uncertainty is thereby completely determined by the probability distribution of the observed quantity, which depends on the unknown value of the quantity of interest and the prevailing conditions in the laboratory. If the values of the observed quantity vary strongly for repeated measurements, the measurement uncertainty is large; if the values vary little, the measurement uncertainty is small. To evaluate the capability of a measurement procedure in a given laboratory, it would therefore be necessary to determine the corresponding measurement uncertainty and then compare it with the uncertainty of the measurement method under specified conditions. Instead, standard deviations are calculated with complicated statistical methods on the basis of fairly arbitrary distributional assumptions and the data obtained by the round robin test, and 2- or 3-sigma intervals and the required tolerance intervals are determined. This practice is briefly evaluated below:
• Normal distribution: In many cases, it is claimed to be proven that the observed quantity has a normal distribution. Such a proof is impossible, because no normally distributed variables exist in the real world. If the above statement is made nonetheless, it reveals a lack of understanding of stochastics.
• In the evaluation, only the (symmetric) normal distribution or skewed (asymmetric) distributions are considered. However, when the values of all laboratories are merged, what matters is not symmetry or asymmetry but the fact that the random variable considered is no longer unimodal.
• Since one can assume that the laboratory staff participating in a round robin test perform the measurements very carefully, the legitimacy of using robust methods and of identifying and excluding so-called outliers is at least questionable.
• All that is known about the so-called “conventional true value” is that the true value is unknown. How wrong the conventional value actually is cannot be determined from the calculated standard deviations.
• Accounting for the uncertainty realistically would require stochastic measurement methods. These do not provide a single measured value, of which it is only known that it is wrong, but a measurement interval that contains the true but unknown value of the quantity of interest with a fixed reliability of the procedure. If such stochastic measurement methods were applied, it would be possible to interpret and compare the measurement results of the laboratories.
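Two points of the critique can be made concrete with a small numerical sketch: the spread of repeated measurements within one laboratory, and the effect of merging the values of laboratories that differ by a systematic offset. The data below are invented, and the sample standard deviation is used only as a simple measure of spread; it is not the uncertainty concept argued for above.

```python
from statistics import stdev

# Hypothetical repeated measurements of the same sample in two laboratories.
lab_a = [10.0, 10.1, 9.9, 10.0]
lab_b = [12.0, 12.1, 11.9, 12.0]

# Spread of the repeated measurements within each laboratory.
within_a = stdev(lab_a)
within_b = stdev(lab_b)

# Spread after merging both laboratories into one sample. The merged sample
# has two separate clusters (around 10 and around 12), so its standard
# deviation mainly reflects the offset between the laboratories.
merged = stdev(lab_a + lab_b)

print(round(within_a, 3), round(within_b, 3), round(merged, 3))
```

In this example each laboratory repeats its own values closely, while the standard deviation of the merged values is more than ten times larger, although neither laboratory measures less precisely than before.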
Relevant Standards and Guidelines

For the implementation and evaluation of proficiency testing, there are several requirements in standards and legislation (e.g. the Medical Devices Act). The key standards are given in the following list:


• ISO/IEC 17011: Conformity assessment – General requirements for accreditation bodies accrediting conformity assessment bodies.
• DIN EN ISO/IEC 17025: General requirements for the competence of testing and calibration laboratories.
• DIN EN ISO/IEC 17043: Conformity assessment – General requirements for proficiency testing.
• DIN ISO 13528: Statistical methods for use in proficiency testing by interlaboratory comparisons.
• DIN ISO 5725 (series of standards): Accuracy (trueness and precision) of measurement methods and results.
• ISO/IEC Guide 43-1: Proficiency testing by interlaboratory comparisons – Part 1: Development and operation of proficiency testing schemes.
• ISO/IEC Guide 43-2: Proficiency testing by interlaboratory comparisons – Part 2: Selection and use of proficiency testing schemes by laboratory accreditation bodies.
• DIN-Taschenbuch 355
• DIN 38402 [6]
• DIN 38402 Part 41 (1984): German standard methods for the examination of water, waste water and sludge; General information (group A); Round robin tests, planning and organization.
• DIN 38402 Part 42 (1984): German standard methods for the examination of water, waste water and sludge; General information (group A); Round robin tests, evaluation.
• DIN 38402 Part 45 (2003-09): German standard methods for the examination of water, waste water and sludge; General information (group A); Round robin tests for the external quality control of laboratories.

[6] DIN 38402 comprises several standards that deal with analysis and proficiency testing in the context of water, waste water and sludge. For example, DIN 38402 Part 71 treats the topic of “equivalence of two methods of analysis based on the comparison of the test results on the same sample (same matrix)”.


References and Literature:
• ASTM E1301-95: Standard Guide for Proficiency Testing by Interlaboratory Comparison.
• GUM (1995): Guide to the Expression of Uncertainty in Measurement. BIPM.
• W. Horwitz (1982): Evaluation of Analytical Methods Used for Regulations of Food and Drugs. Anal. Chem. 54, 67A-76A.
• St. Kromidas (ed.) (2011): Handbuch Validierung in der Analytik. (2nd edition), Wiley-VCH, Weinheim, Germany.
• P.J. Lowthian and M. Thompson (2002): Bump-hunting for the proficiency tester – searching for multimodality. Analyst 127, 1359-1364.
• S. Riedel (2004): Erprobung neuentwickelter Schwingungsmodelle des sitzenden Menschen mittels Round-Robin-Test. Schriftenreihe der Bundesanstalt für Arbeitsschutz und Arbeitsmedizin. Forschung Fb 1029.
• M. Thompson (2000): Recent trends in inter-laboratory precision at ppb and sub-ppb concentrations in relation to fitness for purpose criteria in proficiency testing. Analyst 125, 385-386.
• M. Thompson, S.L.R. Ellison and R. Wood (2006): The International Harmonized Protocol for the Proficiency Testing of Analytical Chemistry Laboratories (IUPAC Technical Report). Pure and Applied Chemistry 78, 145-196.

Author(s) of this contribution: Karl Baur

Version: 1.00