Use of Statistical Process Control (SPC) Versus Traditional Statistical Methods in Personal Care Applications


Raymond W. Phillips Dow Corning Corporation Midland, MI 48686 USA Presented at the 17th IFSCC International Congress Yokohama October 13-16, 1992

Abstract The basis for traditional “statistical inference” is the assumption of statistical stability. However, observations indicate that this assumption is often incorrect and has significant limitations for the study of skin and hair systems. This paper focuses on the use of analytic statistical analysis versus enumerative statistical analysis for evaluating data from experiments and product studies. Traditional statistical analysis depends on the experimenter to classify random variation (i.e., experimental noise) based on experience. This usually results in an inflated estimate of noise level, which can obscure valuable signals from the study. W. A. Shewhart developed control charts to evaluate data taken during analytic studies; these charts represent a powerful tool for gaining an understanding of experimental variation and results when compared to methods of statistical inference. Analysis of experimental data using the methods of statistical process control (SPC) allows the assumption of statistical stability to be tested. Further, the graphical techniques of SPC allow the experimenter to “see” data, allowing insight into cause-and-effect relationships not readily apparent with traditional statistical methods. SPC techniques allow discovery of sources of non-random variation, resulting in a better understanding of the system being studied.

Introduction According to W. Edwards Deming, the great contribution of control charts is to separate variation into two sources: 1) the system itself (“chance causes,” as Shewhart refers to them), which is the responsibility of management, and 2) assignable causes, which Deming calls “special causes;” those specific to some event that can usually be discovered to the satisfaction of the expert on the job. Shewhart and Deming have pointed out that measurement is itself a process, subject to both sources of variation. Results obtained from two instruments, from a single instrument on different days, or from an instrument operated by different operators, cannot be usefully compared unless the process is “in statistical control”— and statistical control is ephemeral. Control charts are the most useful statistical method for presenting data from laboratory studies.

Traditional statistical studies are enumerative in nature. That is, they are designed to tell “how much” or “how many” and are not intended to be predictive. The conclusions can be related only to the frame or population studied, and extrapolation from the results of an enumerative study to the future is at best misleading. Hypothesis testing and determination of confidence limits based on an assumed distribution are examples of enumerative methods. In contrast, graphical presentation of data in time order (that is, by a control chart) allows the determination of statistical stability, which gives the basis for predicting future performance of the system being studied. Differentiation of variation into chance causes or assignable causes provides a powerful tool with which to view a “process.” Further, the ability to distinguish between random system noise and non-random assignable cause variation provides insights that can remain hidden in enumerative statistical techniques; typically these variations can be seen in designed experiments only if the researchers already know what they are looking for, and most often that is not the case. Variation from subject to subject, day to day, hour to hour, position to position, and of course, treatment to treatment, is easily categorized into these two types using rational subgrouping along with control charts.

Shewhart defined an assignable cause of variation as one that can be found by experiment, without costing more than it is worth. As variation in a process is reduced, either by eliminating individual assignable causes, or by changes to the process that reduce random background noise, sources of variation hidden in the previous “noise” of the process become obvious. Continuous refinement of a measurement process leads to increasingly precise measurements.
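As a rough illustration of separating chance causes from assignable causes, the sketch below computes limits for an individuals (XmR) chart on time-ordered data. The choice of chart type is an assumption made here for simplicity, since the paper does not say which chart was used, and the readings and the function names `xmr_limits` and `signals` are hypothetical.

```python
from statistics import mean


def xmr_limits(values):
    """Return (center, lower, upper) limits for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two time-ordered values")
    # Moving ranges: absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 is the usual scaling factor 3 / d2, with d2 = 1.128 for ranges of two.
    half_width = 2.66 * mean(moving_ranges)
    return center, center - half_width, center + half_width


def signals(values):
    """Indices of points outside the limits (candidate assignable causes)."""
    _, lower, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical time-ordered readings, invented for illustration only.
readings = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3, 9.7, 10.1, 14.0, 10.0, 9.9, 10.2]
center, lower, upper = xmr_limits(readings)
print(f"center={center:.2f} limits=({lower:.2f}, {upper:.2f})")
print("points to investigate:", signals(readings))
```

Points inside the limits are treated as system noise; a point outside them is a prompt to look for a specific cause rather than proof that one exists.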

Applying the Principles to Personal Care Studies of personal care products such as those for skin and hair care are generally analytic in nature. An experiment is devised to measure one or more performance characteristics, and the objective is to predict the performance of a product or treatment in future tests. The resulting data are analyzed to guide development work or to choose a product that will perform predictably in the future.


Although enumerative statistical techniques have traditionally been used to characterize data of this type, differentiation into the two categories of “noise” and “signal” is not automatic. Hypothesis testing can be used to test for evidence of similarities or differences between potential classes (for example, treatments, time, or hair types). The goal of experimentation is to gain understandable and usable results, and properly designed experiments are powerful tools for finding evidence of assignable causes of variation and interaction between factors. However, these same experiments become cumbersome and fail to reveal the sources of variation if they are not properly designed. Analysis of the results of many powerful statistical techniques is not always intuitive, and the signals in well-designed analysis of variance (ANOVA), fractional factorial, or orthogonal arrays often remain hidden in the somewhat obscure calculations. The graphical techniques of analysis of means (ANOM) and control charts often render obvious what the mathematical techniques and double negatives of confidence tests and other methods obscure. Data presented on a control chart showing the succession of development or study present a complete and up-to-date history of the available evidence for indicating the progress that has been made. Cause and effect relationships are easily identified and quantified by annotating the charts.

Figure 1. Changes in the accepted values for certain universal constants between 1870 and 1940 suggest statistical instability. (Horizontal axis: year.)


Key to any technique designed to analyze data is the concept of a universe, a population, a frame, or some system of constant causes giving rise to a distribution. The assumption that a universe exists leads to the various tests of whether samples plausibly came from that assumed universe or did not. A simple example might be the expected distribution from multiple rolls of a pair of dice. A result of thirteen is not an expected outcome, nor are three consecutive elevens likely. Either of these outcomes would signal that the “process” is not a pair of fair dice. Testing of personal care products follows the same line of reasoning. All the sources of variation broaden the distribution of expected outcomes, making it more difficult to see the effect of a treatment or the differences between treatments. Use of control charts not only helps researchers identify and eliminate much of this noise, but also leads to a better understanding of the system of causes in hair and skin care products.

The assumption of statistical stability implies an underlying universe, a system of chance causes with the absence of assignable causes of variation. It should not be made lightly. For example, Shewhart looked into the assumption of stability and the phrase “essentially the same conditions” as used by scientists to mean that a system or process has statistical stability. As Figure 1 illustrates, in the graph of three fundamental physical constants, Shewhart found that the speed of light (c), the gravitational constant (G), and Planck’s constant (h) did not appear as though they came from a stable system. Certainly these measurements are among the elite of all measurements; yet it appears that sources of non-random variation existed at the time of their measurement, and that today’s scientists are left to use their own experience to determine the cause. To avoid problems of this nature, a Range chart or standard deviation chart is used to assess the stability of variation and to quantify the level of background noise, which is then used to differentiate between levels of treatment.
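The dice example can be made concrete by enumerating the outcomes of two fair six-sided dice. The sketch below is only an illustration; it assumes independent rolls, and the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely totals from two fair six-sided dice.
totals = [a + b for a, b in product(range(1, 7), repeat=2)]


def prob(total):
    """Exact probability of a given total on one roll of the pair."""
    return Fraction(totals.count(total), len(totals))


p_thirteen = prob(13)            # impossible under the assumed "universe"
p_eleven = prob(11)              # only (5, 6) and (6, 5) give eleven
p_three_elevens = p_eleven ** 3  # assumes the three rolls are independent

print("P(13) =", p_thirteen)
print("P(11) =", p_eleven)
print("P(three consecutive elevens) =", p_three_elevens)
```

An outcome that is impossible or very improbable under the assumed system of chance causes is evidence that the assumed system is not the one that produced the data.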

An Example: Conductance Studies Figure 2 is a Range chart for conductance measurements on blank, or untreated, skin. Ranges greater than the upper control limit signify variation in excess of that expected from a stable system of chance causes. The right arm is less predictable than the left; only three signals of non-random variation occurred on the left arm, while 17 occurred on the right. Therefore, the assumption of stability is less likely to hold for data from the right arm, and apparent treatment signals on the right arm are more likely to come from some unidentified source than from the treatment studied. This phenomenon had not been previously noted by the researchers, and the source of variation was aggregated as “background” when t-tests were used to determine significance between treatments. Because the cause remained undetermined, right-arm data were eliminated from the study, and work is continuing on the left arm only. Focusing on the left arm has greatly improved the ability to differentiate between treatments. A Range chart on blanks is now used to screen subjects, eliminating this source of variation from current testing. This improves the precision of conductance tests and enhances the ability of researchers to differentiate between treatments.

Figure 2. Range charts help identify potential instability in experimental data. (Panel label: right-arm data; horizontal axis: sample.)

A full analysis of the moisturization data from both transepidermal water loss (TEWL) and conductance has been published [1] and will not be described in detail here. Figure 3 demonstrates how data can be rationally subgrouped by combining all blanks from the left arm by position as a means of seeking relationships, i.e., blank TEWL measurements by position on the left arm. The control limits signify the amount of variation that can be explained by measurement precision. The graph shows a linear increase of blank TEWL with position on the forearm. This shows not only that the differences are significant, in that they cannot be explained by test-retest variation, but also the relative magnitude of TEWL upward along the arm.
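The Range-chart screening of blank measurements described above might be sketched as follows. The D4 factors are the standard Shewhart constants for small subgroups; the subgroup structure (pairs of repeat readings) and all data values are assumptions invented for illustration, not data from the study.

```python
from statistics import mean

# Standard D4 factors for the upper control limit of a Range chart,
# keyed by subgroup size.
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def range_chart(subgroups):
    """Return (mean range, upper control limit, indices of flagged subgroups)."""
    sizes = {len(g) for g in subgroups}
    if len(sizes) != 1:
        raise ValueError("all subgroups must have the same size")
    n = sizes.pop()
    if n not in D4:
        raise ValueError(f"no D4 factor tabulated here for subgroup size {n}")
    ranges = [max(g) - min(g) for g in subgroups]
    r_bar = mean(ranges)
    ucl = D4[n] * r_bar
    flagged = [i for i, r in enumerate(ranges) if r > ucl]
    return r_bar, ucl, flagged


# Hypothetical repeat conductance readings on untreated skin.
left_arm = [(20.1, 20.4), (19.8, 20.0), (20.3, 20.1),
            (19.9, 20.2), (20.0, 20.3), (20.2, 20.0)]
right_arm = [(20.0, 20.3), (19.7, 20.1), (20.2, 23.9),
             (20.1, 19.9), (19.8, 20.0), (20.3, 20.2)]

for name, data in (("left", left_arm), ("right", right_arm)):
    r_bar, ucl, flagged = range_chart(data)
    print(f"{name} arm: mean range={r_bar:.2f}, UCL={ucl:.2f}, signals={flagged}")
```

A subgroup whose range exceeds the upper control limit marks variation beyond what test-retest noise explains, which is the kind of signal used to question the stability of the right-arm data.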

Clearly, measurement of treatment effects must be corrected for positional influences, a step that helps increase the precision of the data analysis. Traditional statistical treatment of the data would use randomization of sites to eliminate the influence of site on treatment. However, an approach of this type aggregates what is an assignable cause into the realm of background noise, thus dulling the analysis.
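One way to correct for positional influence, sketched here under the assumption that blank TEWL rises linearly with forearm position, is to fit a line to the blank readings and subtract the fitted blank level at each site. The paper does not describe the correction actually used, so the helpers `fit_line` and `position_corrected` and all values below are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("positions must not all be identical")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical mean blank TEWL at five forearm positions (arbitrary units).
positions = [1, 2, 3, 4, 5]
blank_tewl = [4.0, 4.5, 5.0, 5.5, 6.0]
slope, intercept = fit_line(positions, blank_tewl)


def position_corrected(treated_value, position):
    """Treated reading minus the blank level expected at that position."""
    return treated_value - (intercept + slope * position)


# The same raw reading means different things at different sites.
print(position_corrected(5.0, 1))  # above the expected blank at position 1
print(position_corrected(5.0, 5))  # below the expected blank at position 5
```

Treating position as a known, assignable cause in this way keeps it out of the noise estimate, whereas randomizing sites would fold it into the background.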

Figure 3. Creating subgroups of data can help identify non-random variability. (Chart labels: upper control limit; left-arm values.)

The Use of Histograms Histograms are powerful analytical tools that are often overlooked. Elton Trueblood has said, “There are people who are afraid of clarity, because they fear it may not seem profound.” Presenting data in the form of histograms can result in a clear picture of the knowledge contained in the data. In laboratory studies, data are generated as evidence for various types of inferences. Shewhart proposed a primary rule for the presentation of data: to be useful, original data should be presented so as to preserve the evidence in the original data for all the predictions assumed. He warned that for statistical control, it is not sufficient that the data be taken under “presumably the same conditions.” Histograms can reveal lack of statistical stability; that is, they may show evidence of different underlying populations. Figure 4 is a three-dimensional histogram depicting the reduction in conductance measurements of petrolatum-treated skin over time. If only random variation existed, measurements on untreated skin would result in a distribution centered at zero and extending to ± three standard deviations. However, Figure 4 not only depicts the decrease over time in conductance measurements on treated skin, but also suggests, through the spread of the histogram at 2, 4, and 6 hours after treatment, the presence of other assignable causes of variation in the data. Both the level of change and the variation of the treatment can easily be “seen” by the researcher, and the effect is not obscured by the language of mathematics or other analysis.

Figure 4. Three-dimensional histograms can help scientists “see” both the level of change and the variation in treatment without complex analysis. (Horizontal axis: standard deviation units.)

Clearly, the true power of statistical analysis comes from the partnership of the scientist and the statistician: The statistician brings knowledge of variation and data presentation to the knowledge of the scientist, who is an expert in the particular subject. Thus, the evolution of knowledge begins with the scientist and ends with the statistician, and in between, the two must cooperate.

In many instances, the analysis of hair treatments is subjective. Panels are asked to quantify the effects of various treatments on samples of hair, typically rating them from 0 to 10. In a recent study, researchers set out to measure the effect of three hair treatments. Samples of untreated hair were collected and randomly treated, then evaluated by a panel for various qualities. The researchers then “analyzed” the results via a user-friendly computer package with ANOVA capabilities. ANOVA is a powerful statistical technique used to detect signals in experimental data. However, the result is “a significant F test,” rather than results that can be understood and used. A significant F test shows only that a signal exists in the data; it does not indicate what the signal is. Further analysis is required to determine which treatments differ from each other; the original researchers were not aware of this, and the computer failed to tell them. They assumed the cause of the signal based on their subject knowledge, reinforced with a significant F, and concluded that C differed from A or B, which made sense from a technical standpoint. Another researcher analyzed the data using histograms of the original data, then questioned the conclusion. Clearly, C differed from A, but the bimodal nature of B was lost in the ANOVA technique. In fact, B could not be distinguished from C or A using statistical tests, but the bimodal distribution of B was a signal of the performance of treatment B, which led to further insight into the performance of all three hair treatments. The histogram revealed to the second researcher the evidence of a different system of causes contained in the original data, which was obscured in the data summarized by the first group of researchers.
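The contrast between a bare F statistic and a histogram of the original ratings in the hair-treatment study can be illustrated with a small sketch. The ratings below are invented to mimic the situation described (treatment B bimodal); they are not the study's data, and `one_way_f` and `text_histogram` are hypothetical helper names.

```python
from collections import Counter
from statistics import mean


def one_way_f(groups):
    """F statistic for a one-way analysis of variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def text_histogram(ratings):
    """One text row per panel score from 0 to 10."""
    counts = Counter(ratings)
    return [f"{score:2d} | {'*' * counts[score]}" for score in range(11)]


# Hypothetical panel ratings on a 0-10 scale, invented for illustration.
ratings = {
    "A": [3, 4, 4, 5, 5, 5, 6, 6, 7, 5],
    "B": [2, 3, 3, 3, 2, 8, 9, 8, 9, 8],
    "C": [7, 8, 8, 9, 8, 7, 9, 8, 8, 8],
}

f_stat = one_way_f(list(ratings.values()))
print(f"F = {f_stat:.2f}")  # a single number: says a signal exists, not what it is

for name, values in ratings.items():
    print(f"\nTreatment {name} (mean {mean(values):.1f})")
    print("\n".join(text_histogram(values)))
```

In this invented data set, treatment B's mean sits between those of A and C, yet its histogram shows two separate clusters with nothing in the middle, a feature that the F statistic and the group means alone do not reveal.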

Conclusions Statistical analysis is necessary when variation exists; it is not necessary to tell researchers what they already know. Caution should be exercised, however, with assumptions of stability, and when experts render opinions regarding “essentially the same process.” Simple graphical presentation, including control charts, should be used to supplement the experts’ view of experience. As scientists, we generate data to acquire knowledge; analytical studies are conducted with the aim of prediction. Shewhart pointed out that to every prediction there corresponds a certain degree of rational belief; that is the function of control limits. Deming warns that the statistician’s levels of significance furnish no measure of belief in a prediction. Probability has use; in analytic studies, tests of significance do not. The acquisition of knowledge is a continuing process, and control charts can give a picture of this process. They also provide the necessary degree of rational belief in the predictions.

DOW CORNING CORPORATION MIDLAND, MICHIGAN 48686-0994 Printed in the U.S.A. on recycled paper.

Form No. 25-270-92

References 1. Malczewski, R.M., and R.W. Phillips, “The Use of Statistical Process Control to Analyze Moisturization Study Data,” presented at the 16th International Congress of the IFSCC, New York, NY, 1990.

For Further Reading Shewhart, W.A., Statistical Method from the Viewpoint of Quality Control, Dover Publications, Inc., 1986.
Shewhart, W.A., Economic Control of Quality of Manufactured Product, Van Nostrand, 1931.
Wheeler, D.J., Understanding Industrial Experimentation, Second edition, SPC Press, 1990.

Author Raymond W. Phillips is a consultant in statistical methods for Dow Corning and is the company’s primary internal SPC consultant. He holds a B.S. degree in chemical engineering, has pursued graduate studies in chemical engineering and engineering management, and is a registered professional engineer in Michigan, USA.
