Welcome to Physical Sciences 2 lab!

Author: Dwayne Dean
Welcome to Physical Sciences 2 lab! We're very excited about the labs for this course, and we hope you will be, too. Everything about the labs has been newly designed for a great educational experience with a minimum of annoying busywork. We've had a lot of fun working on the labs, and hopefully, you will have a lot of fun doing them. By this time you should all have sectioned for a lab time assignment. If you haven't, or if you don't remember your lab section time, please contact Kirill immediately: [email protected]. Lab 1 will run next week, from Tuesday, October 3 to Thursday, October 5. Before you show up to your first lab next week, we would like you to do three things:

1. Download the Logger Pro software. Logger Pro is the data collection and analysis software we will be using for all of the labs in this course. It is powerful and very easy to use, and is available for both Windows and Macintosh. The site license agreement allows any Harvard student to freely download and use the software. (If you don't have a PC or a Mac, or don't want to put Logger Pro on your own computer, you can use one of the computers in the Science Center computer labs.) The program can be downloaded from the HASCS Software Download Page: http://www.fas.harvard.edu/computing/download. Either version 3.4.5 or 3.4.6 is fine; 3.4.6 is the latest, but as of this writing, 3.4.5 is the version available from HASCS. We're told they are working on getting 3.4.6 up there.

2. Learn to use Logger Pro. We recommend you go through some of the tutorials that come with the software. To do so, go to File → Open and, under the folder labeled Experiments, find the subfolder called Tutorials. Tutorial #1 is a quick overview; #5 has information on entering data; #7 is a very brief summary on working with graphs; and #9 teaches you how to analyze data using curve fitting. Some of the other tutorials are also useful, but they require one or more sensors to be connected so that you can learn how to take data.

3. Read the attached handout, "An Introduction to Measurement and Uncertainty." This document contains ideas that will be new to many of you, even those with a background in statistics. We have tried to boil down the most important things you need to know about doing quantitative experimental science and put them in one place, so it is very important: we will be using the ideas from this document over and over throughout the labs this semester. If you have any specific questions about the document, please post them to the Lab Discussion Page on the course website, or contact your Lab TF.

That's it!
We look forward to a semester of fun, excitement, and instruction in the labs. See you next week!

Physical Sciences 2 and Physics 11a
An Introduction to Measurement and Uncertainty

1. Measurement and Uncertainty

In the laboratory portion of this course, you will perform experiments and make observations. You should distinguish between two types of observations: qualitative and quantitative observations. Although qualitative observations are an important aspect of experimental science (e.g., "I connected the battery and smoke started pouring out of the device"), we will focus on quantitative observations, or measurements. You will make measurements using various measuring devices, and report the values of these measurements.

Physical theories, such as Newton's laws of motion, make quantitative predictions about the outcomes of experiments: if we drop a ball from a height h above the ground, Newton's laws predict the speed of the ball when it strikes the ground. In order to test, refine, and develop our physical theories, we must make quantitative measurements. Although you make measurements every day—after all, a clock is a device that measures time—you probably do not give much thought to the process of measurement. The following schematic should help you think about this process:

• The physical system (what we measure): is described by certain parameters, such as position, time, velocity, mass, force, etc.
• The measuring device (takes a measurement): could be a stopwatch, ruler, balance, thermometer, etc. The experimenter may be "a part of the device."
• The measurement (what we record): must have three things: a numerical value, an estimated uncertainty, and units.

Within the paradigm of classical physics, we consider the parameters of the physical system to be defined to infinitely high precision. Any measuring device, however, has some limits on the precision of its measurements. For instance, you may measure time using a digital stopwatch that records time to the nearest millisecond. A measuring device observes a physical system and records a measurement. When you measure length using a ruler, the ruler alone is not a complete measuring device: you must interpret the markings on the ruler and record the measurement, so you are a part of the measuring device. A thermometer connected to a computer is a complete measuring device, since the computer records the measurements.

All measurements involve some uncertainty, or error. Physicists use the term error not to describe mistakes ("I dropped the thermometer and it broke") but to describe the inevitable uncertainty that accompanies any measurement. When we report a measurement, we must include three pieces of information: the numerical value of the measurement, the units of the measurement, and some estimate of the uncertainty of the measurement. For example, you might report that the length of a metal rod is 13.2 ± 0.1 cm. In the first lab activity of this course, we will try to explore exactly what is meant by uncertainty.

We distinguish between two types of error in measurement: systematic error and random error. The following illustration shows examples of these two types of error:

[Figure: two sets of measured values plotted against the "true" value. Left: large systematic error, small random error. Right: small systematic error, large random error.]

The set of measured values on the left exhibits a large systematic error: they are all lower than the true value of the parameter. The set of measured values on the right exhibits a small systematic error: they are, on average, neither higher nor lower than the true value of the parameter. However, the measured values on the right have more random error than those on the left: they vary more from one measurement to the next.

You may have heard the terms precision and accuracy used to describe measurements. A measuring device that has very little systematic error is said to be accurate: its measurements should, on average, be equal to the true value. A measuring device that has very little random error is said to be precise: repeated measurements of the same parameter should not vary much from one measurement to the next.

In principle, you can eliminate systematic error from your measurements by calibrating your measuring device. If you measure a standard object or system that has a known value for the parameter of interest, you can determine the sign and magnitude of the systematic error of your device and compensate for that error in your measurements.
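The calibration procedure described above can be sketched numerically. The following Python snippet is a toy model, not part of the lab procedure: the bias and noise values are invented for illustration. It simulates a thermometer with a fixed systematic offset and Gaussian random noise, estimates the offset by repeatedly measuring a known 0 °C standard, and then corrects for it:

```python
import random
import statistics

# Hypothetical device parameters (invented for this sketch).
TRUE_VALUE = 0.0   # known standard, e.g. an ice-water bath at 0 degrees C
BIAS = -0.8        # systematic error of the device
NOISE_SD = 0.3     # random error (standard deviation)

def measure(true_value):
    """One reading from the simulated device: truth + bias + random noise."""
    return true_value + BIAS + random.gauss(0, NOISE_SD)

random.seed(1)
readings = [measure(TRUE_VALUE) for _ in range(1000)]

# The mean reading of the known standard estimates the systematic error...
estimated_bias = statistics.mean(readings) - TRUE_VALUE

# ...which we can then subtract from subsequent measurements.
corrected = [r - estimated_bias for r in readings]
print(round(estimated_bias, 2))              # close to BIAS
print(round(statistics.mean(corrected), 2))  # close to TRUE_VALUE
```

Note that calibration removes the systematic error but leaves the random error untouched: the corrected readings still scatter with standard deviation NOISE_SD.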


For instance, you can use a mixture of ice and water at equilibrium (which will have a temperature of 0°C) as a standard reference point to calibrate a thermometer. Ideally, you should calibrate a measuring device at several different points over its range. A proper laboratory experiment should always check for the possibility of systematic error and compensate for that error by calibration. You can never eliminate random error from your measurements. Electronic measuring devices, for instance, suffer from various sources of electronic noise. All devices suffer from thermal fluctuations. Errors made by the operator of a device (“human error”) can be both systematic and random. For instance, if you measure the time of an event by pressing a button on a stopwatch, you are likely to press the button somewhat after the event has actually occurred (a systematic error), and the amount that you are late is likely to vary from one measurement to the next (a random error). In the preceding discussion, we have implicitly introduced the concept of making repeated measurements. You should ask: what does it mean to repeat a measurement? Often, a physical system will not “sit still” and wait for us to make repeated measurements. If we want to drop a ball from a height h and measure its velocity when it strikes the ground, we can probably make only one measurement of the velocity at that instant. Instead, we repeat the experiment using identical starting conditions and make a measurement of each experiment. In this case, we could take the same ball and drop it again from the same height h. As you might expect, this procedure introduces some error because we can never exactly reproduce the conditions of a particular experiment. We can control a small number of parameters (e.g. the mass of the ball, its initial height) but cannot control many other parameters (e.g. the velocity of every molecule of air in the room). 
Because our world is ultimately governed by quantum mechanics, we cannot even in principle control all the relevant physical parameters of a given experiment! We must, therefore, consider what parameters are likely to have a significant effect on our experiment and control those parameters to the best of our ability.

2. Repeated Measurements and Statistical Distributions

The "gold standard" of any physical experiment is to perform a huge number of measurements on repeated experiments with identical starting conditions. This procedure would yield not one measurement, but a statistical distribution of measurements. We can report a statistical distribution using a histogram. For instance, 50 repeated measurements of the velocity at the moment of impact of a particular ball dropped from a particular height might yield the following histogram:

[Histogram of the 50 measured velocities]

The x-axis of a histogram shows the values of the measured parameter, divided into bins of equal width; the y-axis shows the frequency, or number of times that a measured value fell within a particular bin. In the histogram shown above, the bins are centered around the values shown on the x-axis; the width of each bin is equal to 0.05 m/s. A histogram is the best way to report the results of repeated measurements of a parameter: one can see immediately the overall shape of the distribution, the mean (or average) of the distribution, and whether there are any notable statistical outliers (values that fall unusually far from the mean).

Obviously, it would be unwieldy to publish a histogram for every measured parameter in every experiment. Usually, we fit an idealized distribution to the measured histogram and report a few parameters that characterize the idealized distribution. In most cases, we can fit a normal or Gaussian distribution to the histogram. The normal distribution is characterized by two parameters: the arithmetic mean (often symbolized by the Greek letter µ or the symbol x̄) and the standard deviation (often symbolized by the Greek letter σ). An approximate formula for the Gaussian distribution (for histograms containing a total of N measurements with bins of width w) is:

Expected Gaussian frequency for bin centered around x:

    f(x) = (N w / (σ √(2π))) · exp(−(x − µ)² / (2σ²))

Here is the above histogram along with a Gaussian distribution calculated from the arithmetic mean and standard deviation of the measured data:

As you can see, the Gaussian distribution offers a reasonable approximation to the experimental distribution. Indeed, most experimental measurements yield histograms that are approximately Gaussian. Several features of the Gaussian distribution make it particularly useful in describing and analyzing experimental data. This distribution is characterized by only two parameters: the mean (µ) and the standard deviation (σ). For a Gaussian distribution, the mean (the arithmetic mean), the median (the “midpoint” of the data) and the mode (the highest point, or the most common result) are all identical:
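The bin-frequency formula above is easy to evaluate directly. As a sketch, the Python snippet below computes expected frequencies for a few bins; the mean, standard deviation, N, and bin width are taken from the ball-drop example used in this handout, while the particular bin centers are invented for illustration:

```python
import math

def gaussian_bin_frequency(x, mu, sigma, N, w):
    """Expected count in a bin of width w centered at x, for N measurements
    drawn from a Gaussian with mean mu and standard deviation sigma."""
    return (N * w) / (sigma * math.sqrt(2 * math.pi)) * math.exp(
        -((x - mu) ** 2) / (2 * sigma ** 2)
    )

# 50 measurements, mean 1.028 m/s, sigma 0.11 m/s, bins 0.05 m/s wide
N, w, mu, sigma = 50, 0.05, 1.028, 0.11
for x in (0.90, 1.00, 1.03, 1.10):
    print(f"bin at {x:.2f} m/s: expected frequency "
          f"{gaussian_bin_frequency(x, mu, sigma, N, w):.1f}")
```

The expected frequency peaks at the mean and falls off symmetrically, which is exactly the bell-shaped overlay drawn on the histogram.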


The standard deviation (σ) gives a measure of the spread or "width" of the distribution. Another common measure of the spread of a distribution is the full-width at half-maximum, or FWHM, which is exactly what it says: the full width of the distribution at the midpoint between the baseline and the peak of the distribution:

The standard deviation σ of a Gaussian distribution is related to the FWHM by the following equation:

    σ = FWHM / (2 √(2 ln 2)) ≈ FWHM / 2.35
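This relation follows from solving exp(−x²/(2σ²)) = 1/2 for x; a quick Python check of the numerical factor:

```python
import math

# The half-maximum of exp(-x^2 / (2 sigma^2)) occurs at
# x = sigma * sqrt(2 ln 2), so FWHM = 2 * sigma * sqrt(2 ln 2).
sigma = 1.0
half_width = sigma * math.sqrt(2 * math.log(2))
fwhm = 2 * half_width
print(round(fwhm / sigma, 3))  # the factor quoted as "approximately 2.35"
```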

You can use the standard deviation to estimate how many measurements will fall within a certain "distance" of the mean. The general rule (often called the "68–95–99.7 rule") states that:

• 68% of the measurements should fall within 1 std. dev. of the mean
• 95% of the measurements should fall within 2 std. dev. of the mean
• 99.7% of the measurements should fall within 3 std. dev. of the mean

We can understand the meaning of this rule by examining the area under the Gaussian curve within these limits:
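The percentages in this rule come from the area under the Gaussian curve: the fraction of measurements within k standard deviations of the mean is erf(k/√2), which Python's standard library can evaluate directly:

```python
import math

# Fraction of a Gaussian distribution within k standard deviations
# of the mean, reproducing the "68-95-99.7" rule.
for k in (1, 2, 3):
    fraction = math.erf(k / math.sqrt(2))
    print(f"within {k} std. dev.: {100 * fraction:.1f}%")
```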


Thus, knowledge of the standard deviation (which can be derived from a statistical analysis of the data, from fitting a Gaussian curve to a histogram, or from the FWHM of the distribution) allows you to estimate the probability that a measurement will fall within a certain range of the mean. This can be useful in deciding whether to eliminate a statistical outlier from your data. If your measuring device usually yields a Gaussian distribution of measurements, and you see a measurement that is, for instance, 4 standard deviations away from the mean, you may want to reject that measurement as an outlier. You should also analyze your experimental setup and your measuring device to see if you can determine why that measurement was erroneous.

3. Normally, Everything is Normal: The Ubiquitous Gaussian Distribution

In nearly all cases, the random error in any set of repeated measurements leads to a distribution of measurements that is approximately Gaussian. Why is this distribution so common? In statistics, this distribution is called the normal distribution: data that follow this distribution are said to be normally distributed. We can understand why this distribution arises using an important result from statistics known as the central limit theorem. The central limit theorem says that if you add together an infinite number of uncorrelated random variables—with the stipulation that each random variable must have a mean of zero and a finite standard deviation—the result will be a Gaussian distribution.

Let's think about this for a minute. First, we require that the variables be random and uncorrelated (not correlated with one another). Those requirements should be intuitively obvious. Next, we require that each variable must have a mean of zero. That is another way of saying that the random variables should not introduce any systematic error: on average, each random variable should not add or subtract anything to the sum. Finally, we require that each random variable have a finite standard deviation (stated more often as the requirement of a finite variance, which is simply the square of the standard deviation).
Any random variable that has an infinite standard deviation would be unbounded, which poses a challenge to our intuitive notion of randomness: what would you do if someone told you to pick a random number between one and infinity? (It is mathematically possible to have an unbounded random variable with a finite standard deviation—indeed, the Gaussian distribution is an example—but all physical random variables will be bounded by some limits.) As long as those requirements are fulfilled, the sum of all the random variables will approach a Gaussian distribution as the number of random variables approaches infinity. This theorem places no other requirements on the distribution of each random variable. For instance, a sum of an infinite number of bimodal distributions will yield a single Gaussian distribution.

How is the central limit theorem related to uncertainty in physical measurements? In any experiment, there will be many sources of error: electronic noise, operator error, thermal fluctuations, etc. We assume that any systematic error has been eliminated by proper calibration of the measuring device. Thus, the average error introduced by all of these various sources should be zero. We expect that these sources of error are uncorrelated, and they must be bounded by some physical limits, so they will have a finite standard deviation. Finally, we assume that these sources of error are additive: that is, the total error is the sum of each of the individual sources of error. As long as there are a large number of such sources of error, the total distribution will approximate a Gaussian distribution. Any experiment that yields a non-Gaussian distribution probably has some source of systematic error or some hidden correlation between the random sources of error.

We should note that we are considering physical measurements in which the uncertainty of measured values arises from random errors in the measuring device, not from variations in the "true" value that is measured. Within the realm of classical physics, we assume that the "true" value of any physical parameter has no uncertainty and that all of the uncertainty arises from the process of measurement. Thus, the measured distributions are nearly always Gaussian. In many other applications of statistics, however, the underlying parameter may exhibit intrinsic variance and a notably non-Gaussian distribution.
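A quick simulation illustrates the central limit theorem. In the Python sketch below (the counts are arbitrary choices, not from the lab), each "source of error" is a crude bimodal variable, +1 or −1 with equal probability; the sum of 100 such sources nevertheless behaves like a Gaussian with mean 0 and standard deviation √100 = 10:

```python
import random
import statistics

random.seed(0)
N_SOURCES = 100   # uncorrelated error sources added per "measurement"
N_TRIALS = 5000   # number of repeated "measurements"

# Each source contributes +1 or -1: mean zero, variance 1, clearly bimodal.
sums = [sum(random.choice((-1, 1)) for _ in range(N_SOURCES))
        for _ in range(N_TRIALS)]

print(round(statistics.mean(sums), 2))    # near 0
print(round(statistics.pstdev(sums), 1))  # near sqrt(N_SOURCES) = 10
```

Plotting a histogram of `sums` would show the familiar bell curve, even though no individual source is remotely Gaussian.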
For instance, the distribution of family incomes in the United States is highly non-Gaussian: the vast majority of families have moderate incomes, but there is a long "tail" that extends up to very high incomes. Such distributions are said to be skewed. Under most circumstances, highly skewed distributions will not result from random measurement errors.

The central limit theorem properly applies only in the limit of an infinite number of random variables. If one examines how the sum of a finite number of random variables converges on a Gaussian distribution, one observes that the central part of the distribution converges quite rapidly, but the "tails" of the distribution converge more slowly. Although the Gaussian distribution is mathematically unbounded, you should not take the extreme tails of this distribution seriously: in the "ball drop" experiment, for instance, a literal interpretation of the Gaussian distribution would suggest that there is a non-zero probability of measuring a negative velocity, or a velocity faster than the speed of light. Likewise, although a graph of the heights of adult women shows an approximately Gaussian distribution, a literal interpretation of this distribution would suggest that there is a non-zero probability of finding an adult woman who is 100 feet tall. As far as physical measurements are concerned, you should regard the central limit theorem as a statement that the middle of a measured distribution should look approximately Gaussian.

4. Repeating Measurements: Standard Deviation and Standard Error

Ideally, you would repeat every measurement enough times to plot a histogram and confirm that the distribution is indeed Gaussian. In reality, though, such a procedure would be unnecessarily time-consuming. Many experiments involve repeating a similar measurement for several different initial conditions. For instance, you might measure the velocity of a ball upon impact after dropping it from various heights. You could drop it from one height 50 times, then drop it from a different height 50 times, and so on. Or, you could drop it from a single height 50 times, confirm that the distribution is Gaussian with a particular standard deviation, and then drop it from each other height only once. You could assume that the standard deviation of the other experiments should be about the same as the standard deviation of the first experiment. As long as the various sources of experimental error are random and uncorrelated, this assumption is reasonable.
With each such measurement, you can report the expected standard deviation of that measurement. You have implicitly followed this procedure whenever you have used a standard measuring device that has a stated uncertainty. For instance, a laboratory balance might state an uncertainty of "±0.1 mg." In this case, the manufacturer has made repeated measurements of various masses and found that the standard deviation is 0.1 mg. You could, with confidence, make a single measurement of the mass of an object and report it with an uncertainty of 0.1 mg. (Of course, you would have to be sure that the balance is in good working order and that it has been calibrated properly. We spend tens of thousands of dollars each year to calibrate the laboratory equipment used in the teaching labs in the Science Center!)

In order to determine the standard deviation of a measuring device, you must collect enough repeated measurements to verify that the distribution is indeed approximately Gaussian. You must also collect enough measurements to have some measurements in the "tails" of the distribution. A good rule of thumb is that a standard deviation will be fairly accurate if you collect at least 30 repeated measurements. With that number of measurements, you should obtain some measurements beyond two standard deviations from the mean (according to the "68–95–99.7" rule), and you can verify that the distribution of measurements is approximately Gaussian.

Even if you know the standard deviation of a measuring device, you might still want to make repeated measurements. Making repeated measurements should not change the standard deviation of the measurement: we expect that the standard deviation is an intrinsic property of the particular experiment and measuring device. However, making repeated measurements will reduce the standard error of the mean for the measurement. The standard error of the mean for a series of repeated measurements is related to the standard deviation σ and the number of measurements N:

    Standard error = σ / √N
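Using the ball-drop numbers quoted in this handout (σ = 0.11 m/s from 50 repeated measurements) as an example, the standard error works out as follows in Python:

```python
import math

sigma = 0.11  # standard deviation of a single velocity measurement (m/s)
N = 50        # number of repeated measurements

# Standard error of the mean: sigma / sqrt(N)
standard_error = sigma / math.sqrt(N)
print(f"standard error = {standard_error:.3f} m/s")  # prints 0.016 m/s
```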

The standard error can be thought of as the standard deviation of the mean of a series of repeated measurements. For instance, in the above example the standard deviation is σ = 0.11 m/s. The experiment was repeated 50 times, so the standard error is 0.016 m/s. We could report the result of these 50 measurements in the following manner:

    Velocity = 1.028 ± 0.016 m/s (N = 50)

Note that the reported uncertainty of ±0.016 m/s is the uncertainty of the mean, not the standard deviation of the measurement itself. When you report a measurement in this fashion, you are implicitly reporting a distribution of measurements, not a single measurement. Providing the number of measurements (N = 50) tells the reader that you repeated the measurement 50 times. As a side note, if you are reporting a value using scientific notation, you should include the standard deviation within the mantissa, as in the following example:

    Velocity = (1.028 ± 0.016) × 10⁻³ km/s (N = 50)

In general, when a reader sees a measured value reported as "xxx ± yy" he or she will assume that the distribution is approximately Gaussian with a mean of xxx and a standard error of yy. You should keep that assumption in mind when reporting scientific data. Whenever you make repeated measurements, you should:

i) Construct a histogram from your data.
ii) Calculate the mean and standard deviation.
iii) Draw a Gaussian curve for the calculated mean and standard deviation.
iv) If the Gaussian curve is a reasonable fit to the observed histogram, you may report the mean and the standard error of the mean as described above. If not, you should probably report the full histogram.

Knowing the standard error of the mean allows us to estimate the confidence we have in our measurement of the mean. Using the "68–95–99.7 rule", we can be 68% confident that the true velocity is within one standard error of the mean, and 95% confident that the true velocity is within two standard errors of the mean. (Of course, this conclusion is true only if we have eliminated the possibility of systematic error.) Thus, with 50 measurements, we can state that there is a 95% chance that the true velocity lies between 1.00 and 1.06 m/s.

We use these confidence intervals when we compare the results from various experiments. For example, we might perform another "ball drop" experiment with a heavier ball. As long as air resistance is negligible, the velocity upon impact should be the same with the heavy ball as it was with the light ball. If we find, for instance, that the velocity of the heavy ball is between 1.03 and 1.09 m/s (with a confidence of 95%), then the velocity of the heavy ball is statistically indistinguishable from that of the light ball measured earlier.
However, if we find that the velocity of the heavy ball is between 1.08 and 1.14 m/s (at a 95% confidence level), then we can be 95% certain that the velocity of the heavy ball is indeed greater than that of the light ball.

If we had made only one measurement, the standard error would be equal to the standard deviation. Suppose, for instance, that we made only one measurement of the velocity and we "got lucky": the measurement was 1.028 m/s (the same as the mean that we obtained from making 50 measurements). We would report this observation as:

    Velocity = 1.03 ± 0.11 m/s (N = 1)

The standard error, for one measurement, is equal to the standard deviation (0.11 m/s). In this case, we could claim only that there is a 95% chance that the true velocity lies between 0.81 and 1.25 m/s. Although the standard deviation is the same in both cases, the use of repeated measurements allows us to make a much more precise statement about the mean of the distribution. You should keep in mind both the standard deviation and the standard error of the mean in any discussion or analysis of experimental measurements.

Note that the standard error is inversely proportional to the square root of the number of measurements. Thus, to narrow the standard error by a factor of 10, you would need to make 100 repeated measurements. You could achieve the same result by improving the experiment and the measuring device to reduce the intrinsic standard deviation by a factor of 10. Depending on the experiment, one of these procedures may be more straightforward than the other. Some physical experiments use thousands or millions of repeated measurements—collected automatically by a computer—to reduce the standard error of the experiment to within reasonable bounds.

5. Propagation of Error

You may have encountered the dreaded term "propagation of error" in a previous science course. The central concept is that any arithmetic operations on uncertain numbers will produce a result that is uncertain; the tools of "propagation of error" allow us to estimate this resulting uncertainty. We do not expect you to memorize formulas for the propagation of error: you can find such formulas in standard textbooks or on the Web. We will simply walk through one example so you can see the general concept of propagation of error and understand how it works. In your first lab activity, you will simulate sources of random measurement error using three different techniques.
You will assume that the "true value" of a measured parameter is 100, and you will model the "experimental error" by rolling dice, flipping coins, and choosing random digits from a phone book. Each of these sources of error should be random, and you will add them all to the true value of 100 to yield the value that is measured by the (hypothetical) "noisy instrument":

    (100) + (dice) + (coins) + (phonebook) = measurement

We assume that 100 has no uncertainty, since it is the "true value." Each of the other values—dice, coins, and phonebook—has some uncertainty, as does the sum. You will calculate the standard deviations of each of these sources of error in your lab activity. Let us represent the standard deviations of the values dice, coins, phonebook, and measurement by the symbols σd, σc, σp, and σm respectively. Using these symbols, the expected standard deviation of the total measurement can be calculated from the formula for the propagation of error for addition:

    σm² = σd² + σc² + σp²

You will usually see this formula for the propagation of error written (equivalently) as:

    σm = √(σd² + σc² + σp²)

This formula is sometimes referred to as the "RSS" formula for propagation of error: the initials stand for "root of sum of squares." There is an analogous formula for the propagation of error that uses the standard error instead of the standard deviation. That is, if we denote the standard error of the mean for the individual values as SEd, SEc, SEp, and SEm, the expected standard error of the mean for the overall measurement is given by:

    SEm = √(SEd² + SEc² + SEp²)

This fact that the squares of the individual errors are added together to yield the square of the overall error is often summarized by the statement "errors add in quadrature." (Recall that a quadratic equation is an equation that contains a squared term like x².) If errors added linearly, then the multiple sources of error in physical experiments would accumulate so quickly that it would be exceedingly difficult to make any precise measurements.

As an example, consider a measurement whose "true value" is 100 in which there are five sources of error, each with a standard error of 10. If the errors added linearly, we would expect the total error to equal 50; that is, we would expect the measured values to range from 50 to 150. Since the errors add in quadrature, however, the expected standard error is:

    SE = √(10² + 10² + 10² + 10² + 10²) = √500 ≈ 22.4

which is less than half of the standard error that would be expected if the errors added linearly. We can add errors in quadrature when we expect the errors to be uncorrelated.
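The linear-versus-quadrature comparison above takes only a few lines to check in Python:

```python
import math

# Five uncorrelated sources of error, each with a standard error of 10.
errors = [10] * 5

linear_total = sum(errors)                           # pessimistic linear sum
rss_total = math.sqrt(sum(e ** 2 for e in errors))   # added in quadrature (RSS)

print(linear_total)         # 50
print(round(rss_total, 1))  # 22.4
```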

For instance, we expect that it is extremely unlikely that in a single experiment all the sources of error are +10, or that all the sources of error are –10. If some of the errors are correlated, we must use other formulas for the propagation of error that account for the correlations. Such considerations are beyond the scope of this course.

6. "Executive Summary"

• All measurements exhibit random error, which is unavoidable, and systematic error, which can be eliminated by proper calibration of the measuring device.
• Repeated measurements yield a statistical distribution that is almost always Gaussian; such a distribution is characterized fully by its mean (µ) and standard deviation (σ).
• The standard deviation is a measure of the width of the distribution, and is mathematically related to the full width at half-maximum, or FWHM.
• You should repeat one measurement at least 30 times with a particular measuring device to determine the intrinsic standard deviation of that device.
• You may choose to repeat other measurements to minimize the standard error of the mean, which is inversely proportional to the square root of the number of measurements.
• The standard error of the mean is a measure of the uncertainty of the mean; you can be 95% confident that the "true" mean lies within 2 standard errors of the measured mean.
• Uncorrelated random errors add in quadrature: the overall error is the root of the sum of the squares of the individual sources of error.

For more information on error analysis and propagation of errors, you should consult the excellent text by John R. Taylor, An Introduction to Error Analysis, 2nd ed., Sausalito, CA: University Science Books, 1997.
