Analysis of Variance (ANOVA)

At its core, ANOVA is a statistical test of whether or not the means of several groups are equal. Please visit the BOSS website for a more complete definition of ANOVA.

One Way ANOVA

Assumptions for One Way ANOVA

1. All sample populations are normally distributed.
2. All sample populations have equal variance.
3. All observations are mutually independent.

Loading Data

1. Download the necessary file(s). You do this by right-clicking the file and choosing the save option, or by double-clicking it.
2. Open the folder for your directory (it should include your netid) and save the pertinent files into your directory folder. This will allow you to access those files from MATLAB.
3. Open MATLAB, either remotely or by launching it directly if you have it on your computer.
4. Open a new editor window.
5. Type the clear commands: clear, clf, and clc. (clear all also removes the variables, but it does not clear the figure or the command window.)
6. Type load filename.dat (do this for all the files from steps 1 and 2).
7. Run the code using the run button on the toolbar, and check the MATLAB command window to see whether the data is given as rows or columns.
8. If the data is given as columns, separate the matrix into column vectors by typing this into the editor window: x = filename(:,1); y = filename(:,2); (x and y can be whatever you want the first and second variables to be called). If the data is given as rows, use filename(1,:) and filename(2,:) instead, and remember to add a prime (') to the end of each expression to change the rows into columns.
9. Create a matrix M whose columns are the vectors created in step 8. Your code should look like this: M = [x, y].
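The loading steps above can be sketched as one short script. This is only a minimal sketch: the file name mydata.dat and its values are hypothetical, and the first two statements exist only to create a stand-in file so the example runs on its own. With a real download you would skip them.

```matlab
% Create a small stand-in data file (hypothetical name and values).
A = [1 4; 2 5; 3 6];
save('mydata.dat', 'A', '-ascii');

clear; clc;                  % step 5 (add clf if a figure window is open)

% Step 6. The functional form returns the matrix directly; the command
% form "load mydata.dat" would instead create a variable named mydata.
data = load('mydata.dat');

% Step 8: the data here is stored as columns.
x = data(:, 1);
y = data(:, 2);
% If each variable were stored as a row, transpose with a prime instead:
%   x = data(1, :)';   y = data(2, :)';

% Step 9: one column per group.
M = [x, y];
disp(M);
```

Naming the loaded matrix explicitly (data) avoids depending on the variable name MATLAB derives from the file name.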

Overview

Calling the ANOVA Function

1. Type: p = anova1(M). This performs a one way ANOVA on the matrix M, which was created in step 9 of the above section.
2. Run the code. The output should be an ANOVA table and a box plot of the columns of M.
3. The ANOVA table has six columns:
a. The source of the variability (Source column)
b. The sum of squares due to each source (SS column)
c. The degrees of freedom associated with each source (df column)
d. The mean squares of each source, SS/df (MS column)
e. The F-statistic, the ratio of the mean squares (F column)
f. The p-value (Prob>F column)

The box plot of the columns of the matrix M gives a visual indication of the size of the F-statistic and the p-value. Large differences in the center lines of the boxes correspond to large values of the F-statistic and small p-values.
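To make the relationship between the table columns concrete, the sketch below recomputes the one way ANOVA quantities by hand. The matrix is made up for illustration (it is not from the source), and the code assumes a balanced design with one group per column, which is how anova1 treats a matrix input. Like anova1, fcdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Illustrative data only: 4 observations (rows) in each of 3 groups (columns).
M = [10 12 15;
     11 14 16;
      9 13 17;
     10 13 16];

[n, k] = size(M);                 % n observations per group, k groups
grandMean  = mean(M(:));
groupMeans = mean(M, 1);

% SS column: between-group and within-group sums of squares.
SSb = n * sum((groupMeans - grandMean).^2);
SSw = sum(sum((M - repmat(groupMeans, n, 1)).^2));

% df column.
dfb = k - 1;
dfw = k * (n - 1);

% MS column is SS/df; F is the ratio of the two mean squares.
MSb = SSb / dfb;
MSw = SSw / dfw;
F   = MSb / MSw;                  % 54 for this data (SSb = 72, SSw = 6)

% p-value: upper tail of the F distribution with (dfb, dfw) degrees of freedom.
p = 1 - fcdf(F, dfb, dfw);

fprintf('SSb = %g, SSw = %g, F = %g, p = %.3g\n', SSb, SSw, F, p);
```

Running p = anova1(M) on the same matrix should report the same F and p in its table, which is a quick way to check your reading of the columns.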

Example 1

(Taken from http://www.mathworks.com/help/stats/anova.html) The data below comes from a study by Hogg and Ledolter of bacteria counts in shipments of milk. The columns of the matrix hogg represent different shipments. The rows are bacteria counts from cartons of milk chosen randomly from each shipment. Do some shipments have higher counts than others?

Create the following matrix:

hogg =
    24    14    11     7    19
    15     7     9     7    24
    21    12     7     4    19
    27    17    13     7    15
    33    14    12    12    10
    23    16    18    18    20

Find the p-value for this data set and also produce an ANOVA table and a box plot representing the data.

p = anova1(hogg)

p = 1.1971e-04

A p-value this small indicates that the mean bacteria counts are not the same for all shipments.
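Example 1 can be run as a complete script along the following lines. The second output and the 'off' display option are not part of the example as given; they are added here as one possible way to read the F-statistic out of the table without opening the figures.

```matlab
% Bacteria counts: one column per shipment, one row per carton.
hogg = [24 14 11  7 19;
        15  7  9  7 24;
        21 12  7  4 19;
        27 17 13  7 15;
        33 14 12 12 10;
        23 16 18 18 20];

% 'off' suppresses the ANOVA table and box plot figures;
% call anova1(hogg) with no extra arguments to display them.
[p, tbl] = anova1(hogg, [], 'off');

% tbl is a cell array whose first row holds the column headers.
F = tbl{2, 5};                    % F-statistic for the between-shipment source
fprintf('F = %.4f, p = %.4e\n', F, p);
```

The printed p should agree with the value 1.1971e-04 reported in the example.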

Two Way ANOVA

Assumptions for Two Way ANOVA

1. All sample populations are normally distributed.
2. The samples are independent.
3. The variances of the populations are equal.
4. The groups have the same sample size.

Loading Data

1. Download the necessary file(s). You do this by right-clicking the file and choosing the save option, or by double-clicking it.
2. Open the folder for your directory (it should include your netid) and save the pertinent files into your directory folder. This will allow you to access those files from MATLAB.
3. Open MATLAB, either remotely or by launching it directly if you have it on your computer.
4. Open a new editor window.
5. Type the clear commands: clear, clf, and clc. (clear all also removes the variables, but it does not clear the figure or the command window.)
6. Type load filename.dat (do this for all the files from steps 1 and 2).
7. Run the code using the run button on the toolbar, and check the MATLAB command window to see whether the data is given as rows or columns.
8. If the data is given as columns, separate the matrix into column vectors by typing this into the editor window: x = filename(:,1); y = filename(:,2); (x and y can be whatever you want the first and second variables to be called). If the data is given as rows, use filename(1,:) and filename(2,:) instead, and remember to add a prime (') to the end of each expression to change the rows into columns.
9. Create a matrix M whose columns are the vectors created in step 8. Your code should look like this: M = [x, y].

Overview

Calling the ANOVA Function

1. Type: p = anova2(M). This performs a two way ANOVA on the matrix M, which was created in step 9 of the above section.
2. Run the code. The output should be an ANOVA table.
3. The ANOVA table has six columns:
a. The source of the variability (Source column)
b. The sum of squares due to each source (SS column)
c. The degrees of freedom associated with each source (df column)
d. The mean squares of each source, SS/df (MS column)
e. The F-statistic, the ratio of the mean squares (F column)
f. The p-value (Prob>F column)
4. Unlike the One Way ANOVA table, this table has four source rows instead of two, the two additional ones being Rows and Interaction. The Interaction row appears only when each group contains more than one observation, which you specify with a second argument, as in Example 2.
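The sketch below shows one way to lay out a matrix for anova2 when each group has several observations. The values are invented for illustration, and the variable names (reps, pColumns, and so on) are not from the text. The 'off' option only suppresses the table figure.

```matlab
% Illustrative values only. Columns are the levels of one factor; each
% block of `reps` consecutive rows is one level of the other factor.
reps = 2;
M = [5.1 4.0 3.2;     % row-factor level 1, observation 1
     5.4 4.3 3.0;     % row-factor level 1, observation 2
     6.2 5.1 4.1;     % row-factor level 2, observation 1
     6.0 5.3 4.4];    % row-factor level 2, observation 2

[p, tbl] = anova2(M, reps, 'off');

% With reps > 1 the p-value vector has three entries.
pColumns     = p(1);
pRows        = p(2);
pInteraction = p(3);

disp(tbl(:, 1));      % the Source column, including Rows and Interaction
```

The number of rows of M must be a multiple of reps, which is why the equal sample size assumption matters for this function.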

Example 2

Given the following table, prepare an ANOVA table using MATLAB:

First, input or load these values into MATLAB as a matrix in which the columns, from left to right, are Gourmet, Nat'l Brand, and Generic. The first three rows represent the oil popper and the last three rows represent the air popper.

Running the code p = anova2(popcorn,3) performs a two way ANOVA on the data popcorn, where the integer 3 is the number of data points in each group. So, in the popcorn data set, there are 3 data values in Oil/Gourmet, 3 in Oil/Nat'l Brand, and so on.

This code generates the ANOVA table, in which you can find all the information you need for ANOVA tests. The vector p holds the p-values for the three brands of popcorn (0.0000), the two popper types (0.0001), and the interaction between brand and popper type (0.7462). These values indicate that both popcorn brand and popper type affect the yield of popcorn, but there is no evidence of a synergistic (interaction) effect of the two.
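Example 2 can be sketched as a script as follows. It assumes that the sample data set popcorn shipped with the Statistics and Machine Learning Toolbox holds the same values as the table in this example; if it does not, enter the 6-by-3 matrix by hand instead. The 0.05 significance level is also an assumption, chosen only to illustrate reading the result.

```matlab
% Assumption: the toolbox sample data matches the table in this example.
load popcorn                      % creates the 6-by-3 matrix popcorn

% 3 observations per brand/popper combination; 'off' hides the table figure.
[p, tbl] = anova2(popcorn, 3, 'off');

fprintf('Brand (columns):      p = %.4f\n', p(1));
fprintf('Popper type (rows):   p = %.4f\n', p(2));
fprintf('Brand x popper type:  p = %.4f\n', p(3));

% Compare each p-value with an assumed significance level.
alpha = 0.05;
isSignificant = p < alpha;        % logical vector: columns, rows, interaction
disp(isSignificant);
```

If the data matches, the printed values should agree with the 0.0000, 0.0001, and 0.7462 quoted above, so only the first two effects fall below the assumed level.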