Organizing and Graphing Data

Chapter Organizing and Graphing Data 2 Excel’s row/column layout is a natural fit for organizing most data. It is also useful for creating very qui...
Author: Linda May
2 downloads 0 Views 775KB Size
Chapter

Organizing and Graphing Data

2

Excel’s row/column layout is a natural fit for organizing most data. It is also useful for creating very quick and simple charts. Take care to ensure that the charts are appropriately labeled, titled and do not mislead. The ways to make some charts—such as the histogram—are not inherently obvious. Excel does not provide options for making the dotplot and the boxplot, for example, but you can create them with some ingenuity. In this chapter, we will cover some of the most basic Excel functions and charts and how they can be utilized to organize data. This chapter includes an introduction to the Data Analysis ToolPak, which is a free Excel Add-in. In order to use the Data Analysis ToolPak, you’ll need to add it to your Data tab. To do so, select the Microsoft button then choose Excel Options then Add-ins and then Analysis Toolpak and click OK. Check the Analysis Toolpak and click OK. This may disappear at times, if this happens you have to go back through the process of adding it. To do add it again, go through the process as if you were going to add it, unselect Analysis Toolpak first, click OK, then you can add it in again.

Organizing and Graphing Qualitative Data Excel contains an autocomplete feature that makes entering qualitative data into a spreadsheet easier. As you type data into a column, if you have typed an entry before, Excel will give you the option to select that item after you key the first letter(s). Example 2-1 on page 30 contains data on responses of people who were asked what variety of donut is their favorite. Respondents chose from five possible responses: Glazed, Frosted, Filled, Plain, and Other. Begin entering the information into a new spreadsheet. In A1, add the label: “Donut Variety” Then begin entering the data in column A, starting in A2. As you are entering the data, you will notice that if an item has been entered before, the next time you begin typing it in, Excel autocompletes the word as shown in the illustration below. If you want to accept the suggested autocompletion, simply press the key. Otherwise, continue to type.

Chapter 2/ Introductory Statistics - Mann

2

Excel Manual

Once you’ve entered all possibilities, entry becomes as simple as hitting the first letter or letters of the desired entry and hitting . Once the data is entered, you can use Excel to tally the responses using the COUNTIF function. =COUNTIF(range,criteria) counts the number of cells that equal the criteria. Enter the following in the spreadsheet: C1: “Response” D1: “Tally” C2: “Glazed” C3: “Frosted” C4: “Filled” C5: “Plain” C6: “Other” D2: =COUNTIF(A:A,C2) This formula will count the number of times the response “Glazed” (which is in C2) occurs in the range A:A (or column A.) Autofill the formula from D2 to D6 to count the other four values. The sheet is shown below:

The results should appear as such:

Chapter 2/ Introductory Statistics - Mann

3

Excel Manual

Now that you have an area that shows the tallies, you can create a pie chart from the data. Select cells C1:D6, then select the 2-D Pie chart from the Charts group in the Insert tab.

Enter an appropriate title by clicking on the title in the chart, then select the existing text and type the desired title. The completed pie chart is shown below:

Chapter 2/ Introductory Statistics - Mann

4

Excel Manual

Once created, the pie chart can easily be modified by using the options within the Chart Tools tab. This lets you adjust the colors and labels, add titles, etc.

Organizing and Graphing Quantitative Data Example 2-3 on page 39 of the text uses data on the total number of iPods® sold by a mail-order company on each of 30 days. The following instructions show how to set up the frequency charts for creating a histogram. Open a new Excel worksheet and enter this data with an appropriate label in column A. Once the data is in, add the following: C1: “Statistic” D1: “Value” C2: “n” C3: “Min” C4: “Max” C5: “Number of Classes” D2: =COUNT(A:A) This formula counts the number of values in column A D3: =MIN(A:A) This determines the minimum value for column A. D4: =MAX(A:A) This determines the maximum value in column A. D5: 5 This is the number of classes desired for the histogram.

Chapter 2/ Introductory Statistics - Mann

5

Excel Manual

D6: =ROUND((D4-D3)/D5,0) This calculates the class width for minimum, maximum and number of classes and rounds it to the nearest integer. Formulas and results are shown below:

To create the area for the histogram, enter the following values: F1: “Class” G1: “Frequency” F2: Enter a value greater than the minimum but less than the minimum + class width to serve as a convenient start of the histogram. In this example, we’ll use the number 9 to coincide with the text example. (Note: This number may be changed after the histogram is made; if any alterations are made, the histogram will automatically adjust to the new value.) F3: =F2+$D$6 In this example, D6 is the class width calculated in cell D6. As a review, “$” is an absolute reference which keeps the letters from changing when the cells are copied. Autofill the formula from F3 down until you see a number at least as large as the maximum value. In this case, the maximum is 29, which falls in cell F6. =FREQUENCY(Data Array, Bin Array) This counts the number of data items in the data array that are less than or equal to the value in the corresponding Bins Array (class) but greater than the previous corresponding Bin Array value. Important: FREQUENCY() is an array function, which essentially means that a group of cells work together to calculate a group of related values. Array functions require you to select a group of cells, key the formula, and press . The formulas in the group of cells will be all the same with brackets ({})around them. The brackets will only appear in the data entry bar not in the formulas in the cells. Select cells G2:G8: =FREQUENCY(A:A,F:F) The results are shown below:

Chapter 2/ Introductory Statistics - Mann

6

Excel Manual

Now that we have the classes and the frequencies, we can create a histogram.

Chapter 2/ Introductory Statistics - Mann

7

Excel Manual

Select cells F1:G6, then select the Column Chart from the Chart group in the Insert ribbon, as illustrated below:

Select Layout 8 from the Quick Layouts group in the Design ribbon.

Excel may try to treat the first column as a column of data instead of labels, so you must manually enter labels for the chart to use to designate each class, otherwise the class label will be incorrect. Enter the desired class labels in column H as shown in the illustration below.

Chapter 2/ Introductory Statistics - Mann

8

Excel Manual

Note: When entering 5-9, Excel will assume it is a date. To prevent this, enter an apostrophe (‘) before entering the class, e.g. ‘5-9. Now, click on Select Data of the Data Group in the Design tab. Select the Class Widths from the left Legend Entries (Series) window and select Remove. Select Edit and select the bins you entered in G2:G13. The dialog box should look like the following illustration:

Formatting the Histogram Properly To format the histogram’s bars, right-click on any bar in the chart, select Format Data Series and change as desired (you might add borders to the bars, etc.). Click on the Labels and enter appropriate titles.

If you would like gaps in your histogram on either side of the histogram (i.e. where 0-4 and 3034 would be), you can include these in your frequency column. These will have a frequency of 0, thus providing the extra space outside the chart area.

Creating a Frequency Histogram Using the Data Analysis ToolPak These instructions show how to use the Data Analysis ToolPak to create a histogram. For this example, we will use the same values as the previous graph. Follow the previous example to create the class column. Chapter 2/ Introductory Statistics - Mann

9

Excel Manual

Select Data Analysis in the Data ribbon and choose Histogram in the dialog box.

Select the Input Range, making sure you do not select any text or non-numeric cells other than the label at the top. Select the class range that was created in column F for Bins, again ensuring that no text is selected other than the label at the top. Select the Labels option and the Chart Output option as shown in the following illustration:

The following table and chart have been created on a new sheet:

Histogram formatting and proper labeling is described in the previous example.

Chapter 2/ Introductory Statistics - Mann

10 Excel Manual More on Frequency Distributions, Relative Frequencies The following example will demonstrate how to create a frequency chart that counts the number of values, relative frequency and percentage distributions for each range of values. In columns A and B, enter the data from Example 2-5 on page 44 in your text, which lists the percentage of the population working in 2010. Also, enter the labels as shown in the illustration below:

Enter the following: D2: 1.5 D3: 2.0 Format these two cells as Currency using the drop-down list in the Number group of the Home tab. Select D2:D3 and autofill until $4.00 is entered, since there is no tax larger than $4.00 in the table. (This illustrates how an arithmetic sequence may be automatically entered by entering the first values in the sequence, selecting them and autofilling until the sequence is complete.)

Use the frequency formula as in the previous examples, select cells E2:E8 and enter the following formula: =FREQUENCY(B:B,D:D) E9: =SUM(E2:E8) This will calculate the total frequency. F2: =E2/$E$9 Format this cell with two decimal places. Autofill this formula to F9.

Chapter 2/ Introductory Statistics - Mann

11 Excel Manual G2: =F2 Format this cell as a percentage and autofill it to G9.

The final results are as follows:

Cumulative Frequency Distribution To calculate a cumulative frequency the SUM() function is used, with a relative reference for the beginning of the sum range and an absolute reference for the end of the range. This way, the formula will always include all the cells above the row the formula is in. Open the spreadsheet used in Example 2-3 on iPod® sales and add the following: I1: “Cumulative Frequency” I2: =SUM($G$2:G2) Autofill this formula down to the remainder of the table. As a review, $G$2 is an absolute reference which keeps the letters from changing when the cells are copied. G2 is a relative reference, allowing the reference to change as the formula is filled. The effect of this is that as you move down the column, the range to be summed increases as to include the next cell. The cumulative percentage is then calculated just as the relative frequency was earlier, as shown in the illustration:

Chapter 2/ Introductory Statistics - Mann

12 Excel Manual The results are as follows:

Now, you can use this information to make any chart using the cumulative frequency or cumulative percentage instead of the actual frequency.

Chapter 2/ Introductory Statistics - Mann