INTRODUCTION TO SPSS

OFID Workshop, March 2008
Adil Yousif, Ph.D.

Types of Variables
Quantities are classified into two categories: constants and variables. There are two types of variables:
• Qualitative variables, which are either nominal or ordinal (string).
• Quantitative variables, which are numerical (numeric).

Variable Names
The following rules apply to variable names:
• Each variable name must be unique; duplication is not allowed.
• Variable names can be up to 64 bytes long, and the first character must be a letter or one of the characters @, #, or $. Subsequent characters can be any combination of letters, numbers, nonpunctuation characters, and a period (.). In code page mode, 64 bytes typically means 64 characters.
• Variable names cannot contain spaces.
• The period, the underscore, and the characters $, #, and @ can be used within variable names. For example, A._$@#1 is a valid variable name.
• Variable names ending with a period should be avoided, since the period may be interpreted as a command terminator.
• Variable names ending in underscores should be avoided, since such names may conflict with names of variables automatically created by commands and procedures.
• Reserved keywords cannot be used as variable names. The reserved keywords are ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH.

• Variable names can be defined with any mixture of uppercase and lowercase characters, and case is preserved for display purposes.

Data Editor
The Data Editor provides a convenient, spreadsheet-like method for creating and editing data files. The Data Editor window opens automatically when you start a session. The Data Editor provides two views of your data:
• Data View. This view displays the actual data values or defined value labels.
• Variable View. This view displays variable definition information, including defined variable and value labels, data type (for example, string, date, or numeric), measurement level (nominal, ordinal, or scale), and user-defined missing values.
In both views, you can add, change, and delete information that is contained in the data file.

Entering Data
In Data View, you can enter data directly in the Data Editor, in any order: by case or by variable, for selected areas or for individual cells.
• The active cell is highlighted.
• The variable name and row number of the active cell are displayed in the top left corner of the Data Editor.
• When you select a cell and enter a data value, the value is displayed in the cell editor at the top of the Data Editor.
• Data values are not recorded until you press Enter or select another cell.
• To enter anything other than simple numeric data, you must define the variable type first.
• If you enter a value in an empty column, the Data Editor automatically creates a new variable and assigns a variable name.

Descriptives
The Descriptives procedure displays univariate summary statistics for several variables in a single table and calculates standardized values (z scores). Variables can be ordered by the size of their means (in ascending or descending order), alphabetically, or by the order in which you select the variables (the default). The statistics calculated by the Descriptives procedure are: sample size, mean, minimum, maximum, standard deviation, variance, range, sum, standard error of the mean, and kurtosis and skewness with their standard errors.

To obtain Descriptive Statistics

From the menus choose: Analyze > Descriptive Statistics > Descriptives...

Select one or more variables. Optionally, you can:

• Select Save standardized values as variables to save z scores as new variables.

• Click Options for optional statistics and display order.
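The same analysis can be run from a syntax window. A minimal syntax sketch, assuming a hypothetical numeric variable named score:

* Descriptive statistics for the hypothetical variable score.
* /SAVE stores z scores as a new variable (Zscore by default).
DESCRIPTIVES VARIABLES=score
  /SAVE
  /STATISTICS=MEAN SEMEAN STDDEV VARIANCE RANGE MIN MAX SUM KURTOSIS SKEWNESS.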

One-Sample Z-Test
For a one-sample Z-test:
• Get the sample mean from the Descriptives procedure above.
• From the menus choose: Transform > Compute Variable...
• Use the given values of the population mean and standard deviation to calculate the Z value, Z = (sample mean − μ) / (σ / √n).
(Note that the two-tailed critical values of Z are 1.65 for α = 0.10, 1.96 for α = 0.05, and 2.58 for α = 0.01.)
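In the Compute Variable dialog (or in syntax), the statistic can be computed directly. A minimal sketch, assuming hypothetical values: sample mean 52, population mean μ = 50, population standard deviation σ = 8, and sample size n = 100:

* One-sample Z statistic under the hypothetical values above.
* Z = (52 - 50) / (8 / SQRT(100)) = 2.5, which exceeds the critical
* value 1.96, so the null hypothesis is rejected at alpha = 0.05.
COMPUTE zvalue = (52 - 50) / (8 / SQRT(100)).
EXECUTE.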

Histograms
The histogram shows the distribution of a single numeric variable.

To obtain a histogram:

From the menus choose: Graphs > Legacy Dialogs > Histogram...

Select a numeric variable for Variable in the Histogram dialog. Optionally, you can:

• Select Display normal curve to superimpose a normal curve on the histogram.
• To panel the chart, move one or more categorical variables into the Rows or Columns lists under Panel by.
• Select Titles to define lines of text to be placed at the top or bottom of the plot.
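A minimal syntax sketch, assuming the same hypothetical numeric variable score:

* Histogram of the hypothetical variable score, with a normal curve overlaid.
GRAPH /HISTOGRAM(NORMAL)=score.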

Boxplots
The Boxplot dialog allows you to make selections that determine the type of chart you obtain. Boxplots show the median, interquartile range, outliers, and extreme cases of individual variables.

To obtain simple and clustered boxplots:

From the menus choose: Graphs > Legacy Dialogs > Boxplot...

In the initial Boxplot dialog box, select the icon for Simple or Clustered.

Select an option under Data in Chart Are (summaries for groups of cases or summaries of separate variables).

Click Define.

Select variables and options for the chart.
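A minimal syntax sketch, assuming a hypothetical numeric variable score summarized within the categories of a hypothetical grouping variable group:

* Simple boxplot of score within the categories of group.
EXAMINE VARIABLES=score BY group
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL.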

Note: From the Frequencies procedure you can also obtain bar charts, pie charts, and histograms.

To obtain Frequency Tables:

From the menus choose: Analyze > Descriptive Statistics > Frequencies...

Select one or more categorical or quantitative variables.
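A minimal syntax sketch, assuming a hypothetical categorical variable group:

* Frequency table and bar chart for the hypothetical variable group.
FREQUENCIES VARIABLES=group
  /BARCHART.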

Bivariate Correlations
The Bivariate Correlations procedure computes Pearson's correlation coefficient, Spearman's rho, and Kendall's tau-b with their significance levels. Correlations measure how variables or rank orders are related. Pearson's correlation coefficient is a measure of linear association. Two variables can be perfectly related, but if the relationship is not linear, Pearson's correlation coefficient is not an appropriate statistic for measuring their association. For each variable, the statistics displayed are: number of cases with nonmissing values, mean, and standard deviation. For each pair of variables, the statistics are: Pearson's correlation coefficient, Spearman's rho, Kendall's tau-b, cross-product of deviations, and covariance.

Scatterplots
The Scatter/Dot dialog allows you to specify the type of scatterplot you want.

To obtain scatterplots:

From the menus choose: Graphs > Legacy Dialogs > Scatter/Dot...

In the Scatter/Dot dialog, select the icon for simple, overlay, matrix, 3-D, or simple dot plot.

Select Define.

Select variables and options for the chart.
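A minimal syntax sketch, assuming hypothetical numeric variables x and y:

* Simple scatterplot of y against x.
GRAPH /SCATTERPLOT(BIVAR)=x WITH y.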

To obtain Bivariate Correlations:

From the menus choose: Analyze > Correlate > Bivariate...

Select two or more numeric variables.

Note: Before calculating a correlation coefficient, screen your data for outliers (which can cause misleading results) and evidence of a linear relationship.
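A minimal syntax sketch, assuming the same hypothetical variables x and y:

* Pearson correlation for x and y.
CORRELATIONS /VARIABLES=x y /PRINT=TWOTAIL NOSIG.
* Spearman's rho and Kendall's tau-b for the same pair.
NONPAR CORR /VARIABLES=x y /PRINT=BOTH TWOTAIL.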

Linear Regression
Linear Regression estimates the coefficients of the linear equation, involving one or more independent variables, that best predict the value of the dependent variable. The statistics calculated for each variable are: number of valid cases, mean, and standard deviation. For each model, the statistics shown are: regression coefficients, correlation matrix, part and partial correlations, multiple R, R², adjusted R², change in R², standard error of the estimate, analysis-of-variance table, predicted values, and residuals. Also available are 95% confidence intervals for each regression coefficient, the variance-covariance matrix, variance inflation factor, tolerance, the Durbin-Watson test, distance measures (Mahalanobis, Cook, and leverage values), DfBeta, DfFit, prediction intervals, and casewise diagnostics. Plots: scatterplots, partial plots, histograms, and normal probability plots.

To obtain a Linear Regression Analysis

From the menus choose: Analyze > Regression > Linear...

In the Linear Regression dialog box, select a numeric dependent variable.

Select one or more numeric independent variables. Optionally, you can:

• Group independent variables into blocks and specify different entry methods for different subsets of variables.

• Choose a selection variable to limit the analysis to the subset of cases having a particular value (or values) for this variable.

• Select a case identification variable for identifying points on plots.

• Select a numeric WLS Weight variable for a weighted least squares analysis.
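A minimal syntax sketch, assuming a hypothetical dependent variable y and hypothetical predictors x1 and x2:

* Linear regression of y on x1 and x2, entered in a single block,
* with 95% confidence intervals for the regression coefficients.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CI(95)
  /DEPENDENT y
  /METHOD=ENTER x1 x2.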

One-Sample T-Test
The One-Sample T-Test procedure tests whether the mean of a single variable differs from a specified constant. For each test variable, the statistics calculated are: mean, standard deviation, and standard error of the mean. The procedure also reports the average difference between each data value and the hypothesized test value, a t test of whether this difference is 0, and a confidence interval for the difference (you can specify the confidence level).

To obtain a One-Sample T-Test

From the menus choose: Analyze > Compare Means > One-Sample T-Test...

Select one or more variables to be tested against the same hypothesized value.

Enter a numeric test value against which each sample mean is compared.

Optionally, you can click Options to control the treatment of missing data and the level of the confidence interval.
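A minimal syntax sketch, assuming a hypothetical variable score tested against a hypothesized mean of 50:

* One-sample t test of score against the test value 50,
* with a 95% confidence interval for the mean difference.
T-TEST
  /TESTVAL=50
  /VARIABLES=score
  /CRITERIA=CI(.95).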

Paired-Samples T-Test
The Paired-Samples T-Test procedure compares the means of two variables for a single group. It computes the differences between the values of the two variables for each case and tests whether the average difference differs from 0. The statistics calculated for each variable are: mean, sample size, standard deviation, and standard error of the mean. For each pair of variables, the statistics are: correlation, average difference in means, t test, confidence interval for the mean difference (you can specify the confidence level), and the standard deviation and standard error of the mean difference.

To obtain a Paired-Samples T-Test

From the menus choose: Analyze > Compare Means > Paired-Samples T-Test...

Select a pair of variables. Optionally, you can click Options to control the treatment of missing data and the level of the confidence interval.
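A minimal syntax sketch, assuming hypothetical before-and-after variables pre and post:

* Paired-samples t test of pre against post.
T-TEST PAIRS=pre WITH post (PAIRED).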

Independent-Samples T-Test
The Independent-Samples T-Test procedure compares the means for two groups of cases; ideally, subjects should be randomly assigned to the two groups. The statistics calculated for each variable are: sample size, mean, standard deviation, and standard error of the mean. For the difference in means, the statistics are: mean, standard error, and confidence interval (you can specify the confidence level). The inference tests are: Levene's test for equality of variances, and both pooled- and separate-variances t tests for equality of means.

To obtain an Independent-Samples T-Test

From the menus choose: Analyze > Compare Means > Independent-Samples T-Test...

Select one or more quantitative test variables. A separate t test is computed for each variable.

Select a single grouping variable, and click Define Groups to specify two codes for the groups that you want to compare. Optionally, you can click Options to control the treatment of missing data and the level of the confidence interval.
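A minimal syntax sketch, assuming a hypothetical test variable score and a hypothetical grouping variable group coded 1 and 2:

* Independent-samples t test of score between groups 1 and 2.
T-TEST GROUPS=group(1 2)
  /VARIABLES=score.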

Chi-Square Test
The Chi-Square Test procedure tabulates a variable into categories and computes a chi-square statistic. This goodness-of-fit test compares the observed and expected frequencies in each category to test either that all categories contain the same proportion of values or that each category contains a user-specified proportion of values. Statistics: mean, standard deviation, minimum, maximum, and quartiles; the number and percentage of nonmissing and missing cases; the number of cases observed and expected for each category; residuals; and the chi-square statistic.

To Obtain a Chi-Square Test

From the menus choose: Analyze > Nonparametric Tests > Chi-Square...

Select one or more test variables. Each variable produces a separate test. Optionally, you can click Options for descriptive statistics, quartiles, and control of the treatment of missing data.
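A minimal syntax sketch, assuming a hypothetical categorical variable group and equal expected proportions:

* Chi-square goodness-of-fit test for group, with equal
* expected frequencies in every category.
NPAR TESTS /CHISQUARE=group
  /EXPECTED=EQUAL.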

One-Way ANOVA The One-Way ANOVA procedure produces a one-way analysis of variance for a quantitative dependent variable by a single factor (independent) variable. Analysis of variance is used to test the hypothesis that several means are equal. This technique is an extension of the two-sample t test.

In addition to determining that differences exist among the means, you may want to know which means differ. There are two types of tests for comparing means: a priori contrasts and post hoc tests. Contrasts are tests set up before running the experiment, and post hoc tests are run after the experiment has been conducted. You can also test for trends across categories.

Statistics: For each group: number of cases, mean, standard deviation, standard error of the mean, minimum, maximum, and 95% confidence interval for the mean. Also available are Levene's test for homogeneity of variance, the analysis-of-variance table and robust tests of the equality of means for each dependent variable, user-specified a priori contrasts, and post hoc range tests and multiple comparisons: Bonferroni, Sidak, Tukey's honestly significant difference, Hochberg's GT2, Gabriel, Dunnett, Ryan-Einot-Gabriel-Welsch F test (R-E-G-W F), Ryan-Einot-Gabriel-Welsch range test (R-E-G-W Q), Tamhane's T2, Dunnett's T3, Games-Howell, Dunnett's C, Duncan's multiple range test, Student-Newman-Keuls (S-N-K), Tukey's b, Waller-Duncan, Scheffé, and least-significant difference.

To Obtain a One-Way Analysis of Variance

From the menus choose: Analyze > Compare Means > One-Way ANOVA...

Select one or more dependent variables.

Select a single independent factor variable.
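A minimal syntax sketch, assuming a hypothetical dependent variable score and a hypothetical factor variable group:

* One-way ANOVA of score by group, with descriptive statistics,
* Levene's test, and Tukey's HSD post hoc comparisons.
ONEWAY score BY group
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /POSTHOC=TUKEY ALPHA(0.05).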
