MSc Business Administration

Research Methodology: Tools
Applied Data Analysis (with SPSS)
Lecture 01: Introduction to SPSS

February 2014
Prof. Dr. Jürg Schwarz
Lic. phil. Heidi Bruderer Enzler

Slide 2

Contents

◦ Resources I (slide 3)
◦ First Steps (slide 4)
◦ Design of the Data Editor and Datasets (slide 7)
◦ Running Analyses with SPSS (slide 9)
◦ Special Issue: Using the Syntax Editor (slide 14)
◦ Modifying Data Values (slide 21)
◦ Select Cases & Split File (slide 25)
◦ Data Entry (slide 29)
◦ Resources II (slide 40)

Slide 3

Resources I
Manuals
This introduction is based on the manual IBM SPSS Statistics 21 Brief Guide. This manual and the IBM SPSS Statistics 21 Core System User's Guide are available here:
◦ Ilias
◦ http://www-01.ibm.com/support/docview.wss?uid=swg27024972

Slide 4

First Steps
Change the Application Language
The language can be selected on the "General" tab under Edit > Options.

Slide 5

Sample Dataset "demo.sav"
This introduction is based on the demo.sav dataset. It is a fictional survey that includes basic demographic data and consumer information of several thousand persons (n = 6400).

Name       Label
age        Age in years
marital    Marital status
address    Years at current address
income     Household income in thousands
inccat     Income category in thousands
car        Price of primary vehicle
carcat     Primary vehicle price category
ed         Level of education
employ     Years with current employer
retire     Retired
empcat     Years with current employer
jobsat     Job satisfaction
gender     Gender
reside     Number of people in household
wireless   Wireless service
multline   Multiple lines
voice      Voice mail
pager      Paging service
internet   Internet
callid     Caller ID
callwait   Call waiting
owntv      Owns TV
ownvcr     Owns VCR
owncd      Owns stereo/CD player
ownpda     Owns PDA
ownpc      Owns computer
…          …

Slide 6

Opening the Data File "demo.sav"
In preparation for today, you have saved the dataset "demo.sav" locally. You can find it in one of the following locations:
◦ On your laptop: C:\…\Programs\IBM\SPSS\Statistics\21\Samples\
◦ On Ilias: … » Lectures » Lecture 01 Introduction » Data Resources

To open this dataset, do either of the following:
◦ Double-click the SPSS data file (the dataset "demo.sav")
◦ Use the menu File > Open > Data…

Slide 7

Design of the Data Editor and Datasets
The Two Parts of the Data Editor
The Data Editor shows the contents of the current data file.

Data View: Columns represent variables, rows represent cases.

Variable View: Each row is a variable, each column is an attribute of the variable.

Slide 8

Structure of an SPSS Dataset
In the Data View, SPSS data are organized according to cases (rows) and variables (columns).

Cases (Rows)
In a survey of individuals, each row represents a respondent. In a scientific experiment, each row usually corresponds to a measurement.

Variables (Columns)
Each column of the Data Editor corresponds to a particular attribute of the cases. In many areas of research, these attributes are called variables.

Slide 9

Running Analyses with SPSS
Running an Analysis
The "Analyze" menu contains the different methods of analysis. For example, to request a simple frequency table with a histogram: Analyze > Descriptive Statistics > Frequencies…

Slide 10

Intermezzo: Alphabetical View of the Variables in the Dialog Boxes
By default, SPSS dialog boxes display variables with their labels. This can make it difficult to find a particular variable.

Slide 11

SPSS can be adjusted so that variables are displayed with their names and in alphabetical order. To do so, select "Display names" and "Alphabetical" in the "Variable Lists" group on the General tab under Edit > Options.

Slide 12

Variables are displayed alphabetically by their names.

Place the cursor in the box that contains the variables, and enter a character from the keyboard. The first variable beginning with this character will appear. This allows you to quickly search through the variable box to find a variable.

Slide 13

Viewing Results: SPSS Output
The output includes the syntax of the command and its results (frequency table, histogram).





Syntax is the internal "language" of SPSS.
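
As an illustration, the command echoed in the log for a frequency table with a histogram would look roughly like the sketch below. The choice of the variable age is an assumption made for this example, and the exact subcommands depend on the options selected in the dialog box.

```spss
* Frequency table and histogram for the variable age (assumed for illustration).
FREQUENCIES VARIABLES=age
  /HISTOGRAM
  /ORDER=ANALYSIS.
```

Every command ends with a period, and subcommands start with a slash.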

Slide 14

Special Issue: Using the Syntax Editor
Structure of SPSS
◦ Data Editor: *.sav files
◦ Output: *.spv files
◦ Syntax Editor: *.sps files

Slide 15

Working with Syntax
Open a new syntax file through the menu: File > New > Syntax


Slide 16

How do you get the command syntax?
Option I: Perform an analysis through the menu.
Example: Analyze > Descriptive Statistics > Frequencies…


Slide 17

Where is the syntax for this analysis?
=> The syntax is displayed in the output. Double-click the syntax part in the log, then highlight and copy the syntax.

Paste the syntax into the Syntax Editor.

Slide 18

Option II: Paste the syntax directly from the dialog box ("Paste" button).

Option III: Write the syntax yourself.

Executing the Syntax
Place the cursor inside the command in the Syntax Editor (or highlight several commands) and run the analysis through the menu Run > Selection.

Slide 19

Typical Syntax File

Why should you use syntax?
◦ Rapidly leads to greater efficiency
◦ Documentation
◦ Reproducing the results
◦ Automatically process many commands
◦ Allows access to all commands
◦ Communication with other persons
◦ Opens the world of macros
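
A minimal sketch of what a small, documented syntax file might contain is given below. The file paths and the name demo_work.sav are hypothetical, and the selection of statistics is only an assumption for illustration, not the file shown in the lecture.

```spss
* Lecture 01, first steps with the demo dataset.
* The file path below is hypothetical and must be adjusted to your own folder.
GET FILE='C:\data\demo.sav'.

* Overview of age and household income without the full frequency tables.
FREQUENCIES VARIABLES=age income
  /FORMAT=NOTABLE
  /STATISTICS=MEAN MEDIAN STDDEV
  /HISTOGRAM.

* Save a working copy under a new (hypothetical) name.
SAVE OUTFILE='C:\data\demo_work.sav'.
```

Comment lines start with an asterisk and end with a period. They document what each step does, which supports the documentation and reproducibility points above.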

Slide 20

What if the Syntax is not Displayed in the Output?
Through the menu Edit > Options… > Viewer, choose "Display commands in the log".

The syntax is now displayed in the output.

Slide 21

Modifying Data Values
The data may not always exist in a form that can be used for analysis or reporting. For example, you may want to:
◦ convert a scale variable into a categorical variable.
◦ merge different response categories into a single category.
◦ calculate a new variable from the difference between two existing variables.

Slide 22

Computing a new variable
New variables can be computed from existing ones, for example by averaging or summing scores. For example, you may want to compute the equivalised income (based on the household income and the number of persons in the household).
Menu: Transform > Compute Variable…

Syntax
COMPUTE income_equiv = income / SQRT(reside).
EXECUTE.
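
To see what the formula does, it can be tried on two made-up households. The sketch below is self-contained and uses invented values. Note that DATA LIST creates a new active dataset, so it should be run in a separate session rather than on top of demo.sav.

```spss
* Two invented households with income in thousands and household size.
DATA LIST FREE / income reside.
BEGIN DATA
72 4
50 1
END DATA.

* Same formula as on the slide: income divided by the square root of household size.
COMPUTE income_equiv = income / SQRT(reside).
EXECUTE.

* Show the raw and the computed values.
LIST VARIABLES=income reside income_equiv.
```

For the four-person household the result is 72 / 2 = 36, while for the single-person household it stays at 50, because the square root of 1 is 1.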

Slide 23

Recoding a variable
Example: creating a categorical variable from a scale variable. Based on age in years, we could build age categories.
Menu: Transform > Recode into Different Variables…

Slide 24

Syntax
RECODE age (Lowest thru 24=1) (25 thru 44=2) (45 thru 60=3) (61 thru Highest=4) INTO age_r.
FREQUENCIES VARIABLES=age age_r
  /ORDER=ANALYSIS.

Result
Scale values (age) ==> Categorical values (age_r)

Categories
1: up to 24 years
2: 25 - 44 years
3: 45 - 60 years
4: over 60 years
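
The new variable age_r initially contains only the codes 1 to 4. A possible follow-up, sketched here, attaches the category descriptions as value labels so that they appear in tables and charts. The variable label 'Age category' is an assumption made for this example.

```spss
* Label the recoded variable and its four categories.
VARIABLE LABELS age_r 'Age category'.
VALUE LABELS age_r
  1 'up to 24 years'
  2 '25 - 44 years'
  3 '45 - 60 years'
  4 'over 60 years'.
```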

Slide 25

Select Cases & Split File
Select cases
A particular subset of the data can be analyzed by selecting specific cases. The unselected cases are either filtered out temporarily or deleted permanently from the dataset. For example, you may want to analyze only respondents who are older than 45 years.
Menu: Data > Select Cases…

Slide 26

Syntax
USE ALL.
COMPUTE filter_$=(age > 45).
FILTER BY filter_$.
EXECUTE.

FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /HISTOGRAM
  /ORDER=ANALYSIS.

FILTER OFF.
USE ALL.
EXECUTE.

The last three lines remove the "filter" for all analyses to come.
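
Besides a filter variable, cases can also be selected with SELECT IF. The following sketch shows two variants under the same condition (age > 45). It is an alternative illustration, not the syntax produced by the Select Cases dialog box on the slide.

```spss
* Variant A: restrict only the next procedure to respondents older than 45.
TEMPORARY.
SELECT IF (age > 45).
FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /HISTOGRAM.

* Variant B: permanently drop all other cases from the active dataset.
* Save the result under a new file name so that the original data are kept.
SELECT IF (age > 45).
EXECUTE.
```

With TEMPORARY, the selection applies only to the procedure that follows, so no cleanup is needed. Without it, the unselected cases are removed from the active dataset.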

Slide 27

Split File
Sometimes data in different categories should be analyzed separately. To do this, the data can be split into groups, and the same analysis is then performed on each group. For example, we could split the dataset by the variable age_r, which means that separate analyses are conducted for each of the age categories.
Menu: Data > Split File…

Slide 28

Syntax
SORT CASES BY age_r.
SPLIT FILE SEPARATE BY age_r.

FREQUENCIES VARIABLES=income
  /FORMAT=NOTABLE
  /HISTOGRAM
  /ORDER=ANALYSIS.

SPLIT FILE OFF.

The last line removes the split for all analyses to come.
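
SPLIT FILE also has a LAYERED mode, which presents the groups side by side in one table instead of producing separate output for each group. The sketch below assumes that age_r has already been created by the recode shown earlier. The requested statistics are chosen only for illustration.

```spss
* The file must be sorted by the split variable first.
SORT CASES BY age_r.
SPLIT FILE LAYERED BY age_r.

* One statistics table with a row block per age category.
FREQUENCIES VARIABLES=income
  /FORMAT=NOTABLE
  /STATISTICS=MEAN MEDIAN.

SPLIT FILE OFF.
```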

Slide 29

Data Entry
Data Entry Options
There are different ways to enter data into SPSS. Data can be entered directly into SPSS or imported from many different sources:
◦ Direct: SPSS Data Editor
◦ From a spreadsheet program (such as Excel)
◦ From a database program (such as Access)
◦ From other applications (such as a text editor)
Scanners may be efficient for entering large amounts of data.

Slide 30

Data Editor: Defining Variables, Entering Data & Missing Values
Entering (new) numerical data
Open a new data file (through the menu File > New > Data).
At the bottom of the Data Editor window, switch to Variable View.
◦ Enter age in the first row of the first column ("Name").
◦ Enter marital in the second row.
◦ Enter income in the third row.
New variables are automatically assigned the "Numeric" data type.

Slide 31

Switch to the Data View in order to enter values.

To suppress the decimal places for the variables age, marital and income:
◦ At the bottom of the Data Editor window, switch to Variable View.
◦ Select the "Decimals" column and enter a 0 for age.
◦ Select the "Decimals" column and enter a 0 for marital.
◦ Select the "Decimals" column and enter a 0 for income.
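
The same display setting can be applied in syntax. The sketch below assumes that the three variables already exist in the active dataset. The width of 8 is an assumption matching the usual default for numeric variables.

```spss
* Display age, marital and income with a width of 8 and no decimal places.
FORMATS age marital income (F8.0).
```

FORMATS changes only how the values are displayed. The stored values are not rounded.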

Slide 32

Adding variable labels and value labels
Enter "Respondent's age" into the age cell of the "Label" column. Do the same for "Marital Status", and so on.
Select the "Values" cell for marital and open the dialog box.
◦ For "Value", enter 1.
◦ For "Label", enter "single".
◦ Click on "Add" so that this designation is registered.
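
These labelling steps can also be written as syntax, sketched below. The label for value 1 follows the slide. The second category (2 = 'married') is a hypothetical addition to show how further values are listed.

```spss
* Variable labels as entered in the "Label" column.
VARIABLE LABELS age "Respondent's age"
  /marital 'Marital Status'.

* Value labels for marital; value 2 is a hypothetical second category.
VALUE LABELS marital
  1 'single'
  2 'married'.
```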

Slide 33

Handling missing values
In general, missing or invalid data should not be ignored. Sometimes survey participants refuse to answer particular questions. They may not know an answer, or may respond in an unexpected way. If these data are not identified or filtered out, your analysis may not yield correct results.

Empty data cells, or cells that contain invalid input, are converted to missing values, which are displayed as a period.

Slide 34

The reason why data are missing could be important for your analysis. For example, for a particular question, it could be useful to distinguish between those who refused to answer and those for whom the question was not applicable.
In "Variable View", select the "Missing" cell for income and open the dialog box. In this dialog box you can specify either up to three discrete missing values, or a range of values plus one optional discrete value.
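
In syntax, user-defined missing values are declared with MISSING VALUES. The codes 998 and 999 in the sketch below are hypothetical and must be values that cannot occur as real incomes in your data.

```spss
* Declare two hypothetical codes as user-missing for income.
MISSING VALUES income (998, 999).

* Document why the value is missing.
VALUE LABELS income
  998 'Refused to answer'
  999 'Not applicable'.
```

Cases with these codes are then excluded from statistics such as the mean, while frequency tables still report the two reasons separately.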

Slide 35

Importing Data
Data can be imported from different sources.
◦ Reading an SPSS data file: SPSS data files have the file extension *.sav.
◦ Importing data from a spreadsheet: In addition to entering data into the Data Editor, you can import data from programs such as Microsoft Excel. The column headings serve as variable names.
◦ Importing data from a text file: Text files are common sources of data. Many spreadsheet programs and databases can save their contents in text file format. For example, in delimited text files, values are separated by commas (CSV) or tabs.
◦ Importing data from a database (not in this course): Data from a database can be imported with the help of a database wizard.

Slide 36

Importing data from a spreadsheet
Locate the Excel file demo.xls.
◦ On your computer: In the "Samples" subdirectory of the installation directory C:\…\Programs\IBM\SPSS\Statistics\21\Samples\
◦ On Ilias: … » Lectures » Lecture 01 Introduction » Data Resources

Column headings = Variable names

Slide 37

Open the Excel file through the SPSS File menu (Excel file must be closed)
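
The import can also be scripted with GET DATA, sketched below. The file path is hypothetical, and the sketch assumes that the data are on the first worksheet with the variable names in the first row.

```spss
* Read the first worksheet of an Excel file (hypothetical path).
GET DATA
  /TYPE=XLS
  /FILE='C:\data\demo.xls'
  /SHEET=INDEX 1
  /CELLRANGE=FULL
  /READNAMES=ON.
```

READNAMES=ON tells SPSS to use the column headings as variable names, as described above.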

Slide 38

Importing data from a text file
Locate the text file demo.txt.
◦ On your computer: C:\…\Programs\IBM\SPSS\Statistics\21\Samples\
◦ On Ilias: … » Data Resources
Open the text file through the SPSS File menu (text file must be closed)

Slide 39

Slide 40

Resources II
SPSS Help System (Core System User's Guide)

Help Menu (the most important entries)
◦ Topics: This allows you to access the tabs Contents, Index and Search. Use these tabs to search for particular Help topics.
◦ Tutorial: Step-by-step instructions for many basic functions.
◦ Case studies: Practical examples of how to set up different types of statistical analyses and how to interpret the results.
◦ Statistics Coach: This coach helps you to find the procedure that you would like to use. The Statistics Coach offers access to most of the procedures.
◦ Command Syntax Reference: Detailed information about command syntax is available from two sources: as a component of the Help system, and as a separate PDF document, the Command Syntax Reference manual, which is also available through the Help menu.

Context-dependent help
◦ Dialog box help: Most dialog boxes contain a Help button, through which you can call up the corresponding help topics for the dialog box.
◦ Pivot Table Context Help Menu: If you right-click on a term in an activated pivot table in the Viewer and then select Direct Help from the context menu, you obtain a definition of the term.
◦ Command syntax: Place your cursor inside a block of command syntax in the command syntax window, and press the F1 key on the keyboard.

Slide 41

Help Menu => Help dialog box

Slide 42

Tutorials


Slide 43

Online Resources

SPSS Solutions for Education
www-01.ibm.com/software/analytics/spss/academic/students/resources.html
User: [email protected]
Password: 7mydevelopper

SPSS Support (primarily for the Knowledge Base)
http://support.spss.com/tech/default.asp
User: spssswitzerland
Password: spssswitzerland
www.dynelytics.com

SPSS Support (resources for all levels of users and application developers)
www.spss.com/devcentral
User: [email protected]
Password: 7mydevelopper

Other Resources / Forum / Discussion
◦ www.ats.ucla.edu/stat/spss
◦ http://spssx-discussion.1045642.n5.nabble.com
◦ www.spssusers.co.uk

Slide 44

Notes: