SPSS Basics:  An Introduction to SPSS System for Windows

 

SSTARS Center

The Center for Statistical Computing Support

2-2 William T. Young Library

Open: 8am – 5pm Monday – Friday

 

 

Consultants: 

Aric Schadler (schadler@uky.edu)

257-2641

Adam Lindstrom (adam.lindstrom@uky.edu)

257-2643

 

 

References:

SSTARS web page:  www.uky.edu/ComputingCenter/SSTARS

SPSS home page:  www.spss.com

 

Additional Resources:

Tutorials from UCLA

Tutorials from Texas A&M

 


1. Data Files and Data Manipulation
(i)Creating an SPSS data file
We will enter the data from the attached surveys into an SPSS data file. Start by opening SPSS. A new Data View window appears; it resembles an Excel spreadsheet. Before entering the data, we must define the attributes of our variables. Click on the Variable View tab at the bottom left-hand corner of the window. Here we will enter the variable attributes: variable name, type, variable label, value labels, missing values and measure.

Variable     Type      Variable label               Values and value labels            Measure
(question)             (description)                (data code)

ID           Numeric   Case Number                  1-999                              Nominal
Gender       String    Respondent's Gender          F=Female, M=Male                   Nominal
Marital      Numeric   Marital Status               1=Married, 2=Widowed,              Nominal
                                                    3=Divorced, 4=Separated,
                                                    5=Never Married, 9=No Answer
BYear        Numeric   Birth Year                   19-- --, 9998=DK, 9999=No Answer   Scale
TIME         Numeric   Respondent reads TIME        1=Yes, 0=No                        Nominal
NEWSWEEK     Numeric   Respondent reads NEWSWEEK    1=Yes, 0=No                        Nominal
MONEY        Numeric   Respondent reads MONEY       1=Yes, 0=No                        Nominal
USNEWS       Numeric   Respondent reads USNEWS      1=Yes, 0=No                        Nominal
OTHER        Numeric   Respondent reads OTHER       1=Yes, 0=No                        Nominal

If you make a mistake, correct it the same way you would in a spreadsheet; select the field you want to change, type in the new value and press enter.

 

Click on the Data View tab at the bottom left-hand corner of the window and enter the data into the cells just as you would in a spreadsheet.

To turn on value labels: click View>Value Labels.

Saving an SPSS data set: click on File and choose Save As or Save Data. Choose the directory in which you want your file to reside and press OK. The file will be saved as an SPSS data file (extension .sav).

(ii)Editing and Modifying SPSS data files
To open and edit a data file click on File>Open>Data. Go to the directory in which your file is located, select the file and press OK. You will see the file in a spreadsheet form.


Adding a new variable: Click on the Variable View tab. Right-click the row before which you want to insert the new variable and choose Insert Variable. Alternatively, you can click the Insert Variable icon on the toolbar.

Adding a new case: Click on the Data View tab. Right-click the row before which you want to insert the new case and choose Insert Cases. Alternatively, you can click the Insert Cases icon on the toolbar.

Computing new variables: Suppose that we want to compute two new variables and add them to the data file created from the attached surveys. The variables AGE and MAG will be defined as follows:

AGE = 2002 - BYear

MAG = 1 if a person reads any magazines
MAG = 0 if a person doesn't read any magazines

 

Click Transform>Compute. The Compute Variable dialog box appears. Begin by specifying a new variable name in the Target Variable box (type AGE). Then click on Type&Label. For the label, type Respondent's Age in the box, then choose Numeric for Type. Press Continue. In the Numeric Expression box type: 2002 - BYear. Click OK.

 

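To create the variable MAG, repeat the above. In the Numeric Expression box type:

 max(TIME, NEWSWEEK, MONEY, USNEWS, OTHER)

Because each magazine variable is coded 1=Yes and 0=No, the maximum is 1 if the respondent reads at least one magazine and 0 otherwise.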
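For readers who want to check the two Compute expressions outside SPSS, the following Python sketch mirrors them. The sample records are made up for illustration, and leaving AGE empty when BYear holds a missing code (9998 or 9999) is an assumption that goes beyond the expression typed in the dialog box.

```python
# Hypothetical records that use the variable names defined earlier.
cases = [
    {"ID": 1, "BYear": 1960, "TIME": 1, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"ID": 2, "BYear": 1975, "TIME": 0, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"ID": 3, "BYear": 9999, "TIME": 0, "NEWSWEEK": 1, "MONEY": 1, "USNEWS": 0, "OTHER": 0},
]

MAGAZINES = ["TIME", "NEWSWEEK", "MONEY", "USNEWS", "OTHER"]
MISSING_BYEAR = {9998, 9999}  # DK and No Answer codes

for case in cases:
    # AGE = 2002 - BYear (assumption: no age is computed for missing codes)
    if case["BYear"] in MISSING_BYEAR:
        case["AGE"] = None
    else:
        case["AGE"] = 2002 - case["BYear"]
    # MAG = max of the 0/1 indicators: 1 if any magazine is read, else 0
    case["MAG"] = max(case[name] for name in MAGAZINES)

for case in cases:
    print(case["ID"], case["AGE"], case["MAG"])
```

Without the check for missing codes, a birth year of 9999 would produce a meaningless negative age, which is one reason to declare such codes as missing values before computing.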

 

Recoding a variable into the same variable:

Click on Transform>Recode>Into Same Variables. The Recode into Same Variables dialog box opens. Choose the variable that you want to recode, move it to the Numeric Variables box and then press Old and New Values. A new dialog box appears; enter old values or ranges in the Old Value box and new values in the New Value box, and press Add after each pair. Then press Continue and OK.

Exercise 1: Recode BYear into itself by changing 9999 to 9998, so that the categories Don't Know and No Answer are combined.

Caution: Once you save the changes, the original values cannot be recovered. Recoding into the same variable is therefore not recommended unless you are certain that the original values will not be needed in any future analysis of the data. If you are not sure, recode into a new variable instead.

Recoding a variable into a new variable: Click Transform>Recode>Into Different Variables. In the Recode into Different Variables dialog box, select the variable that you want to recode from the list of variables, type in the new variable name (Output Variable) and the Label, then press Change. Click on Old and New Values. In the dialog box, type the Old Values or ranges and the New Values. Press Continue and then OK.

Exercise 2: Recode variable Marital into Status by combining categories 2 through 5 into a single category:

Old value    New value
1            1
2-5          2
9            9

Note: After coding 2-5 → 2, we could simply code 'All other values' → 'Copy old value(s)'. To add labels to the newly created variables, go to the Variable View and add the variable labels and value labels.
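The recoding rule of Exercise 2 can be written out as a short Python sketch. The list of Marital codes is hypothetical; the point is that the original values stay untouched while the new variable Status receives the recoded values, which is what recoding into a different variable achieves.

```python
def recode_marital(value):
    """Recode Marital into Status: 2 through 5 become 2, all other values are copied."""
    if 2 <= value <= 5:
        return 2
    return value  # corresponds to copying the old value

marital = [1, 2, 3, 4, 5, 9]                    # hypothetical Marital codes
status = [recode_marital(v) for v in marital]   # new variable; marital is unchanged

print(marital)
print(status)
```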

 (iii)Combining Data Files
Merging data files with the same variables but different cases (different id’s):

Open one of the data files you want to combine. Click on Data>Merge Files>Add Cases. This opens the Add Cases Read File dialog box. Select the directory and then the data file that you want to combine with the open data file; it should have the same variables as the open data file. Click on Continue. The Add Cases From dialog box opens.

If the Unpaired Variables box is empty, click on OK; the selected data file will be appended to the open one. If the Unpaired Variables box is not empty, you have to make a decision. Unpaired variables are the variables that will be excluded from the new merged data file. By default, this list contains variables from either data file that do not match a variable in the other data file (different names, defined as numeric in one file and string in the other, or string variables of unequal width).

If different names or types were assigned by mistake, cancel the merge, correct the names or types in the data file that contains the errors, and then merge the data files again. You can also exclude a variable from the merged file by selecting it and moving it to the Unpaired Variables list.

Exercise 3: Data files sample1.sav and sample2.sav have the same variables. Merge them. Check the result.
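The idea behind Add Cases can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the records are made up, every case in a file has the same variables, and variables are matched by name only (the type and width checks described above are not reproduced). The function name add_cases is hypothetical.

```python
def add_cases(open_file, other_file):
    """Append other_file to open_file, keeping only variables found in both files."""
    open_vars = set(open_file[0])
    other_vars = set(other_file[0])
    paired = [name for name in open_file[0] if name in other_vars]  # open-file order
    unpaired = sorted(open_vars ^ other_vars)   # in one file only: excluded by default
    merged = [{name: case[name] for name in paired}
              for case in open_file + other_file]
    return merged, unpaired

# Hypothetical files; the second one uses a different name for marital status.
sample1 = [{"ID": 1, "Gender": "F", "Marital": 1},
           {"ID": 2, "Gender": "M", "Marital": 5}]
sample2 = [{"ID": 3, "Gender": "F", "MStatus": 2}]

merged, unpaired = add_cases(sample1, sample2)
print(len(merged), "cases in the merged file")
print("Unpaired variables:", unpaired)
```

Here the mismatched names Marital and MStatus both end up unpaired, which is the situation in which you would cancel the merge and correct the name first.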

Merging data files with the same cases (id’s) but different variables:
The cases must be sorted in the same order in both data files. (To sort a data file: click on Data in the main menu, select Sort Cases, and choose the variable to sort by; id is a good choice.) Open one of the files you want to combine. Click on Data>Merge Files>Add Variables. Choose the data file that you want to combine with the open data file; it should have the same ids or case numbers as the open data file. Click on Continue. This opens the Add Variables From dialog box.

The Excluded Variables box lists the variables to be excluded from the new, merged file. By default, this list contains any variable names from the external data file that duplicate variable names in the open data file. If you want to include a variable with a duplicate name in the merged data file, click on it, rename it, and add it to the list of variables in the New Working Data File box.

Use Key Variables to indicate the variable to be used to match the cases. The key variable must have the same name in both files, and both files must be sorted in ascending order of the key variable. Click OK. The selected file will be added to the open one, side by side.

Exercise 4: Data files sample1.sav and sample1a.sav have the same ids. Merge them. Check the result.
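As a rough illustration of matching on a key variable, the Python sketch below places the variables of two files side by side. The records and the function name add_variables are hypothetical, and the sketch is simplified: it keeps only the cases of the open file and silently skips variables whose names duplicate those in the open file.

```python
def add_variables(open_file, other_file, key="ID"):
    """Match cases on the key variable and combine the variables of both files."""
    other_by_key = {case[key]: case for case in other_file}
    merged = []
    for case in sorted(open_file, key=lambda c: c[key]):   # ascending key order
        row = dict(case)
        for name, value in other_by_key.get(case[key], {}).items():
            if name not in row:        # duplicate names are excluded by default
                row[name] = value
        merged.append(row)
    return merged

# Hypothetical files that share the same ids but hold different variables.
sample1 = [{"ID": 2, "Gender": "M"}, {"ID": 1, "Gender": "F"}]
sample1a = [{"ID": 1, "BYear": 1960}, {"ID": 2, "BYear": 1975}]

merged = add_variables(sample1, sample1a)
for row in merged:
    print(row)
```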

A keyed table is a file in which the data for each case can be applied to multiple cases in the other data file. For example, if one file contains information on individual students (name, major, class code) and the other contains class information (class code, instructor, department offering the course), you can use the class information file as the keyed table and class code as the key variable to merge the data files. In the merged data file, each case will have the individual student information plus the class information.
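A keyed table works like a lookup. The Python sketch below follows the student and class example; all names and values are invented for illustration.

```python
# Hypothetical student file: one case per student.
students = [
    {"name": "Ann", "major": "Biology", "class_code": "C101"},
    {"name": "Bob", "major": "History", "class_code": "C101"},
    {"name": "Cho", "major": "Physics", "class_code": "C205"},
]

# Hypothetical keyed table: one row per class code.
classes = [
    {"class_code": "C101", "instructor": "Lee", "department": "Statistics"},
    {"class_code": "C205", "instructor": "Ray", "department": "Mathematics"},
]

# Index the keyed table by the key variable, then look up each student's class.
class_by_code = {row["class_code"]: row for row in classes}
merged = [{**student, **class_by_code.get(student["class_code"], {})}
          for student in students]

for row in merged:
    print(row)
```

One row of the keyed table (class C101) is applied to two student cases, which is what distinguishes a keyed table from a one-to-one merge.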

(iv)Using ASCII and Spreadsheet Data Files in SPSS
Reading ASCII data files
To read an ASCII data file into SPSS, you have to know either the variables and their column locations in the file or the variables and the order in which they appear in the file. Click on File and choose Read Text Data. Open the text file you want to import and then follow the steps of the Text Import Wizard.

 

Example: ASCII file

08 3 F 23
12 1 M 30
16 1 F 35

 

Variable          Column Position
Education         1-2
Marital Status    4
Gender            6
Age               8-9

Exercise 5: Read asciisam.dat data file into SPSS.
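To see what reading by column position means, the Python sketch below parses the three example lines using the column positions listed in the table. The lowercase variable names and the choice to keep gender as text are assumptions made for the illustration.

```python
lines = ["08 3 F 23", "12 1 M 30", "16 1 F 35"]

# (name, first column, last column), 1-based and inclusive as in the table
LAYOUT = [("education", 1, 2), ("marital", 4, 4), ("gender", 6, 6), ("age", 8, 9)]

def parse_line(line):
    """Cut one fixed-width line into variables according to LAYOUT."""
    record = {}
    for name, first, last in LAYOUT:
        text = line[first - 1:last]
        # gender is a string variable; the others are numeric
        record[name] = text if name == "gender" else int(text)
    return record

records = [parse_line(line) for line in lines]
for record in records:
    print(record)
```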

Reading Excel data files: Click File>Open>Data. Under file types choose Excel (*.xls) and then open your file. In the Open File Options dialog box, check Read Variable Names and, if you are not going to read in the whole file, type in the range of your data set (including variable names, but not titles and headings). Press OK. The data set will appear in the Data View in SPSS. Save the file as an SPSS data file for future use.


Note: Other file types such as SAS can also be read into SPSS.

2. Exploring Data
(i)Opening a data file for analysis:
Before data can be analyzed, the file containing the data has to be opened. To do this, proceed as follows: Click on File>Open>Data. Select the data file and press OK. The data file appears in the SPSS data spreadsheet.

Exercise 6: Open the SPSS data file named Employee data.sav. That file will be used for practice in what follows.

(ii)Displaying Variable Definition Information:
Click Utilities>Variables to open the Variable Information dialog box. Click on the name of the variable for which you want to see information. The dialog box displays the labels, variable type, missing value codes and value labels.


Printing the contents of the output window: Click File>Print. The Print Output dialog box appears. Select All and number of copies and click OK. To print a part of the output, highlight that part, click File>Print, and choose Selection, then click OK.

Subsetting an SPSS data file
You can restrict your analysis to a specific subgroup of cases by specifying criteria for inclusion in the subgroup. Click on Data>Select Cases and choose If condition is satisfied. In the next screen, type in the condition for inclusion. In the Unselected Cases Are box choose either Filtered (unselected cases will not be included in the analysis but will remain in the data set) or Deleted (unselected cases will be deleted from the data file). Click OK.

For example, we can filter our data set and select only those observations for which

gender = 'm'
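The difference between Filtered and Deleted can be illustrated with a short Python sketch. The employee records and the flag name selected are hypothetical.

```python
# Hypothetical cases with the variable gender coded 'm' or 'f'.
employees = [
    {"id": 1, "gender": "m", "salary": 57000},
    {"id": 2, "gender": "f", "salary": 21450},
    {"id": 3, "gender": "m", "salary": 40200},
]

def condition(case):
    """The selection condition: gender = 'm'."""
    return case["gender"] == "m"

# Filtered: every case stays in the data set, but only flagged cases are analyzed.
for case in employees:
    case["selected"] = 1 if condition(case) else 0
analyzed = [case for case in employees if case["selected"] == 1]

# Deleted: unselected cases are dropped from the data set altogether.
remaining = [case for case in employees if condition(case)]

print(len(employees), "cases kept when filtering,", len(analyzed), "analyzed")
print(len(remaining), "cases left after deleting")
```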


(iii)Summarizing and displaying information contained in data

 

Producing Summaries for categorical variables:

Frequency Tables and Bar Charts
Click Analyze>Descriptive Statistics>Frequencies. From the list of variables select the one for which you want to get a frequency table. Click OK. The output window displays the frequency table.

Exercise 7: Compute frequency tables for variables named gender (gender) and educ (educational level).

The Frequencies dialog box contains a Statistics button for specifying summary statistics and a Charts button. To select summary statistics and a chart, click on Statistics first, make your selection, and then click on Charts. For a nominal categorical variable, the appropriate summary measure is the mode and the appropriate chart is the bar chart.
Click Analyze>Descriptive Statistics>Frequencies. Select a variable, click the Statistics button in the Frequencies dialog box, check the box for Mode, then click Continue. Click Charts, select Bar charts, then click Continue. Click OK.

SPSS processes the frequencies request and sends the frequency tables to the output window.  To edit a chart and make changes, double click anywhere on the chart.  A chart window appears; use its menu to make changes to your chart. The changes made will appear also in the output file.  You can save them by saving the output file.  If you do not want to save the whole output, delete the parts that you do not need by highlighting them in the output navigator and then selecting Edit and Cut.
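What a frequency table contains can be reproduced by hand. The Python sketch below counts a made-up list of gender codes and reports the frequency, percent, cumulative percent and mode; it is meant only to show the arithmetic, not the layout of the SPSS output.

```python
from collections import Counter

gender = ["m", "f", "f", "m", "f"]   # hypothetical values of a nominal variable

counts = Counter(gender)
n = len(gender)

cumulative = 0.0
rows = []
for value in sorted(counts):
    percent = 100 * counts[value] / n
    cumulative += percent
    rows.append((value, counts[value], percent, cumulative))
    print(f"{value:<6}{counts[value]:>6}{percent:>9.1f}{cumulative:>9.1f}")

# The mode is the most frequent category.
mode = counts.most_common(1)[0][0]
print("Mode:", mode)
```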

 

Producing Summary Statistics for Continuous Variables:

 A frequency table or a bar graph showing every value of a continuous variable such as age is too lengthy and does not summarize the information. However, you can easily see the distribution of its values by producing a histogram and computing some summary statistics: mean, median, standard deviation, minimum and maximum.

Exercise 8: Use variables salary (current salary) and salbegin (beginning salary) to practice what follows.

Click Analyze>Descriptive Statistics>Frequencies. Suppress the display of frequency tables by deselecting Display Frequency Tables; if SPSS displays a warning, press OK. Select a continuous variable from the variable list (for example age). Click on Statistics and select Mean, Median, Std. Deviation, Maximum and Minimum. Click Continue. Click on Charts, then select Histogram. If you want to compare the distribution of your variable with a normal curve, also click on With normal curve to have SPSS display a normal curve superimposed over the histogram. Then click Continue and OK.
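The summary statistics requested above can be checked by hand with Python's statistics module. The salary figures are invented; the standard deviation shown is the sample standard deviation (divisor n - 1).

```python
import statistics

salary = [21000, 24000, 27000, 30000, 48000]   # hypothetical values

summary = {
    "mean": statistics.mean(salary),
    "median": statistics.median(salary),
    "std_dev": statistics.stdev(salary),   # sample standard deviation (n - 1)
    "minimum": min(salary),
    "maximum": max(salary),
}

for name, value in summary.items():
    print(f"{name:<8} {value:>12.2f}")
```

In this example the mean (30000) lies above the median (27000) because one large salary pulls the mean upward, which is the kind of skew a histogram would also reveal.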


Descriptives:

If you only want summary statistics and do not need frequency tables or graphics, you can use the Descriptives procedure. Do the following:

Click Analyze>Descriptive Statistics>Descriptives. A dialog box appears. Select the variables for which you need summary statistics, move them to the Variable(s) box, click Options, check the statistics that you want, then click Continue and OK. The results will be displayed in the output window.

Compare Means
If you want to compute means by subgroups, you can use the Means procedure. Choose Analyze>Compare Means>Means. In the Means dialog box, choose the variables for which you want to compute the means and move them to the Dependent List box. Select the variable for grouping the data and move it to the Independent List box. Click OK. Subgroup means for each dependent variable are computed for each category of the independent variable. You can specify additional layers of independent variables by pressing Next and choosing a variable that will further subdivide the data.
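Computing means by subgroup amounts to splitting the dependent variable by the categories of the independent variable and averaging within each group. A minimal Python sketch with made-up cases:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: gender is the independent (grouping) variable,
# salary is the dependent variable.
cases = [
    {"gender": "m", "salary": 30000},
    {"gender": "f", "salary": 24000},
    {"gender": "m", "salary": 40000},
    {"gender": "f", "salary": 26000},
]

groups = defaultdict(list)
for case in cases:
    groups[case["gender"]].append(case["salary"])

group_means = {group: mean(values) for group, values in groups.items()}
for group in sorted(group_means):
    print(group, group_means[group], "N =", len(groups[group]))
```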

Producing Crosstabulation tables
A crosstabulation table displays the joint frequency distribution of two or more variables. To produce a crosstabulation, do the following:
Click on Analyze>Descriptive Statistics>Crosstabs. Select the row variable and move it into the Row box, then select the column variable and move it into the Column box. Click on Cells. In the Crosstabs: Cell Display dialog box, check Observed for counts (this prints the number of observations in each cell) and, in the Percentages box, check the percentages that you want computed. Then click Continue and OK. The crosstabulation table will appear in the output window.

Exercise 9: Compute a crosstabulation table for educ (educational level) by gender (gender). Recode educ into a new variable educat by combining educ values over 12. Rerun the crosstabulation for gender by educat.
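The logic of Exercise 9 can be sketched in Python: recode educ into educat, count each combination of gender and educat, and compute row percentages. The cases are invented, and using the code 13 for "over 12" is an arbitrary choice made for the illustration.

```python
from collections import Counter

# Hypothetical (gender, educ) pairs.
cases = [("m", 8), ("m", 12), ("m", 16), ("f", 12), ("f", 15), ("f", 19)]

def recode_educ(educ):
    """Combine all educ values over 12 into one category (coded 13 here)."""
    return educ if educ <= 12 else 13

# Observed counts for every (gender, educat) cell, plus row totals.
table = Counter((gender, recode_educ(educ)) for gender, educ in cases)
row_totals = Counter(gender for gender, _ in cases)

for (gender, educat), count in sorted(table.items()):
    row_percent = 100 * count / row_totals[gender]
    print(f"gender={gender} educat={educat:>2} count={count} row%={row_percent:.1f}")
```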

Creating Charts
Charts can be created from Graphs main menu. 

Creating Pie Charts: Click Graphs>Pie.

SPSS opens the Pie Chart dialog box. Check Summaries for groups of cases and Define. In the next dialog box check N of cases. Then from the list of variables select a variable whose categories will define slices of the pie. Move it to Define slices by: box. Click on OK. SPSS draws the chart and displays it in the output window.

Exercise 10: Use minority (minority classification) with Summaries for groups of cases to create a pie chart.

To modify the chart, double-click on it. A chart window appears; use its menu to make changes. To add a title, click Chart from the main menu, then click on Title. Type your title in the title box, choose the Title justification, and press OK.
To change label formats: Click on Elements>Show Data Labels. Then click on the Position dropdown list, scroll down and select Numbers inside, text outside. For the values format, select 0 decimal places. Click Continue, then OK.

Changing the chart type: To change a pie chart into a bar chart, click on Transform>Simple Bar. From the selection menu, choose the bar chart type you want. Click on Replace.

 

Creating a clustered bar chart:
Information contained in a crosstabulation table can be displayed in a clustered bar chart. In the Data Editor window, click on Graphs>Bar. From the Bar Charts box select Clustered, then Define.

Select a variable that will define the categories and another variable that will define the clusters (bars within a category). Click on Options and deselect Display groups defined by missing values. Click Continue, then OK.

 

Exercise 11: a) Use Summaries for groups of cases. Produce a bar chart for educat (educational level), clustered by gender (gender).

b) Use Summaries of separate variables. Produce a bar chart for salbegin (beginning salary) and salary (current salary), clustered by gender.
 

Multiple response Variables:
Question 5 in the attached survey is a multiple response question. You can produce a concise summary by using the Multiple Response option in the Analyze menu.

Exercise 12: Produce frequency tables for question 5. Click Analyze>Mult Response>Define Sets. Choose all the variables of question 5 and put them into the Variables in Set box. Choose Dichotomies and type 1 for Counted value. Give a name and a label to your multiple response set: Q5, Magazine Reading. Click on Add, and then Close. Go to Multiple Response again and this time choose Frequencies. Move the multiple response set Q5 into the Table(s) for box. Press OK.
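The counting behind a multiple response frequency table can be sketched in Python. The answers below are made up, and the sketch assumes that percentages of cases are based on respondents who checked at least one magazine; verify that against your own output.

```python
MAGAZINES = ["TIME", "NEWSWEEK", "MONEY", "USNEWS", "OTHER"]

# Hypothetical answers to question 5, coded 1=Yes and 0=No.
cases = [
    {"TIME": 1, "NEWSWEEK": 1, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"TIME": 0, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"TIME": 1, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 1},
    {"TIME": 0, "NEWSWEEK": 0, "MONEY": 1, "USNEWS": 0, "OTHER": 0},
]

COUNTED_VALUE = 1

# Number of respondents who checked each magazine.
counts = {m: sum(1 for c in cases if c[m] == COUNTED_VALUE) for m in MAGAZINES}
total_responses = sum(counts.values())

# Assumption: a case is valid if at least one magazine was checked.
valid_cases = sum(1 for c in cases
                  if any(c[m] == COUNTED_VALUE for m in MAGAZINES))

for m in MAGAZINES:
    pct_responses = 100 * counts[m] / total_responses
    pct_cases = 100 * counts[m] / valid_cases
    print(f"{m:<10}{counts[m]:>4}{pct_responses:>8.1f}{pct_cases:>8.1f}")
```

Percentages of responses add up to 100, while percentages of cases can add up to more than 100 because one respondent may check several magazines.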

 


SAMPLE QUESTIONNAIRE

1. Case Number .......... __ __ __

2. Respondent’s Sex:
Male......... M  Female......... F

3. Are you currently -- married, widowed, divorced, separated, or have you never been married?


Married ............ 1

Widowed.......... 2

Divorced.......... 3

Separated ......... 4

Never married... 5

 

4. What is your year of birth?
Year........ __ __ __ __

 

5. Which of the following magazines do you read regularly?
(Check all that apply)


__ Time            __ Newsweek              __ Money       

__ U.S. News __ Other