SPSS Basics: An Introduction to SPSS System for Windows
The Center for Statistical Computing Support
2-2 William T. Young Library
Open: 8am – 5pm Monday – Friday
Consultants:
257-2641
Adam Lindstrom (adam.lindstrom@uky.edu)
257-2643
References:
SSTARS web page: www.uky.edu/ComputingCenter/SSTARS
SPSS home page: www.spss.com
Additional Resources:
| Variable (question) | Type | Variable label | Values and value labels (data code) | Measure |
| --- | --- | --- | --- | --- |
| ID | Numeric | Case Number | 1 to 999 | Nominal |
| Gender | String | Respondent's Gender | F = Female, M = Male | Nominal |
| Marital | Numeric | Marital Status | 1 = Married, 2 = Widowed, 3 = Divorced, 4 = Separated, 5 = Never Married, 9 = No Answer | Nominal |
| BYear | Numeric | Birth Year | 19-- (four-digit year), 9998 = DK, 9999 = No Answer | Scale |
| TIME | Numeric | Respondent reads TIME | 1 = Yes, 0 = No | Nominal |
| NEWSWEEK | Numeric | Respondent reads NEWSWEEK | 1 = Yes, 0 = No | Nominal |
| MONEY | Numeric | Respondent reads MONEY | 1 = Yes, 0 = No | Nominal |
| USNEWS | Numeric | Respondent reads USNEWS | 1 = Yes, 0 = No | Nominal |
| OTHER | Numeric | Respondent reads OTHER | 1 = Yes, 0 = No | Nominal |
If you make a mistake, correct it the same way you would in a spreadsheet; select the field you want to change, type in the new value and press enter.
Click on the Data View tab at the bottom left-hand corner of the window and enter data into the cells just as you would in a spreadsheet.
To turn on value labels: Click View>Value Labels.
Saving an SPSS data set: Click on File; choose Save As or Save Data. Choose the directory in which you want your file to reside and press OK. The file will be saved as an SPSS data file (extension .sav).
(ii) Editing and Modifying SPSS data files
To open and edit a data file click on File>Open>Data. Go to the directory in which your file is located, select the file and press OK. You will see the file in spreadsheet form.
Adding a new variable: Click on the Variable View tab. Right click the row before which you want to insert the new variable and choose Insert Variable. Alternatively, you can click the Insert Variable icon on the toolbar.
Adding a new case: Click on the Data View tab. Right click the row before which you want to insert the new case and choose Insert Cases. Alternatively, you can click the Insert Cases icon on the toolbar.
Computing new variables: Suppose that we want to compute two new variables and add them to the data file created from the attached surveys. The variables AGE and MAG will be defined as follows:
AGE = 2002 - BYear
MAG = 1 if a person reads any magazines, 0 if a person doesn’t read any magazines
Click Transform>Compute. The Compute Variable dialog box appears. Begin by specifying a new variable name in the Target Variable box (type AGE). Then click on Type&Label. For the label, type Respondent’s Age in the box. Then choose Numeric for Type. Press Continue. Type in the Numeric Expression: 2002 - BYear.
To create variable MAG, repeat the above. In the Numeric Expression type:
max(TIME, NEWSWEEK, MONEY, USNEWS, OTHER).
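The two expressions can also be checked outside SPSS. The following is a minimal Python sketch, not SPSS syntax; the sample records are made up, and treating the BYear codes 9998 and 9999 as missing (so that no age is computed for them) is an assumption of this sketch rather than something the Compute dialog does on its own.

```python
# Hypothetical records that use the variable names from the code book.
respondents = [
    {"ID": 1, "BYear": 1960, "TIME": 1, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"ID": 2, "BYear": 1975, "TIME": 0, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"ID": 3, "BYear": 9999, "TIME": 0, "NEWSWEEK": 1, "MONEY": 1, "USNEWS": 0, "OTHER": 0},
]

MISSING_BYEAR = {9998, 9999}  # DK and No Answer codes from the code book
MAGAZINES = ["TIME", "NEWSWEEK", "MONEY", "USNEWS", "OTHER"]


def compute_new_variables(row):
    """Return a copy of row with AGE and MAG added."""
    out = dict(row)
    # AGE = 2002 - BYear; assumption: missing codes give None instead of a bogus age.
    out["AGE"] = None if row["BYear"] in MISSING_BYEAR else 2002 - row["BYear"]
    # MAG = max of the 0/1 indicators, which is 1 as soon as any magazine is read.
    out["MAG"] = max(row[m] for m in MAGAZINES)
    return out


computed = [compute_new_variables(r) for r in respondents]
for r in computed:
    print(r["ID"], r["AGE"], r["MAG"])
```

Taking the maximum works because every indicator is coded 0 or 1, so the maximum is 1 exactly when at least one indicator is 1.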
Recoding variable into the same variable:
Click on Transform>Recode>Into Same Variable. The Recode into Same Variable dialog box opens. Choose the variable that you want to recode, move it to the Numeric Variable box and then press Old and New Values. A new dialog box appears; enter old values or ranges in the Old Value box and new values in the New Value box.
Exercise 1: Recode BYear into itself, by changing 9999 to 9998. Categories Don’t Know and No Answer will be combined.
Caution: If you save the changes, you will not be able to recover the original values if you need them in the future. This is why recoding a variable into the same variable is not recommended unless you are positive that the original values will not be needed in future analysis of the data. If you are not sure, recode into a new variable instead.
Recoding variable into a new variable: Click Transform>Recode>Into Different Variable. In the Recode into Different Variables dialog box, from the list of variables select the variable that you want to recode, type in the new variable name (Output Variable) and the Label, then press Change. Click on Old and New Values. In the dialog box, type Old Values or ranges and New Values. Press Continue and then OK.
Exercise 2: Recode variable Marital into Status by combining categories 2 through 5 into a single category:
| old value | new value |
| --- | --- |
| 1 | 1 |
| 2-5 | 2 |
| 9 | 9 |
Note: After coding 2-5 -> 2, we could simply code ‘All other values’ -> ‘Copy old value(s)’. To add labels to the newly created variables, go to the Variable View and add the variable labels and value labels.
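The recoding rule of Exercise 2 can be written out as a small function. This is an illustrative Python sketch, not SPSS syntax; the function name and the sample values are hypothetical. Like recoding into a different variable, it leaves the original values untouched.

```python
def recode_marital(value):
    """Map a Marital code to Status: 2 through 5 become 2, every other value is copied."""
    if 2 <= value <= 5:
        return 2
    return value  # plays the role of copying all other old values


marital = [1, 2, 3, 4, 5, 9]                    # hypothetical original codes
status = [recode_marital(v) for v in marital]   # new variable; marital is unchanged
print(status)
```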
(iii) Combining Data Files
Merging data files with the same variables but different cases (different IDs):
Open one of the data files you want to combine. Click on Data>Merge Files>Add Cases. This opens the Add Cases Read File dialog box. Select the directory and then the data file that you want to combine with the open data file. It should have the same variables as the open data file. Click on Continue. The Add Cases From dialog box opens. If the Unpaired Variables box is empty, click on OK. It will append the selected data file to the open one. If the Unpaired Variables box is not empty, you have to make a decision. Unpaired Variables are the variables to be excluded from the new merged data file. By default this list contains variables from either data file that do not match a variable in the other data file (different names, defined as numeric in one and string in the other, or string variables of unequal width). If different names or types were assigned by mistake, cancel the merge process, correct the names or types in the data file that has them wrong, and then merge the data files. You can remove a variable from the list of variables to be included in the merged file by selecting it and moving it to the Unpaired Variables list.
Exercise 3: Data files sample1.sav and sample2.sav have the same variables. Merge them. Check the result.
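To make the role of unpaired variables concrete, here is a minimal Python sketch of appending one set of cases to another. It is not how SPSS implements Add Cases; the function name and the two small data sets are hypothetical, and matching is done on variable names only (SPSS also compares types and string widths).

```python
def add_cases(open_file, other_file):
    """Append other_file to open_file, keeping only variables present in both."""
    open_vars = set(open_file[0])
    other_vars = set(other_file[0])
    # Paired variables keep the order they have in the open file.
    paired = [v for v in open_file[0] if v in other_vars]
    # Unpaired variables appear in only one file and are left out of the result.
    unpaired = sorted(open_vars ^ other_vars)
    merged = [{v: row[v] for v in paired} for row in open_file + other_file]
    return merged, unpaired


# Hypothetical contents; "Marital" and "Status" do not match by name.
file_a = [{"ID": 1, "Gender": "F", "Marital": 1}, {"ID": 2, "Gender": "M", "Marital": 5}]
file_b = [{"ID": 3, "Gender": "M", "Status": 2}]

merged, unpaired = add_cases(file_a, file_b)
print(unpaired)
print(merged)
```

In this made-up case the mismatch is a naming mistake, which is the situation where the text recommends cancelling the merge and fixing the names first.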
Merging data files with the same cases (IDs) but different variables:
The cases in the two data files to be merged must be sorted in the same order in both data files. (To sort a data file: click on Data in the main menu, select Sort Cases, and choose the variable to sort by; id is a good choice.) Open one of the files you want to combine. Click on Data>Merge Files>Add Variables. Choose the data file that you want to combine with the open data file. It should have the same IDs or case numbers as the open data file. Click on Continue. This opens the Add Variables From dialog box. The Excluded Variables box lists variables to be excluded from the new, merged file. By default, this list contains any variable names from the external data file that duplicate variable names in the open data file. If you want to include a variable with a duplicate name in the merged data file, click on it, rename it and add it to the list of variables in the New Working Data File box. Use Key Variables to indicate the variable to be used to match the cases. The key variable must have the same name in both files, and both files must be sorted in ascending order of the key variable. Click OK. It will add the selected file to the open one, side by side.
Exercise 4: Data files sample1.sav and sample1a.sav have the same ids. Merge them. Check the result.
A keyed table is a file in which data for each case can be applied to multiple cases in the other data file. For example, if one file contains information on individual students (name, major, class code) and the other class information (class code, instructor, department offering the course), you can use the class information file as the keyed table and use class code as the key variable to merge the data files. In the merged data file, for each case, you will have the individual student information plus class information.
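The student and class example can be sketched in a few lines of Python. This is only an illustration of the keyed-table idea, not SPSS syntax; all names and values below are invented.

```python
# Hypothetical student file: one case per student.
students = [
    {"name": "Ann", "major": "Biology", "class_code": "STA291"},
    {"name": "Bob", "major": "History", "class_code": "STA291"},
    {"name": "Cleo", "major": "Physics", "class_code": "STA570"},
]

# Hypothetical keyed table: one row per class code.
classes = [
    {"class_code": "STA291", "instructor": "Smith", "department": "Statistics"},
    {"class_code": "STA570", "instructor": "Jones", "department": "Statistics"},
]

# Index the keyed table by the key variable so each row can serve many cases.
class_by_code = {row["class_code"]: row for row in classes}

# Attach the class information to every student with a matching class code.
merged = [{**s, **class_by_code.get(s["class_code"], {})} for s in students]

for row in merged:
    print(row["name"], row["class_code"], row["instructor"])
```

Ann and Bob both receive the same row from the keyed table, which is what distinguishes a keyed table from a one-to-one merge.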
(iv) Using ASCII and Spreadsheet Data Files in SPSS
To read an ASCII data file into SPSS you have to know either the variables and their column locations in the file, or the variables and the order in which they appear in the file. Click on File and choose Read Text Data. Open the text file you want to import and then follow the steps of the Text Import Wizard.
Example: ASCII file
08 3 F 23
12 1 M 30
16 1 F 35
| Variable | Column Position |
| --- | --- |
| Education | 1-2 |
| Marital Status | 4 |
| Gender | 6 |
| Age | 8-9 |
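Reading by column position means slicing each line at fixed offsets. The Python sketch below applies the column positions listed above to the three example lines; it is an illustration of the idea and not a substitute for the Text Import Wizard. The short variable name Marital and the choice to read Gender as text and the other fields as integers are assumptions of the sketch.

```python
# Column positions from the table above (1-based, inclusive).
LAYOUT = {"Education": (1, 2), "Marital": (4, 4), "Gender": (6, 6), "Age": (8, 9)}

lines = ["08 3 F 23", "12 1 M 30", "16 1 F 35"]


def parse_line(line):
    """Slice one fixed-width line into a record using LAYOUT."""
    record = {}
    for name, (start, end) in LAYOUT.items():
        field = line[start - 1:end]  # convert 1-based columns to a Python slice
        # Assumption: Gender is kept as a string, the other fields are numeric.
        record[name] = field if name == "Gender" else int(field)
    return record


records = [parse_line(line) for line in lines]
for r in records:
    print(r)
```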
Exercise 5: Read asciisam.dat data file into SPSS.
Reading Excel data files: Click File>Open>Data. Under file types choose Excel (*.xls) and then open your file. In the Open File Options dialog box, check Read Variable Names and, if you aren’t going to read in the whole file, type in the range of your data set (including variable names, but not titles and headings). Press OK. The data set will appear in the data view in SPSS. Save the file as an SPSS data file for future use.
Note: Other file types such as SAS can also be read into SPSS.
2. Exploring data
(i) Opening a data file for analysis:
Before data can be analyzed, the file containing the data has to be opened. To do this, proceed as follows: Click on File>Open>Data. Select the data file and press OK. The data file appears in the SPSS data spreadsheet.
Exercise 6: Open an SPSS data file named Employee data.sav. That file will be used for practice in what follows.
(ii) Displaying Variable Definition Information:
Click Utilities>Variables to open the Variable Information dialog box. Click on the name of the variable for which you want to see information. The dialog box displays the labels, variable type, missing value codes and value labels.
Printing the contents of the output window: Click File>Print. The Print Output dialog box appears. Select All and the number of copies and click OK. To print a part of the output, highlight that part, click File>Print, and choose Selection, then click OK.
Subsetting an SPSS data file
You can restrict your analysis to a specific subgroup of cases by specifying criteria for inclusion in the subgroup. Click on Data>Select Cases, and choose If condition is satisfied. In the next screen, type in the condition for inclusion. In the Unselected Cases Are box choose either Filtered (unselected cases will not be included in the analysis but will remain in the data set) or Deleted (unselected cases will be deleted from the data file). Click OK.
For example, we can filter our data set to select only those observations for which gender = ‘m’.
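The difference between the Filtered and Deleted options can be shown with a small Python sketch. It is an illustration only; the cases and the name of the flag variable are made up.

```python
# Hypothetical cases.
cases = [
    {"id": 1, "gender": "m"},
    {"id": 2, "gender": "f"},
    {"id": 3, "gender": "m"},
]


def condition(case):
    """Inclusion condition from the example: gender = 'm'."""
    return case["gender"] == "m"


# Filtered: every case stays in the data set and a 0/1 flag marks the selected ones.
filtered = [{**c, "selected": int(condition(c))} for c in cases]

# Deleted: unselected cases are removed from the data set altogether.
deleted = [c for c in cases if condition(c)]

print(len(filtered), [c["selected"] for c in filtered])
print(len(deleted), [c["id"] for c in deleted])
```

With filtering, the selection can be undone later because no case is lost; with deletion it cannot, unless the original file was saved separately.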
(iii) Summarizing and displaying information contained in data
Producing Summaries for categorical variables:
Frequency Tables and Bar Charts
Click Analyze>Descriptive Statistics>Frequencies. From the list of variables select the one for which you want to get a frequency table. Click OK. The output window displays the frequency table.
Exercise 7: Compute frequency tables for variables named gender (gender) and educ (educational level).
The Frequencies dialog box contains a Statistics button for specifying summary statistics and a Charts button. To select summary statistics and a chart, click on Statistics first, make your selection and then click on Charts. For a nominal categorical variable the appropriate summary measure is the mode and the appropriate chart is the bar chart.
Click Analyze>Descriptive Statistics>Frequencies. Select a variable, click the Statistics button in the Frequencies dialog box, click the check box for the Mode, then Continue. Click Charts, select Bar charts, then Continue.
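What the frequency table and the mode report can be reproduced by hand for a small sample. The Python sketch below is illustrative; the five gender values are invented and are not taken from the practice data file.

```python
from collections import Counter

# Hypothetical values of a nominal variable.
gender = ["m", "f", "m", "m", "f"]

counts = Counter(gender)
total = sum(counts.values())

# One row per category: value, frequency, percent of all cases.
frequency_table = [
    (value, n, round(100 * n / total, 1)) for value, n in counts.most_common()
]

# The mode is the category with the highest frequency.
mode = counts.most_common(1)[0][0]

for row in frequency_table:
    print(row)
print("mode:", mode)
```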
SPSS processes the frequencies request and sends the frequency tables to the output window. To edit a chart and make changes, double click anywhere on the chart. A chart window appears; use its menu to make changes to your chart. The changes made will appear also in the output file. You can save them by saving the output file. If you do not want to save the whole output, delete the parts that you do not need by highlighting them in the output navigator and then selecting Edit and Cut.
Producing Summary Statistics for Continuous Variables:
A frequency table or a bar graph showing every value of a continuous variable such as age is too lengthy and does not summarize the information. However, you can easily see the distribution of its values by producing a histogram and computing some summary statistics: mean, median, standard deviation, minimum and maximum.
Exercise 8: Use variables salary (current salary) and salbegin (beginning salary) to practice what follows.
Click Analyze>Descriptive Statistics>Frequencies. Suppress the display of frequency tables by deselecting Display Frequency Tables. Press OK. Select a continuous variable from the variable list (for example age). Click on Statistics and select Mean, Median, Std. Deviation, Maximum and Minimum. Click Continue. Click on Charts, then select Histogram. If you want to compare the distribution of your variable with a normal curve, also click on With normal curve to have SPSS display a normal curve superimposed over the histogram. Then click Continue, and OK.
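The five summary statistics named above are easy to verify on a handful of values. This Python sketch uses the standard `statistics` module on invented salary figures; `statistics.stdev` computes the sample standard deviation (denominator n - 1).

```python
import statistics

# Hypothetical values of a continuous variable.
salary = [27000, 30000, 21000, 45000, 57000]

summary = {
    "mean": statistics.mean(salary),
    "median": statistics.median(salary),
    "std_dev": statistics.stdev(salary),  # sample standard deviation
    "minimum": min(salary),
    "maximum": max(salary),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```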
Descriptives:
If you only want summary statistics and do not need frequency tables or graphics, you can use the Descriptive Procedure. Do the following:
Click Analyze>Descriptive Statistics>Descriptives. A dialog box appears. Select the variables for which you need summary statistics, move them to the Variable(s) box, click Options, check the statistics that you want, click Continue and then OK. The results will be displayed in the output window.
Compare Means
If you want to compute means by subgroups, you can use the Means procedure. Choose Analyze>Compare Means>Means. In the Means dialog box, choose the variables for which you want to compute the means and move them to the Dependent List box. Select the variable for grouping the data and move it to the Independent List box. Click OK. Subgroup means for each dependent variable are computed for each category of the independent variable. You can specify additional layers of independent variables by pressing Next and choosing a variable that will further subdivide the data.
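Computing means by subgroup amounts to splitting the dependent variable by the categories of the independent variable and averaging each part. A minimal Python sketch with invented cases:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: gender is the grouping variable, salary the dependent variable.
cases = [
    {"gender": "m", "salary": 30000},
    {"gender": "f", "salary": 24000},
    {"gender": "m", "salary": 40000},
    {"gender": "f", "salary": 26000},
]

# Collect the dependent values separately for each category.
groups = defaultdict(list)
for case in cases:
    groups[case["gender"]].append(case["salary"])

# One mean and one case count per category.
group_means = {g: mean(values) for g, values in groups.items()}
group_sizes = {g: len(values) for g, values in groups.items()}

print(group_means)
print(group_sizes)
```

Adding a layer would mean grouping on a pair of variables (for example gender and minority) instead of a single one.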
Producing Crosstabulation tables
A crosstabulation table displays a joint frequency distribution for two or more variables. To produce a crosstabulation do the following:
Click on Analyze>Descriptive Statistics>Crosstabs. Select the row variable and move it to the Row box, then select the column variable and move it to the Column box. Click on Cells. In the Crosstabs: Cell Display dialog box, check Observed for counts (that will print the number of observations in each cell) and in the Percentages box check the percentages that you want to be computed. Then click Continue and OK. The crosstabulation table will appear in the output window.
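The observed counts and row percentages of a crosstabulation can be reproduced for a tiny data set. The Python sketch below is illustrative only; the five (education, gender) pairs are invented.

```python
from collections import Counter

# Hypothetical (row value, column value) pairs: education level by gender.
cases = [(12, "f"), (12, "m"), (12, "f"), (16, "m"), (16, "m")]

counts = Counter(cases)                      # observed count for each cell
rows = sorted({r for r, _ in cases})
cols = sorted({c for _, c in cases})

crosstab = {}
for r in rows:
    row_total = sum(counts[(r, c)] for c in cols)
    # Each cell holds (observed count, row percentage).
    crosstab[r] = {
        c: (counts[(r, c)], round(100 * counts[(r, c)] / row_total, 1)) for c in cols
    }

for r in rows:
    print(r, crosstab[r])
```

Column percentages would divide by the column totals instead, and total percentages by the overall number of cases.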
Exercise 9: Compute a crosstabulation table for educ (educational level) by gender (gender). Recode educ into a new variable educat by combining educ values over 12. Rerun the crosstabulation for gender by educat.
Creating Charts
Charts can be created from Graphs main menu.
Creating Pie Charts: Click Graphs>Pie. SPSS opens the Pie Chart dialog box. Check Summaries for groups of cases and Define. In the next dialog box check N of cases. Then from the list of variables select a variable whose categories will define slices of the pie. Move it to the Define slices by: box. Click on OK. SPSS draws the chart and displays it in the output window.
Exercise 10: Use minority (minority classification) with Summaries for groups of cases to create a pie chart.
To modify the chart, double click on it. A chart window appears; use its menu to make changes. To add a title, click Chart from the main menu, then Title. Type your title in the title box. Choose Title justification and press OK.
To change label formats: Click on Elements>Show Data Labels. Then click on the Position dropdown list, scroll down and select Numbers inside, text outside. For values format, select decimal places 0. Click Continue, OK.
Changing the chart type: To change a pie chart do the following: Click on Transform>Simple Bar. From the selection menu, choose the bar chart type you want. Click on Replace.
Creating a clustered bar chart:
Information contained in a crosstabulation table can be displayed in a Clustered Bar Chart. In the data editor window, click on Graphs>Bar. From the Bar Charts box select Clustered, then Define.

Select a variable that will define categories and another variable that will define clusters (bars within a category). Click on Options and deselect Display groups defined by missing values. Click Continue, OK.
Exercise 11: a) Use Summaries for groups of cases. Produce a bar chart for educat (educational level), clustered by gender (gender).
b) Use Summaries of separate variables. Produce a bar chart for salbegin (beginning salary) and salary (current salary), clustered by gender.
Multiple response Variables:
Question 5 in the attached survey is a multiple response question. You can produce a concise summary by using the Multiple Response option in the Analyze menu.
Exercise 12: Produce frequency tables for question 5. Click Analyze>Multiple Response>Define Sets. Choose all the variables of question 5. Put them into the Variables in Set box. Choose Dichotomies and type 1 for Counted value. Give a name and label to your multiple response set: Q5, Magazine Reading. Click on Add, and then Close. Go to Multiple Response again and this time choose Frequencies. Move the multiple response set Q5 into the Table(s) for box. Press OK.
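The idea behind a multiple dichotomy set can be sketched in Python: count, for each variable in the set, the cases that hold the counted value, then express each count as a share of all responses and as a share of cases. The respondents below are invented, and basing the percent of cases on respondents who checked at least one magazine is an assumption of this sketch.

```python
MAGAZINES = ["TIME", "NEWSWEEK", "MONEY", "USNEWS", "OTHER"]
COUNTED_VALUE = 1  # the value that means "reads this magazine"

# Hypothetical answers to question 5.
respondents = [
    {"TIME": 1, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"TIME": 1, "NEWSWEEK": 1, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
    {"TIME": 0, "NEWSWEEK": 0, "MONEY": 0, "USNEWS": 0, "OTHER": 0},
]

# Number of respondents holding the counted value, per magazine.
counts = {m: sum(1 for r in respondents if r[m] == COUNTED_VALUE) for m in MAGAZINES}
total_responses = sum(counts.values())

# Assumption: only respondents with at least one counted value are valid cases.
valid_cases = sum(
    1 for r in respondents if any(r[m] == COUNTED_VALUE for m in MAGAZINES)
)

summary = {
    m: {
        "count": n,
        "pct_responses": round(100 * n / total_responses, 1),
        "pct_cases": round(100 * n / valid_cases, 1),
    }
    for m, n in counts.items()
}

for magazine, row in summary.items():
    print(magazine, row)
```

Because one respondent can check several magazines, the percent-of-cases column can add up to more than 100.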
SAMPLE QUESTIONNAIRE
1. Case Number .......... __ __ __
2. Respondent’s Sex:
Male......... M Female......... F
3. Are you currently -- married, widowed, divorced, separated, or have you never been married?
Married ............ 1
Widowed.......... 2
Divorced.......... 3
Separated ......... 4
Never married... 5
4. What is your year of birth?
Year........ __ __ __ __
5. Which of the following magazines do you read regularly?
(Check all that apply)
__ Time __ Newsweek __ Money
__