SAS Basics:  An Introduction to SAS System for Windows

 

SSTARS Center

2-2 William T. Young Library

 

 Consultants: 

Aric Schadler (schadler@uky.edu)

257-2641

Adam Lindstrom (adam.lindstrom@uky.edu)

257-2643

 

 

References:

SSTARS web page:  www.uky.edu/ComputingCenter/SSTARS

SAS home page:  www.sas.com


Introduction and Overview of the SAS System

 

SAS at one time stood for Statistical Analysis System.  Currently the letters SAS stand alone.  The reason for this is that the company wanted to market SAS as more than solely a statistical application.  The SAS System now has more functionality than just statistics, but its primary application is still statistical analysis.

 

SAS Institute has a diverse family of products.  Most of these products are integrated; that is, they can be put together like building blocks to construct a seamless system.

 

Some common SAS modules in the SAS System:

 

Base SAS (Base product needed to run most SAS products.  Includes the DATA Step and basic utility Procedures)

SAS/STAT (Statistical Analysis)

SAS/GRAPH (High-resolution plots, charts, and maps)

SAS/GIS (Geographical Information System)

SAS/ASSIST (Menu-driven front end to SAS)

SAS/ACCESS (Read data stored in an external database such as DB2 or Oracle)

SAS/OR (Operations research and project management)

SAS/QC (Quality improvement)

SAS/ETS (Business planning, forecasting, and decision support)

SAS/FSP (Data entry)

SAS/AF (Create your own interactive SAS applications)

SAS/IntrNet (Delivers your SAS applications over the web)

Enterprise Miner (Provides an easy-to-use front end to the SEMMA (Sample, Explore, Modify, Model, Assess) process for business users)

SAS/Warehouse Administrator (Simplifies the creation and maintenance of data warehouses)

 

Other SAS Products:

 

Enterprise Guide

Enterprise Reporter

 


SAS Windows Environment

 

 

Enhanced Program Editor – Text editor.  Enter, edit, and submit programs or edit other text files such as raw data.  SAS color codes different parts of the program. 

(program files have extension *.sas)

 

Log – Contains notes about your SAS session and notes, errors, or warnings after you submit a SAS program.  (log files have extension *.log)

 

Output – Any printable results generated by a program will appear in the Output window.  (output files have extension *.lst)

 

Results – Table of contents for your Output window – the result tree lists each part of your results in an outline form.

 

Explorer – Gives you easy access to your SAS files and libraries.

 

Favorite Folders – Similar to Windows Explorer.


SAS Language

 

With SAS, you use statements to write a series of instructions called a SAS program.  The program communicates what you want to do and is written using the SAS language.  There are some menu-driven front ends to SAS, such as SAS/ASSIST and Enterprise Guide, which make SAS appear like a point-and-click application.  However, these front ends still use the SAS language to write programs for you.  You will have much more flexibility using SAS if you learn to write your own programs using the SAS language.

 

A SAS program is a sequence of SAS statements executed in order that give instructions to SAS.  There are very few rules for the SAS language.  The most important rule is:  Every SAS statement ends with a semicolon.

 

SAS programs are constructed from two basic building blocks:  DATA steps and PROC steps.  A typical program starts with a DATA step to create a SAS data set and then passes the data to a set of procedures.

 

 

DATA step:

            DATA distance;
                        Miles = 26.22;
                        Kilometer = 1.61*Miles;
            RUN;

PROC step:

            PROC PRINT DATA = distance;
            RUN;

 

 

DATA steps                                                     PROC steps                                        

-         begin with DATA statements                 - begin with PROC statements

-         read and modify data                            - perform specific analysis or function

-         create a SAS data set                           - produce results or report

 

 

Submitting your program

 

To submit a program in the Program Editor, you can click on the submit icon (the runner figure) on the tool bar, select Submit from the Run pull-down menu, or simply press F3.

 

Note: You can submit part of a program by highlighting the part of the program you want to run in the PROGRAM EDITOR prior to submitting it.

 

Clearing Windows

 

Every time you run a SAS program, SAS writes messages to the LOG window.  Any results are written to the OUTPUT window and added to the contents of the RESULTS window.  SAS continually appends to the LOG and OUTPUT windows, so clear them when the old content is no longer needed:

·        LOG window: click in the window and then click on the new icon (blank sheet of paper) on the tool bar.

·        OUTPUT window: click in the window and then click on the new icon (blank sheet of paper) on the tool bar, or highlight the parts of the output you want deleted in the RESULTS window and then press Delete.

·        PROGRAM EDITOR: right-click in the window and select Clear All.
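
The LOG and OUTPUT windows can also be cleared from within a program.  The line below is a minimal sketch, assuming an interactive SAS windowing session; the DM statement is not covered elsewhere in this handout and is shown only as one possible shortcut.

```sas
/* Assumption: running in the interactive SAS windowing environment. */
/* DM submits window commands: go to the LOG window and clear it,    */
/* then go to the OUTPUT window and clear it.                        */
DM 'log; clear; output; clear;';
```

Placing this line at the top of a program gives each run a fresh LOG and OUTPUT window.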

 

SAS DATA Sets (Tables)

 

Variables and observations

Variables are also called columns; observations are also called rows.

Id        Name        Height      Weight
53        Susie       42          41
54        Charlie     46          55
55        Calvin      40          35
56        Lucy        46          .

 

Data Types:

            Numeric: numbers; may include plus (+) or minus (-) signs, decimal points (.), or E for scientific notation

            Character: may contain numerals, letters, or special characters (such as $ or !)

            Missing Data: a missing numeric value is represented by a period (.); a missing character value is a blank

 

Rules for SAS data sets and variables

·        Names must be 32 characters or fewer in length

·        Names must start with a letter or underscore (_)

·        Names must contain only letters, numerals or underscores

·        Names can contain upper and lowercase letters
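
The naming rules and data types above can be tried out in a short DATA step.  This is only a sketch: the data set name and the variable names are made up for illustration and do not come from the handout.

```sas
/* Hypothetical names and values, chosen only to illustrate the rules. */
DATA work.name_demo;
    _score1     = 12.5;         /* numeric; name starts with an underscore  */
    Big_Number  = 1.2E3;        /* numeric written in scientific notation   */
    City        = 'Lexington';  /* character value, enclosed in quotes      */
    Missing_Num = .;            /* missing numeric value (a period)         */
    Missing_Chr = ' ';          /* missing character value (a blank)        */
    /* Names such as 2ndvar (starts with a numeral) or my-var (contains  */
    /* a hyphen) would break the rules and cause an error.               */
RUN;

PROC PRINT DATA = work.name_demo;
RUN;
```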

 

SAS data sets have the extension      *.sas7bdat

 

The DATA Step’s Built-in Loop

           

DATA steps execute line by line and observation by observation.

 

Input data set                                     DATA step                             output data set

Observation1                                        Line1                                        Observation1

Observation2                                        Line2                                        Observation2               

Observation3                                        Line3                                        Observation3

                                                            Line4

                                                            Line5 
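
The built-in loop can be seen in a small example.  The sketch below assumes the sample data set sashelp.class that ships with SAS; the new variable names are illustrative.  The statements between DATA and RUN execute once for each observation that the SET statement reads, and the automatic variable _N_ counts the passes through the loop.

```sas
DATA work.loop_demo;
    SET sashelp.class;           /* reads one observation per pass          */
    Pass = _N_;                  /* _N_ = number of the current pass        */
    Height_cm = Height * 2.54;   /* computed for the current observation    */
RUN;                             /* observation is written, loop starts over */

PROC PRINT DATA = work.loop_demo;
RUN;
```

No explicit loop is written; SAS stops the DATA step on its own when the SET statement runs out of observations.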

 

 

 

 

Creating a SAS data Library

 

A libref is a nickname that corresponds to the location of a SAS data library (a folder on your computer where SAS data sets are written to or read from).

 

There are two ways of creating a SAS data library:

 

(1) Submit a LIBNAME statement in a SAS program

      LIBNAME libref 'drive:\directory';

 

(2) Click on the new library icon (filing cabinet) on the Tool bar and give the libref and the pathname

 

 

 

To avoid having to recreate the same SAS data library every time you start a new SAS session, you can check Enable at Startup so that the SAS data library is created automatically at start up.
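
As a sketch of the LIBNAME approach, the program below assigns a libref, lists the data sets stored in the library, and then releases the libref.  The folder d:\temp and the libref mysaslib are assumptions; substitute a folder that already exists on your computer.

```sas
/* Assumption: the folder d:\temp already exists. */
LIBNAME mysaslib 'd:\temp';

/* List the SAS data sets stored in the library (NODS = names only). */
PROC CONTENTS DATA = mysaslib._ALL_ NODS;
RUN;

/* Release the libref; the files in the folder are not deleted. */
LIBNAME mysaslib CLEAR;
```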

 

Temporary versus Permanent SAS Data Sets

 

Temporary SAS data sets: Exist only during the current SAS session and are erased when the session is closed.  They are stored in the WORK data library, which is created at the start of the SAS session.  They are referenced by a one-level name (name) or by WORK.name.  They must be recreated every time you start SAS, but they take up no hard-drive space once the session ends.

 

Permanent SAS data sets: SAS data set files saved on your computer system.  They can be reopened in future SAS sessions.  They are referenced by a two-level name (libref.name).  They take up hard-drive space until you delete them.

 

DATA statement                    Libref              Member name            Type

DATA distance;                       WORK            distance                        temporary

DATA mysaslib.hgt_wgt;          mysaslib           hgt_wgt                        permanent

DATA WORK.distance;          WORK            distance                        temporary

 

Creating a permanent SAS data set

LIBNAME mysaslib 'd:\temp';

DATA mysaslib.hgt_wgt;
            Id = 53;
            Name = 'Susie';
            Height = 41;
            Weight = 42;
RUN;

 

 

Reading a permanent SAS data set

 

PROC PRINT DATA = sashelp.air;

run;

 

Entering the data with the View table Window

 

One method of getting your data into SAS is using the Table editor. To open the SAS table editor select Table editor from the Tools pull down menu.

 

 

 

Before entering your data, you should create column (variable) attributes.  Open the Column Attributes for a column by right-clicking on the letter at the top of that column and then choosing Column Attributes.

 

        

 

 

After entering the Column Attributes for all columns (variables), enter your data in the cells of the table.  To save the table (data set), select Save As from the File pull-down menu.  The following window will appear:

 

 

Either select one of the current SAS data libraries or create a new one by clicking on the Create New Library icon (filing cabinet).  Then enter the member name (data set name) and click on Save.  If you save the table in the WORK library, it will be temporary.

 

           

Adding rows to an existing table using the table editor

 

To add observations to an existing table, open the table in the table editor and choose Edit Mode from the Edit pull-down menu.  Then choose Add Row, enter the data in the new row, and choose Commit New Row.

 

 

Options under the Data pull down menu

 

Where - use to subset table in the table editor

Where clear – undoes the subset table and restores the original table

Hide/Unhide – hides and unhides columns

Sort – sorts data in ascending or descending order by columns

            Options: Nonduplicate – deletes duplicate rows where every column of the rows matches

                          Nodupkeys – deletes duplicate rows where all the BY columns match

Hold – Holds selected columns in place as you scroll to the right in the table editor
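
The Where and Sort menu options have counterparts in the SAS language.  The sketch below is an illustration rather than a description of what the table editor runs internally; it assumes the sample data set sashelp.class, and the output data set name is made up.

```sas
/* Where: show only the rows that satisfy a condition. */
PROC PRINT DATA = sashelp.class;
    WHERE Age >= 14;
RUN;

/* Sort with Nodupkeys: keep one row for each value of the BY column. */
/* OUT= writes the result to a new data set instead of overwriting    */
/* the input.                                                         */
PROC SORT DATA = sashelp.class OUT = work.one_per_age NODUPKEY;
    BY Age;
RUN;

PROC PRINT DATA = work.one_per_age;
RUN;
```

Writing BY DESCENDING Age; instead would sort from the largest value to the smallest.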

 

Examples:  sashelp.prdsal3

 

 

Listing the contents of a SAS data set

 

PROC CONTENTS DATA = data-set; run;

 

Example:

 

PROC CONTENTS DATA = sashelp.buy;

run;

 

 


SAS expressions and SAS functions

 

You can create and redefine variables with assignment statements using this basic form:

 

Variable = expression;

 

On the left side of the equation is a variable name, either new or old. The right side of the equation may be a constant, another variable, or a mathematical expression.

             * = multiplication

            ** = exponentiation

             / = division

             - = subtraction

             + = addition

 

 

Assignment Statement                                               Type of expression   

 

Qwerty = 10;                            numeric constant

Qwerty = 'ten';                         character constant

Qwerty = Oldvar;                        a variable

Qwerty = Oldvar*10-20;                  mathematical expression

 

 

 

Functions perform a calculation on, or a transformation of, the arguments given in parentheses following the function name. General form:

 

Function-name (argument, argument …)

 

Example:

 

DATA test;
            w = 2;
            x = 2;
            y = 4*x*w;
            z = sqrt(y);
RUN;

 

PROC PRINT DATA = test; run;

 

 

For a complete list of SAS Functions check the SAS Online Documentation or the SAS Language Reference Manual.
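
A few other commonly used functions are sketched below.  The data set name, variable names, and values are illustrative only; the functions themselves (ROUND, INT, UPCASE, SUBSTR, MEAN, SUM) are standard SAS functions.

```sas
DATA work.func_demo;
    x      = 10.567;
    name   = 'susie';
    rounded = ROUND(x, 0.1);       /* round to the nearest tenth         */
    whole   = INT(x);              /* integer part of x                  */
    upper   = UPCASE(name);        /* convert to uppercase               */
    first3  = SUBSTR(name, 1, 3);  /* 3 characters starting at position 1 */
    avg     = MEAN(2, 4, 9);       /* mean of the arguments              */
    total   = SUM(2, 4, 9);        /* sum of the arguments               */
RUN;

PROC PRINT DATA = work.func_demo;
RUN;
```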

 

SAS Informats, Formats and SAS Dates

 

A Format is an instruction that the SAS system uses to write data values. SAS will decide which format is best unless you specify a format for a particular variable.

 

There are three general types of formats:

Character                   Numeric                      Date

$formatw.                   formatw.d                    formatw.

 

Note: You can create your own formats using PROC FORMAT to define them.

 

Example:

                        DATA commas;

                                    X=12568933;

                                    Format X comma12.;

                        run;

                        PROC PRINT DATA =commas; run;
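
The note above mentions PROC FORMAT.  A minimal sketch of a user-defined format follows, assuming the sample data set sashelp.class; the format name agegrp and its labels are made up for illustration.

```sas
/* Define a numeric format that displays ages as group labels. */
PROC FORMAT;
    VALUE agegrp  LOW -< 13  = 'Child'     /* ages below 13   */
                  13 - HIGH  = 'Teen';     /* ages 13 and up  */
RUN;

/* Apply the format; the stored values of Age are unchanged. */
PROC PRINT DATA = sashelp.class;
    FORMAT Age agegrp.;
RUN;
```

A format changes only how values are displayed, so the same variable can still be used in calculations.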

 

An Informat is an instruction that the SAS system uses to read data values into a variable.  Informats are useful any time you have nonstandard data.  (Standard numeric data contain only numbers, decimal points, minus signs, and E for scientific notation.)  Numbers with embedded commas or dollar signs are examples of nonstandard data.

 

There are three general types of Informats:

 

Character                   Numeric                      Date

$informatw.                 informatw.d                  informatw.

 

 

Example:

 

DATA dollar;
            INPUT MONEY comma10.;
CARDS;
$1,000,000
;
RUN;

PROC PRINT DATA = dollar;
RUN;

 


A SAS DATE variable is a numeric variable whose value is the number of days since January 1, 1960.

 

 

Date                                                    SAS date value

January 1, 1959                                    -365

January 1, 1960                                       0

January 1, 1961                                    366

January 1, 2001                                    14976
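
The values in the table can be checked with SAS date constants.  The sketch below uses DATA _NULL_ and the PUT statement, which are not covered in this handout; they run a DATA step without creating a data set and write values to the LOG window.  The variable names are illustrative.

```sas
DATA _NULL_;
    d1959 = '01JAN1959'd;
    d1960 = '01JAN1960'd;
    d1961 = '01JAN1961'd;
    d2001 = '01JAN2001'd;
    /* Unformatted: writes the underlying day counts to the log, */
    /* i.e. -365, 0, 366, and 14976.                              */
    PUT d1959= d1960= d1961= d2001=;
    /* With a date format the same number is written as a date. */
    PUT d2001= DATE9.;
RUN;
```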

 

 

For a variable to become a SAS date variable, it has to be read in with a SAS date informat or assigned the value of a SAS date constant.

 

 

SAS date functions can be used to perform a number of handy operations.

 

Example:

 

DATA DATE;

            INPUT JULY_4 mmddyy10.;

CARDS;

07-04-2006

;

RUN;

 

DATA DATE;

SET DATE;

DAYS_JULY4=JULY_4-TODAY();

CHRISTMAS='25DEC2006'D;

DAYS_C=CHRISTMAS-TODAY();

DAYS_NY=DAYS_C+7;

NEWYEAR=CHRISTMAS+7;

FORMAT CHRISTMAS JULY_4 NEWYEAR ddmmyy10.;

run;

 

PROC PRINT DATA=DATE; RUN;

 

 

We could also use the following line instead to print the dates in words:

FORMAT CHRISTMAS JULY_4 WORDDATE.;
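
Beyond TODAY(), SAS has functions for taking dates apart, building them, and counting intervals.  The sketch below is illustrative; the data set and variable names are made up, while YEAR, MONTH, DAY, WEEKDAY, MDY, and INTCK are standard SAS functions.

```sas
DATA work.date_funcs;
    Christmas = '25DEC2006'd;
    Yr    = YEAR(Christmas);       /* 2006                             */
    Mo    = MONTH(Christmas);      /* 12                               */
    Dy    = DAY(Christmas);        /* 25                               */
    Wkday = WEEKDAY(Christmas);    /* 1 = Sunday, ..., 7 = Saturday    */
    NewYear = MDY(1, 1, 2007);     /* build a date from month, day, year */
    Days_Between = INTCK('day', Christmas, NewYear);  /* 7 days apart  */
    FORMAT Christmas NewYear WORDDATE.;
RUN;

PROC PRINT DATA = work.date_funcs;
RUN;
```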

 

 

For a complete list of SAS Formats and Informats check the SAS Online Documentation or the SAS Language Reference Manual.

 

 

SAS Data Set Options

 

KEEP = variable-list              tells SAS which variables to keep.

DROP = variable-list              tells SAS which variables to drop.

RENAME = (oldvar=newvar)          tells SAS to rename certain variables.

FIRSTOBS = n                      tells SAS to start reading at observation n.

OBS = n                           tells SAS to stop reading at observation n.

IN = new-var-name                 creates a temporary variable for tracking whether that data set contributed to the current observation.

 

Example

 

DATA weight;

            SET sashelp.class(rename=(name=firstname) drop=height firstobs=2 obs=5);

RUN;

 

 

PROC PRINT DATA = weight;

RUN;
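
The IN= option is not used in the example above.  The sketch below shows one common use: keeping only the observations found in both of two merged data sets.  The MERGE and BY statements are not covered in this handout, and the data set names, variable names, and values are illustrative.  DATALINES is a synonym for CARDS.

```sas
DATA work.heights;
    INPUT Id Height;
    DATALINES;
53 42
54 46
55 40
;
RUN;

DATA work.weights;
    INPUT Id Weight;
    DATALINES;
54 55
55 35
56 .
;
RUN;

/* in_h and in_w are temporary 0/1 flags; they are not saved in the */
/* output data set. Both inputs are already in order by Id.         */
DATA work.both;
    MERGE work.heights(IN = in_h) work.weights(IN = in_w);
    BY Id;
    IF in_h AND in_w;     /* keep only Ids present in both data sets */
RUN;

PROC PRINT DATA = work.both;
RUN;
```

Only Ids 54 and 55 appear in both inputs, so those are the two observations that remain.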

 

 

Import/Export wizard

 

 

 

The Import/Export wizard is a graphical user interface (GUI) that can be used to convert other files, such as Excel, Access, dBase, Lotus spreadsheet, and delimited files, into SAS data sets and vice versa.

 

The Import/Export wizard can be opened from the File pull-down menu.

 

Example: sashelp.zipcode
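
The same conversions can be written as a program with PROC EXPORT and PROC IMPORT.  The sketch below is a hedged illustration for a comma-delimited file: the folder d:\temp, the file name, and the output data set name are assumptions, and other file types (such as Excel) need a different DBMS= value.

```sas
/* Assumption: the folder d:\temp already exists. */

/* Write a SAS data set out as a comma-delimited file. */
PROC EXPORT DATA = sashelp.class
            OUTFILE = 'd:\temp\class.csv'
            DBMS = CSV
            REPLACE;              /* overwrite the file if it exists */
RUN;

/* Read the delimited file back into a temporary SAS data set. */
PROC IMPORT DATAFILE = 'd:\temp\class.csv'
            OUT = work.class_csv
            DBMS = CSV
            REPLACE;              /* overwrite the data set if it exists */
    GETNAMES = YES;               /* first row holds the variable names  */
RUN;
```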

 

ODS (Output Delivery System)

 

The Output Delivery System, which was new with SAS version 7, can create the following types of output files: HTML, PDF, and RTF.

 

Examples:

ODS HTML BODY='d:\temp\class.html';
PROC PRINT DATA=sashelp.class;
RUN;
ODS HTML CLOSE;

ODS PDF FILE='d:\temp\class.pdf';
PROC PRINT DATA=sashelp.class;
RUN;
ODS PDF CLOSE;

ODS RTF FILE='d:\temp\class.rtf';
PROC PRINT DATA=sashelp.class;
RUN;
ODS RTF CLOSE;

 

 

More advanced programming techniques:

            SAS Macro Language

            SAS SQL (Structured Query Language) Procedure