GCG Program Basics

The GCG programs are well documented in the online help, which you can read by typing genhelp at the command prompt.  A few introductory words may still help demystify the programs in general.

The GCG programs are a set of programs that can be purchased and installed on a UNIX system.  They are NOT part of UNIX; they are programs that run on UNIX.

Two questions about the basic GCG programs trouble new users:  How do I run them?  How do I specify options?

GCG programs can be run in two general ways:

    Interactively
    Command Line

That is, you can start most GCG programs interactively, simply by typing their name and answering the prompts.  Alternatively, you can supply parameters on the command line by adding "flags" to the initial command.  A flag is specified by preceding it with a "-" sign.  For example,
    reverse -beg=5 -end=57 staben.seq
will reverse the sequence in staben.seq, beginning at nucleotide 5 and ending at nucleotide 57.  You will still be prompted for the output file name (staben.rev is the suggested default).

If you do not add flags, the program will prompt you for input, although it may not prompt for every possible parameter.  Usually, the program suggests a "default" for each prompted input.  For each program, genhelp shows a typical session and a list of its flags.

One source of confusion in genhelp is that the flags are all listed in a format such as:
    -MISmatch=1
To specify this flag, you need to type only the letters shown in capitals, followed by the equals sign, if present, and the value.  For example,
    findpatterns -mis=1 mydna.seq
initiates a search of the file mydna.seq, allowing up to 1 mismatch, for a pattern that you enter from the keyboard when prompted.
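
The abbreviation rule can be illustrated with a minimal Python sketch.  It assumes that a genhelp entry is always a "-" sign, a run of capital letters, optional lowercase letters, and an optional "=value".  The function name minimal_flag is hypothetical and is not part of GCG; the flag -EXAmple is made up purely for illustration.

```python
import re


def minimal_flag(genhelp_flag: str) -> str:
    """Return the shortest form of a flag written in genhelp style.

    Only the capitalized letters are required, so "-MISmatch=1"
    becomes "-mis=1". Any "=value" part is kept unchanged.
    """
    # Assumed shape: "-", capitals (required), lowercase (optional), "=value" (optional).
    match = re.fullmatch(r"-([A-Z]+)[a-z]*(=.*)?", genhelp_flag)
    if match is None:
        raise ValueError(f"not in the assumed genhelp format: {genhelp_flag!r}")
    required = match.group(1)
    value = match.group(2) or ""
    # Lowercase output follows the lowercase examples shown in the text.
    return "-" + required.lower() + value


print(minimal_flag("-MISmatch=1"))  # flag taken from the text
print(minimal_flag("-EXAmple"))     # made-up flag with no "=value" part
```

The sketch only rewrites the text of a flag; it does not check whether a given program actually accepts that flag, which is what genhelp is for.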

A very useful general option is:
   -che    prints a summary of the command line parameters
This flag, like a few others, is unusual in that it takes no = sign and no value!
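
The two flag shapes described so far, a flag with a value (-beg=5) and a bare switch (-che), can be assembled into a command line with a short Python sketch.  The helper name gcg_command is hypothetical, and the sketch only builds the command string; it does not run any GCG program.  Following the examples above, it places the flags before the input file name.

```python
import shlex


def gcg_command(program, input_file, **flags):
    """Build a GCG-style command line as a single string.

    A flag whose value is True is written as a bare switch (-che);
    any other value is written as -name=value (-beg=5).
    """
    args = [program]
    for name, value in flags.items():
        if value is True:
            args.append(f"-{name}")
        else:
            args.append(f"-{name}={value}")
    args.append(input_file)
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


print(gcg_command("reverse", "staben.seq", beg=5, end=57))
print(gcg_command("findpatterns", "mydna.seq", mis=1, che=True))
```

Building the string first makes it easy to check the command by eye before typing or pasting it at the UNIX prompt.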

GCG programs generally run immediately after the last necessary parameter has been entered.  About half of the programs show you output during the run; the other half do not.  For most programs this behavior can be set from the command line, but the default is usually what you want.

The output of your program will generally be stored in a file.   Output files are of two types:  text and graphics.  Text files are saved in your UNIX directory, can be viewed with the command more outputfilename, and can be downloaded.

Graphics files are problematic.  You will never see a meaningful display of graphics output during a Telnet session.  A file may not even be generated unless you redirect the output, as described in GCG graphics!  Viewing such files is covered in a separate topic.

Most GCG programs operate on either DNA sequences (as .seq files) or polypeptide sequences (.pep files).  Some GCG programs require additional data, such as a genetic code table, an alignment scoring matrix, or a list of restriction enzyme cut sites.  For many such programs, you can modify those data or specify alternate data sets to alter the program's operation.  The fetch program can find these data files, which all have names like *.dat, so that you can view them and see the options.  I will add a section on data files later.

University of Kentucky || Morgan School of Biological Sciences || NSF-CCD Support || Chuck Staben, copyright reserved || 09/21/98