LAB6
The objectives of this lab are:
- Learn to use basic BLAST tools to identify similar polypeptides and DNA
  sequences.
- Align 2 sequences to one another and evaluate the statistical
  significance of such alignments.
- Align multiple sequences (polypeptides).
- Interpret multiple alignments in terms of protein functional regions.
You have completed the DNA sequence of a segment of genomic DNA from Drosophila
melanogaster, a fruit fly. This DNA sequence is presented to you in a standard
DNA sequence format in a file titled lab6.tfa. This genomic
region encodes a polypeptide critical to basal transcription and therefore to cell
function. Please perform the searches and run the programs described below. Do
each of these in the EASIEST way possible, with no restrictions unless otherwise stated.
1. Determine the regions encoding the polypeptide product from this gene. Prepare
a 1-line description that defines the coding sequence (e.g., CDS 5-77, 137-737). This
will be submitted.
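If you want to sanity-check your 1-line CDS description before submitting it, a small script can help. The sketch below is only an illustration, not part of the required workflow: the function names `parse_cds`, `coding_length`, and `splice` are made up for this example, and it assumes 1-based inclusive coordinates on the forward strand.

```python
import re


def parse_cds(description):
    """Parse a line such as 'CDS 5-77, 137-737' into (start, end) pairs.

    Assumption: coordinates are 1-based, inclusive, and on the forward strand.
    """
    segments = [(int(a), int(b)) for a, b in re.findall(r"(\d+)-(\d+)", description)]
    if not segments:
        raise ValueError("no start-end ranges found in description")
    for start, end in segments:
        if start > end:
            raise ValueError(f"segment {start}-{end} runs backwards")
    return segments


def coding_length(segments):
    """Total number of nucleotides covered by all segments."""
    return sum(end - start + 1 for start, end in segments)


def splice(sequence, segments):
    """Join the segments of a genomic sequence into one coding sequence."""
    return "".join(sequence[start - 1:end] for start, end in segments)


if __name__ == "__main__":
    segs = parse_cds("CDS 5-77, 137-737")
    total = coding_length(segs)
    # A complete coding sequence (start codon through stop codon) should
    # have a length that is a multiple of 3.
    print(segs, total, total % 3 == 0)
```

Note that the format example in step 1 only illustrates the layout of the line: its segments add up to 674 nucleotides, which is not a multiple of 3, so a real answer should not look exactly like it.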
2. Perform a search, using an appropriate BLAST tool, to find 100 polypeptides
similar in sequence to the product of this gene that are contained in the nr polypeptide
database. I recommend that you save the result of this search to disk as you will
need it for several questions. Submit the gi number of the
highest-scoring polypeptide that IS NOT the given polypeptide.
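If you saved your search result to disk, picking out the answer for step 2 can be scripted. The sketch below is a hedged illustration only: it assumes you have already extracted each hit as a (definition line, score) pair, that the definition lines carry identifiers in the `gi|number|` style, and that you know the gi number of the given polypeptide itself. The function names and the sample hits are hypothetical.

```python
import re


def gi_number(defline):
    """Return the gi number embedded in a definition line, or None if absent."""
    match = re.search(r"gi\|(\d+)", defline)
    return int(match.group(1)) if match else None


def best_non_self_hit(hits, query_gi):
    """Return the gi number of the highest-scoring hit that is not the query.

    hits: iterable of (defline, score) pairs (assumed already extracted
    from the saved search result).
    """
    candidates = []
    for defline, score in hits:
        gi = gi_number(defline)
        if gi is not None and gi != query_gi:
            candidates.append((score, gi))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    # Hypothetical hits, for illustration only.
    example_hits = [
        ("gi|111|example query protein", 500.0),
        ("gi|222|example homolog A", 450.0),
        ("gi|333|example homolog B", 300.0),
    ]
    print(best_non_self_hit(example_hits, query_gi=111))
```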
3. One high-scoring polypeptide that you SHOULD find in this alignment is from Saccharomyces
cerevisiae. For this polypeptide, perform the following using GCG
programs:
a. Align the Drosophila and Saccharomyces
polypeptides along their lengths. EVALUATE the statistical significance of this
alignment. Submit a 1-line "decision" on whether this alignment is
significant or not and the basis.
b. Align the DNAs of the ORFs encoding these
polypeptides along their lengths. EVALUATE the statistical significance of this
alignment. Submit a 1-line "decision" on whether this alignment is
significant or not and the basis.
c. Find the single segment of highest similarity between these
two ORFs AND EVALUATE the statistical significance of this alignment. Compare the
length of this segment to that of the alignment in 3b, above. Submit a
1-line "decision" on whether this alignment is significant or not and the basis.
Include a very brief, numerical comment on the relative lengths of the alignments in parts 3b and 3c.
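One common basis for a significance "decision" is to compare the score of the real alignment with the scores obtained after shuffling one of the sequences. The sketch below illustrates that idea only. It is not what the GCG programs compute: it assumes a toy scoring scheme (match +1, mismatch -1, linear gap penalty -2) instead of a real scoring matrix, and the function names are made up for this example. Use the GCG output for your actual answer.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffle_z_score(a, b, trials=100, seed=0):
    """Score a vs. b, then a vs. shuffled copies of b.

    Returns (real_score, mean_shuffled, sd_shuffled, z), where z is the
    number of standard deviations the real score lies above the shuffled
    mean (None if the shuffled scores do not vary).
    """
    rng = random.Random(seed)
    real = global_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(trials):
        rng.shuffle(letters)  # keeps composition and length, destroys order
        shuffled_scores.append(global_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    z = (real - mean) / sd if sd > 0 else None
    return real, mean, sd, z


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(shuffle_z_score("MKTAYIAKQRQISFVKSHFSRQ", "MKTAYLAKQRQLSFVKSHFTRQ"))
```

Shuffling preserves the length and composition of the sequence, so a real score far above the shuffled scores suggests that the order of the residues, not just their composition, is responsible for the similarity.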
4. Use the multiple alignment package of your choice (specify which you
used) to align the similar polypeptides that you should find from:
- Drosophila melanogaster
- Homo sapiens
- Xenopus laevis
- Arabidopsis thaliana
- Saccharomyces cerevisiae
a. Save this alignment in some text
format. You will submit this alignment.
Now, add the archaebacterial sequence from Sulfolobus shibatae to your
alignment.
b. Save this alignment in text form for
submission.
5. The 3D structure of the human protein has been determined. Use RasMol to
examine the structure of the human polypeptide.
In RasMol, represent the protein in backbone
and the DNA in spacefill modes. Generate 2 different figures:
a. A .gif file, named yourname6a.gif, that shows those
regions of the polypeptide backbone conserved in the eukarya from part 4a. (Colored
distinctively)
b. A .gif file, named yourname6b.gif, that shows those
regions of the polypeptide conserved in ALL of the sequences (even
Archaea) from part 4b.
c. You should make a 1 or 2 sentence conclusion about the differences
that you observe and their locations in the polypeptides. Include a 1-sentence
statement that includes an example of a NONCONSERVATIVE amino acid substitution in one of
the conserved regions and the effect that you expect this would have on the mutant
polypeptide.
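To decide which residues to color in RasMol, you need the alignment columns that are identical in every sequence. The sketch below shows one way to find them. It is an illustration under assumptions: the rows are toy sequences, "conserved" is taken to mean identical in every row with no gaps, and the function names are hypothetical. Residue numbers in the structure file may not start at 1, so check the numbering before using the result in a RasMol `select` command.

```python
def conserved_columns(rows):
    """Return 1-based alignment columns where all rows share one residue."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all alignment rows must have the same length")
    columns = []
    for index, column in enumerate(zip(*rows), start=1):
        residues = {c.upper() for c in column}
        if len(residues) == 1 and "-" not in residues:
            columns.append(index)
    return columns


def to_ranges(columns):
    """Collapse sorted column numbers into (first, last) runs."""
    ranges = []
    for c in columns:
        if ranges and c == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], c)
        else:
            ranges.append((c, c))
    return ranges


def columns_to_residues(reference_row, columns, first_residue=1):
    """Map alignment columns to residue numbers of one (gapped) row.

    first_residue is the number of the first residue in that row
    (an assumption you must check against the structure file).
    """
    mapping = {}
    residue = first_residue - 1
    for index, c in enumerate(reference_row, start=1):
        if c != "-":
            residue += 1
            mapping[index] = residue
    return [mapping[c] for c in columns if c in mapping]


if __name__ == "__main__":
    # Toy rows, for illustration only.
    eukarya = ["TYGACFLLLVVCAG", "TYGACFLAVCACAG", "TYGACFLALSATAG"]
    with_archaea = eukarya + ["TYPPCFLLLVVCAG"]
    print(to_ranges(conserved_columns(eukarya)))       # [(1, 7), (13, 14)]
    print(to_ranges(conserved_columns(with_archaea)))  # [(1, 2), (5, 7), (13, 14)]
```

Running it on both alignments (4a and 4b) gives the two sets of regions to color in the two figures, and the columns that drop out when the archaeal sequence is added are the ones to discuss in your conclusion.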
What does the ideal submission look like?
1. CDS 45-135, 800-1100
2. gi=123456
3. a. Alignment is NOT significant because (and this is NOT a good reason) the aligned
regions are shorter than 10 amino acids
b. Alignment IS significant because (and this is NOT a good
reason) the aligned regions are divisible by 3.
c. Alignment IS NOT significant because (and this is NOT a good
reason) I did it on Tuesday, which was an odd-numbered date.
4. Each of these outputs will be about 2 pages of text.
I used INSPIREALIGN, a method in which the alignment magically appears projected on the
inner surface of one's eyelids.
a.
Dm TYGACFlllvvcAG
Hs TYGACFlavcacAG
Xl TYGACFvavcacAG
Hs TYGACFlavcatAG
At TYGACFlavcatAG
Sc TYGACFlalsatAG
b.
Dm TYgaCFlllvvcAG
Hs TYgaCFlavcacAG
Xl TYgaCFvavcacAG
Hs TYgaCFlavcatAG
At TYgaCFlavcatAG
Sc TYgaCFlalsatAG
Ss TYppCFlllvvcAG
5. Two beautiful pictures and one clear statement
A.
B.
C. The more diverse set of polypeptides conserves only a subset of
the amino acids conserved in eukarya because this region is most likely the part of the
molecule that binds to frizzem-frazzem, the only substrate in common with all 7
polypeptides. Changing ARG273 to proline would cause a kink in the polypeptide that
would not allow it to bind frizzem-frazzem any more.