BCH 401G Lecture Notes: Transcription (RNA Synthesis)
Central Dogma of Molecular Biology:
How does the sequence of a strand of DNA correspond to the
amino acid sequence of a protein? This concept is explained
by the central dogma of molecular biology, which states that:
Flow of genetic information in normal cells:
(replication) DNA ---transcription---> RNA ---translation---> Protein
Why would a cell want to have an intermediate between DNA
and the protein it encodes?
The DNA can stay pristine and protected, away from the caustic chemistry of the cytoplasm.
Gene information can be amplified by having many copies of an RNA made by one copy of DNA.
Regulation of gene expression can be effected by having specific
controls at each element of the pathway between DNA and proteins.
The more elements there are in the pathway, the more opportunities
there are to control it in different circumstances.
We will now look at the transcription of DNA to RNA.
-We will mainly discuss the process in bacteria.
-Transcription in eukaryotic cells is similar - will discuss the differences later.
Transcription: Information stored in the sequence
of DNA is read. Mechanistically, transcription is similar to DNA
replication: it uses nucleoside triphosphates and template-directed
growth in the 5' to 3' direction.
2 major differences:
1) Only one DNA strand serves as the template (a single-stranded
polyribonucleotide chain is synthesized).
2) Only a small fraction of the total genetic potential
of an organism is used in any one cell.
The reaction is thermodynamically favorable: Hydrolysis of the
terminal phosphoanhydride bond of a nucleoside triphosphate yields
13 kJ/mol more energy than is necessary for formation of a phosphodiester
linkage within the RNA backbone (recall our discussion
in earlier lectures).
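To get a feel for what a 13 kJ/mol surplus means, here is a minimal sketch that converts it into an equilibrium constant using K = exp(-dG/RT). It assumes the 13 kJ/mol can be treated as a standard free-energy change and that the temperature is 25 C; neither assumption comes from the notes.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def equilibrium_constant(delta_g_kj_per_mol, temperature_k=298.15):
    """Return K = exp(-dG / RT) for a free-energy change given in kJ/mol."""
    return math.exp(-delta_g_kj_per_mol * 1000.0 / (R * temperature_k))

# Net free energy per nucleotide added: 13 kJ/mol is released (value from the notes).
# Treating it as a standard free-energy change at 25 C is an assumption.
net_delta_g = -13.0
k_eq = equilibrium_constant(net_delta_g)
print(f"K for adding one nucleotide is roughly {k_eq:.0f}")
```

Under these assumptions K comes out in the low hundreds, so each addition step lies well on the side of the extended chain.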
Classes of RNA:
1) Messenger RNA (mRNA): carries information which will
be translated into protein sequence by ribosomes. 3% of total
RNA in bacteria is mRNA.
2) Ribosomal RNA (rRNA): part of the ribosome which is
involved in the translation of mRNA to protein. 83% of total bacterial RNA.
3) Transfer RNA (tRNA): involved in the translation of
mRNA into protein. 14% of total bacterial RNA.
4) Small RNAs: involved in the modification of some RNAs
after transcription. <1% of total RNA.
Define a new term: GENE
A Gene is any portion of the DNA sequence which is transcribed
into RNA. Genes can be transcribed into any of
the classes of RNA that we just discussed.
Terminology and numbering of Gene Sequences:
1). DNA is indicated in a 5' to 3' direction along its top
(or coding) strand and 3' to 5' along the bottom (TEMPLATE
or noncoding) strand.
If this DNA sequence is capable of being transcribed to RNA, the
sequence would be termed a "gene" and the RNA would
be written as the 5' to 3' TOP or CODING strand sequence.
Coding Strand: Identical to the RNA transcript.
Template Strand: Serves as the template for making
the RNA transcript and is complementary to that of the RNA transcript.
2). Numbering system
Transcription Start Site. The nucleotide in the DNA coding strand
corresponding to the first nucleotide of the transcribed RNA is
designated +1.
Nucleotides to the right of the start site (+1), toward the 3' end
on the coding strand, are indicated by increasing positive numbers
(+2, +3, +4, +5, etc.).
Nucleotide directly to the left of the +1 nucleotide (start site)
is defined as -1, and the next is -2, -3, etc. There is no zero
between -1 and +1.
3). Promoter Sequences: Each gene has sequences
which are important for controlling its expression. These
are termed "promoter sequences."
Usually found at the 5' end of the gene, relative to the coding strand.
In the numbering system, these promoter sequences have negative numbers.
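The strand terminology and the numbering rule above can be sketched in a few lines of Python. The example sequence, the chosen start index, and the function names are hypothetical and only serve as illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_strand(coding_5to3):
    """Template (bottom) strand written 3' to 5', aligned under the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3)

def transcript(coding_5to3, start_index):
    """RNA reads like the coding strand from the start site onward, with U in place of T."""
    return coding_5to3[start_index:].replace("T", "U")

def gene_position(index, start_index):
    """Convert a 0-based string index to gene numbering: start site is +1, there is no 0."""
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset

coding = "TATAATGCGCCAATGGCT"   # hypothetical coding strand, 5' to 3'
start = 12                      # hypothetical index of the transcription start site

print("coding   5'", coding, "3'")
print("template 3'", template_strand(coding), "5'")
print("RNA      5'", transcript(coding, start), "3'")
print([gene_position(i, start) for i in range(start - 3, start + 3)])
```

The last line prints the positions on either side of the start site, which jump straight from -1 to +1.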
Enzymology of RNA Synthesis: RNA POLYMERASE
A single RNA polymerase functions in bacteria.
In eukaryotes, three distinct RNA polymerases are responsible
for synthesis, one for each class of RNA: rRNA, mRNA, and small RNAs.
DNA polymerase and RNA polymerase Catalyze Similar Reactions:
Vmax DNA pol. III: 500-1000 nucleotides/sec
Vmax RNA pol.: 50 nucleotides/sec
10 molecules of DNA pol. per cell, 3000 molecules of RNA polymerase per cell
(~50% involved in making RNA at any one time). DNA replication
is fast but initiates at a few sites; RNA transcription is slow
but occurs at many sites of initiation, and so RNA accumulates to high levels.
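A quick back-of-the-envelope calculation with the numbers above shows why the slower enzyme still produces more polymer per cell. This is only a sketch: it assumes every DNA polymerase molecule is working at Vmax, which is an upper bound and not something the notes claim.

```python
DNA_POL_PER_CELL = 10
DNA_POL_RATE = (500, 1000)        # nucleotides/sec, range from the notes
RNA_POL_PER_CELL = 3000
RNA_POL_ACTIVE_FRACTION = 0.5     # ~50% transcribing at any one time
RNA_POL_RATE = 50                 # nucleotides/sec

# Upper bound for DNA synthesis: assumes all 10 molecules run at Vmax.
dna_capacity = tuple(DNA_POL_PER_CELL * rate for rate in DNA_POL_RATE)

# Total RNA synthesis from the active half of the RNA polymerase pool.
rna_capacity = RNA_POL_PER_CELL * RNA_POL_ACTIVE_FRACTION * RNA_POL_RATE

print("DNA synthesis capacity (nt/sec):", dna_capacity)
print("RNA synthesis capacity (nt/sec):", rna_capacity)
```

With these figures the cell can polymerize 5,000 to 10,000 DNA nucleotides per second at most, versus 75,000 RNA nucleotides per second.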
RNA polymerase is highly processive (like DNA pol.). So once
initiated, it will not dissociate until a specific termination
signal is received.
See Figure 27.4
Another difference is that RNA polymerase is much less accurate.
RNA Polymerase is an Oligomeric Protein:
5 separate protein subunits comprise RNA Polymerase in bacteria:
2 copies of α, β,
β', σ, and
The sigma subunit can be removed from RNA polymerase and leave
the rest of the complex intact.
Can then test the binding affinity of the entire complex and
the Core complex (lacking sigma) for general DNA and "Promoter"
DNA (which contains -10 and -35 consensus sequences).
Kassoc. values for:           Any DNA           Promoter DNA
RNA polymerase (- sigma)      1 x 10^10 M^-1    1 x 10^10 M^-1
RNA polymerase (+ sigma)      5 x 10^6 M^-1     2 x 10^11 M^-1
Sigma Factor does two things:
1). Decreases affinity of RNA polymerase for general DNA (by
4 orders of magnitude).
2). Increases affinity of RNA polymerase for promoter DNA (by
1 order of magnitude).
The function of the sigma subunit is to interact with the -10 and -35 consensus
sequences so that polymerase can bind to the promoter region of genes.
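The two effects of sigma can be checked directly from the association constants in the table. The sketch below computes the fold changes on a log10 scale; the dictionary layout and names are just one convenient way to hold the four values.

```python
import math

# Association constants (M^-1) from the table above.
K_ASSOC = {
    ("core", "any"): 1e10,
    ("core", "promoter"): 1e10,
    ("holo", "any"): 5e6,
    ("holo", "promoter"): 2e11,
}

def orders_of_magnitude(numerator, denominator):
    """log10 of the ratio between two association constants."""
    return math.log10(numerator / denominator)

# How much sigma lowers affinity for general DNA.
nonspecific_drop = orders_of_magnitude(K_ASSOC[("core", "any")], K_ASSOC[("holo", "any")])
# How much sigma raises affinity for promoter DNA.
promoter_gain = orders_of_magnitude(K_ASSOC[("holo", "promoter")], K_ASSOC[("core", "promoter")])
# Preference for promoter over general DNA, with and without sigma.
selectivity_holo = K_ASSOC[("holo", "promoter")] / K_ASSOC[("holo", "any")]
selectivity_core = K_ASSOC[("core", "promoter")] / K_ASSOC[("core", "any")]

print(f"drop for general DNA: {nonspecific_drop:.1f} orders of magnitude")
print(f"gain for promoter DNA: {promoter_gain:.1f} orders of magnitude")
print(f"promoter preference: core {selectivity_core:.0f}x, holoenzyme {selectivity_holo:.0f}x")
```

The table values give about 3.3 and 1.3 orders of magnitude, which the notes round to 4 and 1. The more telling number is the selectivity: the core enzyme cannot tell promoter from general DNA, while the holoenzyme prefers promoter DNA 40,000-fold.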
STEPS OF TRANSCRIPTION:
1). Binding of RNA polymerase to Promoter Sequences:
In E. coli there are two regions that are similar in all promoters.
One sequence is centered at -10 and the other -35 relative to
the transcriptional start site at +1.
For the -10 and -35 sequences, we can identify the nucleotides that are
usually found at each position in these promoters.
This is called a "Consensus Sequence".
For the -10 region the consensus sequence is 5' TATAAT 3', often
called the "TATA" box for this reason (also known as the Pribnow box).
For the -35 region the consensus is 5' TTGACA 3'.
The nucleotide at the transcriptional start site is almost
ALWAYS A PURINE (A or G), most often an Adenine.
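Because a consensus is a "most common base at each position" summary, real promoters usually match it only partially. A minimal sketch of scoring a sequence against the two consensus hexamers is shown below. The example promoter is made up for illustration, and counting identical positions is a simplification of how promoter strength is actually assessed.

```python
MINUS_10 = "TATAAT"
MINUS_35 = "TTGACA"

def matches(hexamer, consensus):
    """Count positions where the hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))

def best_hit(sequence, consensus):
    """Return (match_count, index) of the best-matching window; earliest wins ties."""
    seq = sequence.upper()
    width = len(consensus)
    best = (-1, -1)
    for i in range(len(seq) - width + 1):
        score = matches(seq[i:i + width], consensus)
        if score > best[0]:
            best = (score, i)
    return best

# Hypothetical coding-strand sequence: a perfect -35, a 16-nt spacer, a perfect -10.
promoter = "GC" + "TTGACA" + "ATTAATCATCGGCTCG" + "TATAAT" + "GTGTGGA"

print("-35 best hit:", best_hit(promoter, MINUS_35))
print("-10 best hit:", best_hit(promoter, MINUS_10))
print("TATGAT vs consensus:", matches("TATGAT", MINUS_10), "of 6")
```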
Promoter recognition is a critical step in transcription, for
regulation as well as mechanism. This is because promoter recognition
is a rate-limiting step for transcription. Because all genes
are transcribed by the same protein complex, differences in promoter
structure must be largely responsible for differences in frequency
of initiation (from as rapid as once per 10 sec to once per generation, 30-60 min).
SEE Figure 27.7
How does RNA polymerase find promoter DNA sequences?
RNA polymerase binds to DNA at random sites and moves quickly
along the DNA while the sigma factor scans for promoter sequences.
Once a promoter is reached, the sigma subunit binds the promoter sequences
with high affinity and prevents the polymerase from scanning any further.
Why use a scanning mechanism? Because it is much faster than
a random association/dissociation search, which is diffusion controlled
and therefore a second-order reaction (maximum rate 10^8 M^-1 s^-1).
The scanning scheme is essentially first order and has a rate
constant of 10^10 M^-1 s^-1. This is two orders of magnitude faster
than a bind/release search.
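The comparison between the two search strategies is a one-line calculation. The sketch below simply takes the two rate constants quoted above at face value, in the units given in the notes.

```python
import math

DIFFUSION_LIMITED_RATE = 1e8   # M^-1 s^-1, bind/release search (from the notes)
SCANNING_RATE = 1e10           # M^-1 s^-1, scanning search (from the notes)

speedup = SCANNING_RATE / DIFFUSION_LIMITED_RATE
orders = math.log10(speedup)

print(f"scanning is {speedup:.0f}x faster ({orders:.0f} orders of magnitude)")
```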
2). Initiation of Transcription.
See Figure 27.6
A). RNA polymerase associates with promoter sequences near
the +1 Transcription start site.
This is called a "CLOSED PROMOTER COMPLEX" because
the DNA at the Transcription start site is still double stranded.
See Figure 27.12 and 27.6
B). Polymerase then unwinds the DNA at the Transcription start
site to make it single-stranded.
This complex is termed the "Open Complex"
because DNA is unwound.
17 base-pairs of DNA are unwound, forming a "Transcription Bubble".
RNA polymerase now starts to synthesize the RNA transcript.
RNA polymerase has two binding sites for ribonucleoside triphosphates.
The FIRST is used during elongation and binds all 4 common
ribonucleoside triphosphates with a half-saturating concentration
of 10 µM.
The SECOND, used for initiation, binds ATP and GTP preferentially,
with a half-saturating concentration of 100 µM. Thus, most RNAs have a purine
at the 5' end.
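The tenfold difference between the two half-saturating concentrations means the initiation site needs far more nucleotide to fill than the elongation site. The sketch below assumes simple hyperbolic (single-site) binding, which the notes do not state, and uses the two half-saturating values quoted above (10 and 100) with concentrations in the same unit.

```python
def fractional_saturation(concentration, half_saturating):
    """Occupancy of a site under simple hyperbolic binding (an assumption)."""
    return concentration / (concentration + half_saturating)

ELONGATION_HALF_SAT = 10.0    # elongation site, value from the notes
INITIATION_HALF_SAT = 100.0   # initiation site, value from the notes

# Concentrations are in the same unit as the half-saturating values.
for conc in (10.0, 100.0, 1000.0):
    elong = fractional_saturation(conc, ELONGATION_HALF_SAT)
    init = fractional_saturation(conc, INITIATION_HALF_SAT)
    print(f"[NTP] = {conc:6.0f}: elongation site {elong:.2f}, initiation site {init:.2f}")
```

At a concentration that half-fills the elongation site, the initiation site is less than 10% occupied under this model.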
Chain growth begins with binding of the template-specified rNTP
at the initiation site, followed by binding of the next nucleotide
at the elongation site. Next, nucleophilic attack by the 3' hydroxyl
of the first nucleotide on the α (inner)
phosphorus of the second nucleotide generates the first phosphodiester
bond and leaves an intact triphosphate at the 5' position of the
first nucleotide.
RNA polymerase moves in 5' to 3' direction (relative to coding
strand) and continues synthesizing RNA off the DNA template strand.
"Transcription Bubble" moves down the DNA helix in
concert with the new synthesis.
Within the "Bubble" only 12 nucleotides of the DNA
template strand are base-paired with the RNA strand at any time.
Called the "RNA:DNA hybrid".
See Figure 27.13
As each new nucleotide is incorporated, one base-pair of the
RNA:DNA hybrid at the other end of the transcription bubble has
to be disrupted.
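The way the RNA:DNA hybrid keeps a constant length during elongation can be pictured as a fixed-size sliding window over the growing transcript. The sketch below is a toy model only: the template sequence is hypothetical, and a deque with a maximum length of 12 stands in for the hybrid region.

```python
from collections import deque

HYBRID_LENGTH = 12  # nucleotides of RNA paired with the template (from the notes)

# Pairing of each template DNA base with the RNA nucleotide added opposite it.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def elongate(template_3to5, hybrid_length=HYBRID_LENGTH):
    """Yield (rna_so_far, hybrid_region) after each nucleotide is added."""
    rna = []
    hybrid = deque(maxlen=hybrid_length)  # when full, the oldest pair is dropped
    for base in template_3to5:
        nucleotide = TEMPLATE_TO_RNA[base]
        rna.append(nucleotide)
        hybrid.append(nucleotide)
        yield "".join(rna), "".join(hybrid)

template = "TACGGCATTAGCCGATAGCT"  # hypothetical template strand, read 3' to 5'
steps = list(elongate(template))
for rna, hybrid in steps[-3:]:
    print(f"RNA 5'-{rna}-3'   hybrid: {hybrid} ({len(hybrid)} nt)")
```

Once the transcript is longer than 12 nucleotides, every addition at the 3' end releases one nucleotide at the 5' end of the hybrid, so its length stays fixed.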
3). Termination of Transcription.
RNA transcripts are not infinitely long. There are two ways in
which termination of transcription is known to occur.
First let's talk about pausing:
RNA polymerase can pause during transcription. Pausing occurs
at sequences rich in G/C base-pairs.
Difficult to disrupt stable G/C base-pairs to allow formation
of the transcription bubble and to release RNA:DNA hybrid.
Pausing can last from 10 seconds to 30 minutes.
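Since pausing is associated with G/C-rich stretches, a sliding-window G/C count is a simple way to flag candidate regions in a sequence. This is a rough sketch: the window size and threshold are arbitrary choices for illustration, and a high G/C content alone does not predict a real pause site.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0

def gc_rich_windows(seq, window=8, threshold=0.75):
    """Start indices of windows whose G/C fraction meets the threshold.

    The window size and threshold are illustrative defaults, not values from the notes.
    """
    return [
        i for i in range(len(seq) - window + 1)
        if gc_fraction(seq[i:i + window]) >= threshold
    ]

sequence = "ATATATAT" + "GCGCGCGC" + "ATATATAT"  # hypothetical sequence
print("G/C-rich windows start at:", gc_rich_windows(sequence))
```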
Two Major Mechanisms of Transcription Termination: Simple and Rho Mediated
1). SIMPLE: Some termination sites share two structural features:
A). Two symmetrical G/C-rich sequences that in the transcript have the potential to form a stem-loop structure.
B). A downstream run of four to eight A residues.
see Figure 27.16 and 27.17
RNA polymerase pauses at the first G/C-rich region. This allows the
second G/C-rich region of the RNA transcript to base-pair with
the first region, forming an RNA:RNA stem-loop duplex and eliminating
some of the base-pairing between template and transcript. Further
weakening, leading to dissociation, occurs when the A-rich region
is transcribed to give a series of very weak A-U bonds.
Once the RNA has dissociated from the enzyme complex, the DNA
bubble is lost, and without the sigma factor the RNA polymerase
does not have sufficient affinity to remain bound to the DNA duplex.
It releases the DNA, reassociates with the sigma protein and
searches for a new promoter sequence.
2). Rho Mediated: Factor-dependent termination is
more rare. The Rho protein is necessary for the termination of
see Figure 27.18
1). Polymerase pauses.
2). Rho protein recognizes and binds to a specific sequence in the RNA transcript and then binds to approximately 80 additional nucleotides of RNA at the pause site.
3). In an ATP-dependent process, Rho protein migrates toward
the 3' end of the transcript, displacing the RNA polymerase and
disrupting the RNA:DNA hybrid, which terminates transcription.
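Going back to the SIMPLE (factor-independent) mechanism, its two structural features can be searched for in a transcript: a G/C-rich inverted repeat that can fold into a stem-loop, followed by a run of U residues (the transcribed run of A residues in the template). The sketch below is a naive scan with perfect stem pairing only. The stem length, loop range, G/C cutoff, and example transcript are all hypothetical choices for illustration.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA sequence (the strand that would pair with it)."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def find_simple_terminator(rna, stem=6, min_loop=3, max_loop=8, min_u=4):
    """Return (stem_start, loop_length) of the first G/C-rich hairpin that is
    immediately followed by a U-run, or None. All parameters are illustrative."""
    rna = rna.upper()
    for start in range(len(rna) - 2 * stem - min_loop - min_u + 1):
        left = rna[start:start + stem]
        if sum(base in "GC" for base in left) < stem - 1:
            continue  # stem arm is not G/C-rich enough
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + min_u]
            if len(right) == stem and right == reverse_complement(left) and tail == "U" * min_u:
                return start, loop
    return None

# Hypothetical transcript: leader, stem arm, loop, complementary arm, U-run, trailer.
transcript_rna = "AAUAC" + "GCCGCC" + "AGUA" + "GGCGGC" + "UUUUUU" + "AC"
print(find_simple_terminator(transcript_rna))
```

A real terminator search would allow mismatches and bulges in the stem and would score hairpin stability, which this sketch does not attempt.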
Differences in RNA transcription between eukaryotes and bacteria:
1). Only one RNA polymerase in E. coli. There are three RNA
polymerases in eukaryotes.
2). In eukaryotes, most promoters direct transcription of only
one gene. In bacteria, several genes are often transcribed from
a single promoter. As we discussed, this type of transcriptional
unit is called an "Operon".
(Diagram: a single promoter followed by Gene A, Gene B, and Gene C)
3). Eukaryotic Polymerases require additional protein factors
(Transcription Factors) to bind to a promoter and initiate transcription.
4). Eukaryotic RNA polymerases must pass through the nucleosomes
that are found on all chromatin.
5). Eukaryotic RNA polymerases do not have terminator signals;
rather, they proceed well past the coding region and into the 3'
noncoding region of genes. Enzymes later process
this RNA molecule extensively in a series of reactions that we
will discuss (capping, splicing, editing).