
P450 CODON USAGE RESULTS

or how the genome affects protein composition

 

Experimental Approach
A total of 110 non-allelic cytochrome P450 genes from man (n=30), rat (n=38), rabbit (n=24), and mouse (n=18), for which complete cDNA or gene sequences were available, were analyzed.  Codon usage bias was estimated by summing the occurrences of the preferred codon for each of the 18 amino acids that have synonymous codons and expressing this sum as a percentage of all the synonymous codons in that gene.  Thus, genes with a high codon usage bias tend to use a subset of all possible codons (i.e., preferred codons) rather than the full range of codons available.
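
The bias estimate described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the code used for the analysis: the function name codon_usage_bias is hypothetical, and because the text does not say how the "preferred" codon was chosen, the sketch defaults to the most frequent codon for each amino acid within the gene itself (a reference table of preferred codons can be passed in instead).

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_usage_bias(cds, preferred=None):
    """Percent of synonymous codons in `cds` that are the preferred codon.

    Only the 18 amino acids with synonymous codons are counted; Met, Trp,
    and stop codons are skipped.  ASSUMPTION: if `preferred` (a dict mapping
    amino acid letter -> codon) is not given, the preferred codon for each
    amino acid is taken to be the one used most often in this gene.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    by_aa = {}  # amino acid -> Counter of its codons in this gene
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE.get(codon)  # None for codons with ambiguous bases
        if aa is None or aa in "*MW":
            continue
        by_aa.setdefault(aa, Counter())[codon] += 1

    total = sum(sum(c.values()) for c in by_aa.values())
    if total == 0:
        raise ValueError("no synonymous codons found in sequence")

    if preferred is None:
        hits = sum(max(c.values()) for c in by_aa.values())
    else:
        hits = sum(c[preferred[aa]] for aa, c in by_aa.items() if aa in preferred)
    return 100.0 * hits / total
```

For a toy sequence such as ATG CTG CTG CTC GCC GCC TGG TAA, five codons are counted (three Leu, two Ala), four of them are the most frequent codon for their amino acid, and the function returns 80.0.
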
Codon Usage Bias Does not Correlate with Evolutionary Age

As shown in the figure to the right, codon usage bias is not influenced by evolutionary age.  Thus, genes that arose early in evolution and have been maintained in an organism do not necessarily "optimize" their codon usage pattern (e.g., P450 families 19 and 7, shown on lower right of graph).

 

In the above figure codon usage bias (the tendency to use a limited subset of codons) is plotted against the estimated evolutionary distance of 18 P450 subfamilies.  The points on each line represent one or more P450 sequences in the respective family or subfamily; evolutionary distance represents the branch point at which a given group diverges from all other P450 groups.  Thus, the most recently evolved P450s lie closest to the origin of the X axis.

Codon Usage Bias Does not Correlate with Evolutionary Conservation

It has been suggested that highly conserved proteins may exhibit greater codon usage bias than less well conserved proteins.  However, a comparison of 11 P450 orthologues between rat and man demonstrates that highly conserved orthologues exhibit no greater bias than less well conserved proteins.  This graph also demonstrates that codon usage bias is not conserved across species for orthologous P450 genes.

In the above figure codon usage bias is plotted against amino acid identity for 11 rat-human orthologues (each pair is connected by a line).  Highly conserved orthologues exhibit high amino acid identity, and are at the right of the graph, while less conserved orthologues are at the left.

Codon Usage Bias is not Tissue-Specific

Some evidence has indicated that codon usage might differ for genes expressed only in specific tissues, such as muscle or liver.  As shown to the right, an analysis of P450 genes expressed predominantly in a single tissue does not support this hypothesis.

In the above figure the average bias in P450 codon usage is shown for each tissue or organ.  Each group includes all P450s that are expressed predominantly or exclusively in that tissue or organ.  No statistically significant differences were noted.  (Steroid. = steroidogenic tissues; Other includes intestine and olfactory tissue.)

Codon Usage Bias Correlates with 3rd Position C+G Content

As shown to the right, codon usage bias increases with increasing C+G content at the codon 3rd position.  The 3rd position is the 'silent position' in many codons: a change there does not alter the amino acid specified.  This graph demonstrates that preferred P450 codons in these four mammals usually end in C or G.

In the figure above codon usage bias is plotted against codon 3rd position C+G content for each of the 110 genes analyzed in this study.

Codon Positional C+G Content Correlates with Regional Genomic C+G Content

For reasons that are not yet understood, the composition of mammalian genomes is not homogeneous; some segments (isochores) are high in C+G content, while others are A+T rich.  As shown to the right, genes located in C+G-rich segments exhibit high C+G content at the third codon position (which tracks codon usage bias; closed circles), and to a lesser extent at the first and second codon positions (open circles).

In the figure above the C+G content at the codon third position (closed circles) and the first and second codon positions (open circles) for 31 P450 genes available at the time of this analysis are plotted against the non-exonic C+G content of these genes.  Flank + intron C+G content is taken as an indicator of the C+G composition of the corresponding region (isochore) of the genome.
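
A minimal Python sketch of the quantities plotted in this figure is given here for illustration.  The function names gc_percent and positional_gc are hypothetical, and details the text does not specify (how ambiguous bases or the stop codon were handled) are assumptions: here, bases other than A, C, G, and T are ignored, and every complete codon passed in is counted.

```python
def gc_percent(seq):
    """Percent C+G among the unambiguous bases of `seq`.

    Can be applied to flank + intron sequence as an estimate of the
    C+G content of the surrounding region (isochore).
    ASSUMPTION: bases other than A, C, G, T are ignored.
    """
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(b in "CG" for b in bases) / len(bases)


def positional_gc(cds):
    """Return (C+G % at codon positions 1 and 2, C+G % at position 3).

    `cds` is a coding sequence starting in frame; a trailing partial
    codon is dropped.  ASSUMPTION: all codons supplied are counted, so
    strip the stop codon first if it should be excluded.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    cds = cds[:usable]
    first_second = "".join(cds[i:i + 2] for i in range(0, usable, 3))
    third = cds[2::3]
    return gc_percent(first_second), gc_percent(third)
```

For the toy sequence ATG GCC AAA TTC, positional_gc returns (25.0, 75.0): two of the eight first- and second-position bases are C or G, against three of the four third-position bases.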

Amino Acid Composition Correlates with Isochore Composition

The correspondence of C+G content in the first and second codon positions with isochore composition (shown above) suggests that genes located in regions of high C+G content should have a relative abundance of amino acids encoded by C/G-rich codons, and a relative deficit of amino acids encoded by C/G-poor codons.  As shown to the right, this holds true for the 31 P450 genes analyzed above.  As flank+intron C+G content increases so does the abundance of amino acids encoded by CG-rich codons (Pro, Ala, Arg, Gly); a corresponding decrease in amino acids encoded by CG-poor codons is also seen (Phe, Ile, Met, Tyr, Asn, Lys).

In the figures above amino acid composition is plotted against the isochore composition, as estimated by the flank+intron C+G content for the 31 P450 genes available at the time of analysis.  Amino acid content is presented as the sum of the indicated amino acids as a percentage of the total amino acid composition of each protein.
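
The amino acid sums described in this caption can be sketched as follows.  This is an illustrative Python example rather than the original analysis code: the function name amino_acid_class_percent is hypothetical, the two amino acid groups are those named in the text, and the handling of a trailing stop symbol is an assumption.

```python
# One-letter codes for the two groups named in the text.
GC_RICH = set("PARG")    # Pro, Ala, Arg, Gly
GC_POOR = set("FIMYNK")  # Phe, Ile, Met, Tyr, Asn, Lys


def amino_acid_class_percent(protein):
    """Return (% residues in the C+G-rich group, % in the C+G-poor group).

    Each value is the summed count of the group's amino acids expressed
    as a percentage of all residues in the protein.
    ASSUMPTION: `protein` uses one-letter codes; a trailing '*' (stop)
    and any non-letter characters are ignored.
    """
    residues = [r for r in protein.upper() if r.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    n = len(residues)
    rich = 100.0 * sum(r in GC_RICH for r in residues) / n
    poor = 100.0 * sum(r in GC_POOR for r in residues) / n
    return rich, poor
```

For the toy peptide MPAGRKFLLL the function returns (40.0, 30.0): four of the ten residues (Pro, Ala, Gly, Arg) fall in the C+G-rich group and three (Met, Lys, Phe) in the C+G-poor group.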

Amino Acid Composition Correlates with Codon Usage Bias

As noted earlier, codon 3rd position C+G content (or codon usage bias) correlates with regional genomic nucleotide composition.  Thus codon usage bias can be taken as a proxy for isochore composition.  This is illustrated by the figures to the right, where amino acid content correlates with codon 3rd position C+G content.  

Thus, the regional genomic nucleotide composition influences the composition of genes and, surprisingly, their encoded proteins.

In the figures above amino acid composition is plotted against the codon 3rd position C+G content, which serves as a proxy for isochore composition.  Here all 110 P450 cDNA sequences can be included, rather than just those for which gene sequences are available.  

Conclusions

 

  • Codon usage bias in mammals appears to reflect the composition of the genomic region in which the gene lies; genes in C+G-rich regions of the genome exhibit biased codon usage, in which a majority of the codons end in C or G.

  • This genomic influence extends to the first and second codon positions, where increased C+G content increases the abundance of amino acids encoded by C+G-rich codons (Pro, Ala, Arg, Gly) and decreases the abundance of amino acids encoded by C+G-poor codons (Phe, Ile, Met, Tyr, Asn, Lys).

  • The total variation in amino acid composition between genes with high and low codon usage bias is approximately 20%, and the content of any one amino acid changes by 2-6%.  This is sufficient to alter the characteristics of the encoded protein, and reveals an important and previously unrecognized force that affects protein evolution.

 

 


 

Comments to Todd D. Porter, Pharmaceutical Sciences, University of Kentucky College of Pharmacy, Lexington, KY 40536-0082.  Phone 859 257-1137; FAX 859 257-7564
Last Modified: September 28, 2001
Copyright © 1999, University of Kentucky Chandler Medical Center; figures on this page are Copyright of Elsevier Science, B.V., Amsterdam, The Netherlands.