Genome-wide polyadenylation in plants and other organisms

Most of the recent research in my lab has used experimental approaches for the genome-wide determination of poly(A) site choice. This research has entailed a great deal of method development (1, 2), and has combined computational and wet-bench tools into integrated studies that ask questions about poly(A) site choice.

At the center of these studies are methods by which individual poly(A) sites may be identified and assessed. For this, short cDNA tags that query the mRNA-poly(A) junction are generated and sequenced, usually on the Illumina platform. The resulting sequences are analyzed using combinations of open-source and commercial software tools; the outcomes are genome-wide coordinates of poly(A) sites, and measurements of the relative usage of individual sites and the concomitant levels of the corresponding mRNA isoforms.
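
As a rough illustration of what "relative usage" means here, the following minimal sketch computes, for each poly(A) site, the fraction of a gene's tags that map to that site. It is not the laboratory's pipeline; the function name, the gene identifiers, the coordinates, and the tag counts are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def relative_usage(tag_counts):
    """Return the fraction of each gene's tags that map to each site.

    tag_counts maps (gene_id, site_position) -> number of sequenced tags.
    All identifiers and counts used with this function are illustrative.
    """
    # Sum the tags over all sites that belong to the same gene.
    totals = defaultdict(int)
    for (gene, _position), count in tag_counts.items():
        totals[gene] += count

    # Divide each site's count by its gene total, skipping genes with no tags.
    return {
        (gene, position): count / totals[gene]
        for (gene, position), count in tag_counts.items()
        if totals[gene] > 0
    }


# Hypothetical counts: geneA has two sites, geneB has one.
counts = {
    ("geneA", 1200): 30,
    ("geneA", 1850): 90,
    ("geneB", 400): 12,
}
usage = relative_usage(counts)
```

In this toy data set, the site at position 1850 accounts for three quarters of the geneA tags. Comparing such fractions between samples is one simple way to think about a shift in poly(A) site choice.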

Genome-wide landscape of alternative polyadenylation in Arabidopsis

A genome-wide study of polyadenylation sites in Arabidopsis (3) revealed that most genes possess more than one poly(A) site; this finding brings with it a potential for widespread alternative poly(A) site choice, whereby different sites might be utilized in different tissues, at different stages of development, or in response to various environmental cues. While the interesting proximal/distal choices that regulate 3'-UTR lengths, which are often seen in mammals, were not seen in this study, another sort of proximal/distal arrangement was seen. In Arabidopsis, this arrangement involves the potential choice between sites that fall within 5'-UTRs, introns, or even protein-coding regions and sites that fall within 3'-UTRs (the usual location for a poly(A) site). Examples of tissue-specific poly(A) site choice in either leaves or seeds were noted in this study, supporting the proposal that alternative polyadenylation in fact occurs in plants.
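
The distinction between sites in 3'-UTRs and sites in 5'-UTRs, introns, or protein-coding regions can be sketched as a simple lookup against a gene model. The example below is only an illustration under stated assumptions: the function name, the feature coordinates, and the site positions are hypothetical, and a real annotation would also need to handle strand and overlapping transcripts.

```python
def classify_site(position, features):
    """Return the label of the first feature that contains the position.

    features is a list of (start, end, label) tuples with inclusive
    coordinates; 'unannotated' is returned when no feature matches.
    """
    for start, end, label in features:
        if start <= position <= end:
            return label
    return "unannotated"


# Hypothetical gene model, listed from the 5' end to the 3' end.
features = [
    (100, 180, "5'-UTR"),
    (181, 400, "CDS"),
    (401, 520, "intron"),
    (521, 800, "CDS"),
    (801, 1000, "3'-UTR"),
]

# Hypothetical poly(A) site positions within that gene.
sites = [150, 450, 650, 950]
labels = {position: classify_site(position, features) for position in sites}
```

Here only the site at position 950 falls in the usual 3'-UTR location; the other three represent the 5'-UTR, intronic, and protein-coding classes described above.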

Control of alternative polyadenylation by CPSF30

In Arabidopsis, CPSF30 is a potential regulatory hub (4, 5), as it interacts with calmodulin biochemically and physiologically, and is also associated with responses of Arabidopsis to oxidative stresses. In light of this, genome-wide studies have been (and are being) conducted to assess the role(s) that CPSF30 plays in poly(A) site choice. As discussed on another page, these studies reveal that numerous (>5000) individual poly(A) sites in Arabidopsis are affected by the presence or absence of AtCPSF30 (6). Sites seen only in the mutant lacked the so-called Near-Upstream Element, suggestive of a novel class of poly(A) signal that is responsive to CPSF30. Interestingly, both the wild-type and calmodulin-binding CPSF30 mutant proteins restore wild-type poly(A) site profiles to the AtCPSF30 mutant (7). This indicates that the smaller of the two proteins encoded by the gene that specifies CPSF30 is sufficient for normal poly(A) site choice in Arabidopsis, and that the calmodulin-binding domain itself is not involved in mRNA 3' end formation.

Genome-wide studies of poly(A) site choice in Medicago truncatula

A genome-wide study of poly(A) site choice was conducted in Medicago truncatula, a close relative of Medicago sativa (alfalfa) and a model legume. The results of this study (8) revealed that, as in Arabidopsis, the potential for alternative poly(A) site choice is considerable, as more than 70% of M. truncatula genes possess more than one poly(A) site. Interestingly, there seemed to be little conservation of poly(A) sites that lie within 5'-UTRs and introns when homologous genes in Arabidopsis and M. truncatula were compared. However, a substantial degree of conservation was seen for sites that lie within protein-coding regions. As these sites are predicted to yield so-called non-stop mRNAs that should be unstable, this finding suggests that this class of sites may serve evolutionarily conserved regulatory roles in gene expression in plants.

Global studies of polyadenylation in Chlamydomonas reinhardtii

Chlamydomonas reinhardtii is a unicellular alga that is widely used for basic studies as well as for numerous applications that revolve around biofuels and photosynthesis-driven biofactories. To add to the growing genome annotation for this organism, a global poly(A) site profile was assembled using the methods (1, 2) developed in my laboratory. The results of this study (9) suggest that, in contrast to what is seen in higher plants, alternative polyadenylation probably plays a limited role in gene expression in this organism. They confirm the novel and exacting poly(A) signal (UGUAA) noted by others (cited in reference 9). They also document chloroplast genome-wide poly(A) sites, and reveal a sizeable shift in the abundances of mRNA isoforms defined by these sites in different growing conditions. These latter results may be the first global assessment of chloroplast polyadenylation, and they raise the possibility that chloroplast RNA turnover (the process associated with polyadenylation in this organelle) may be a dynamic process that is responsive to environmental cues. (As an aside, it should be pointed out that this report includes a detailed description of the computational pipeline used to analyze sequencing data. This pipeline was developed in my laboratory, in collaboration with colleagues at Kentucky State University and Xiamen University.)
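
To make the idea of a poly(A) signal concrete, the following sketch checks whether the UGUAA motif (written as TGTAA in the DNA sense strand) occurs in a window upstream of a poly(A) site. It is not the pipeline described in reference 9; the function name, the example sequence, and in particular the default window size of 30 nucleotides are assumptions made only for illustration.

```python
def has_upstream_motif(sequence, site_index, motif="TGTAA", window=30):
    """Report whether the motif occurs just upstream of a poly(A) site.

    sequence is the sense-strand DNA sequence, site_index is the 0-based
    index of the poly(A) site, and window is the number of nucleotides
    upstream of the site to search (an illustrative default, not a value
    taken from the study).
    """
    start = max(0, site_index - window)
    upstream = sequence[start:site_index].upper()
    return motif in upstream


# Hypothetical sequence: the motif occupies indices 20-24 and the
# poly(A) site is placed at index 40.
sequence = "ACGT" * 5 + "TGTAA" + "GCGCGCGCGCGCGCG" + "CA"
site_index = 40

found_wide = has_upstream_motif(sequence, site_index)               # 30-nt window
found_narrow = has_upstream_motif(sequence, site_index, window=10)  # 10-nt window
```

With the 30-nucleotide window the motif is found, while the 10-nucleotide window misses it, which shows how the choice of window affects the fraction of sites scored as carrying the signal.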

Genome-wide studies of poly(A) site choice in red clover (Trifolium pratense L.)

A genome-wide study of poly(A) site choice was conducted in red clover (13), an important forage legume. Alternative polyadenylation varied substantially among tissues, with numerous poly(A) sites and underlying genes displaying APA in different tissues. There was an interesting differential expression of genes encoding orthologs of FIP1(V) and PCFS4, suggesting that these two factors may play a role in regulating APA in red clover. As has been seen in other higher plants, APA affects the expression of different isoforms of two key polyadenylation factors, CPSF30 and FIP1(V). Specifically, in red clover, both genes encode mRNAs that may yield small and large protein isoforms. In the case of CPSF30, the larger isoform consists of two domains, a CPSF30 domain and a domain that includes a putative YTH structural module. As discussed on the page dealing with CPSF30, this arrangement provides a conceptual link between polyadenylation and a reader of m6A modifications of RNAs.

Transcriptomic studies

An offshoot of the methods we are using for studying poly(A) site choice is the development of very low-cost methods for preparing libraries for the more usual transcriptomics work (RNA-Seq). By modifying the poly(A) tag protocol (PAT-Seq), we can generate RNA-Seq libraries at very low cost, and with minimal bench time (typically, 8-16 hours from RNA to sequencer). This protocol is described in a book chapter (10) and has been shared with numerous laboratories. Importantly, it has been incorporated into workshops for faculty at undergraduate-focused institutions (11) and into an upper-division lab course (ABT495) at the University of Kentucky. It has also been used to characterize the transcriptome of red clover (Trifolium pratense; 12).

 

References

1. Ma, L., Pati, P. K., Liu, M., Li, Q. Q., and Hunt, A. G. (2014) High throughput determination of polyadenylation sites in plants. Methods 67, 74-83.

2. Pati, P. K., Ma, L., and Hunt, A. G. (2015) Genome-wide determination of poly(A) site choice in plants. in Polyadenylation in Plants – Methods and Protocols, Methods in Molecular Biology, vol. 1255. A. G. Hunt and Q. Q. Li, eds. Springer. ISBN 978-1-4939-2174-4. pp. 159-174.

3. Wu, X., Liu, M., Downie, B., Liang, C., Ji, G., Li, Q. Q., and Hunt, A. G. (2011) Genome-wide landscape of polyadenylation in Arabidopsis provides evidence for extensive alternative polyadenylation. Proc Natl Acad Sci U S A 108, 12533-12538.

4. Hunt, A. G. (2014) The Arabidopsis polyadenylation factor subunit CPSF30 as conceptual link between mRNA polyadenylation and cellular signaling. Current Opinion in Plant Biology (Cell Signaling and Gene Regulation) 21C, 128-132.

5. Chakrabarti, M., and Hunt, A. G. (2015) CPSF30 at the Interface of Alternative Polyadenylation and Cellular Signaling in Plants. Biomolecules 5(2), 1151-1168. (Open Access Article)

6. Thomas, P. E., Wu, X., Liu, M., Gaffney, B., Ji, G., Li, Q. Q., and Hunt, A. G. (2012) Genome-wide control of polyadenylation site choice by CPSF30 in Arabidopsis. Plant Cell 24, 4376-4388.

7. Liu, M., Xu, R., Merrill, C., Von Lanken, C., Hunt, A. G., and Li, Q. Q. (2014) Integration of developmental and hormonal signals via a polyadenylation factor in Arabidopsis. PLoS ONE 9(12): e115779.

8. Wu, X., Gaffney, B., Li, Q. Q., and Hunt, A. G. (2014) Genome-wide determination of poly(A) sites in Medicago truncatula: evolutionary conservation of alternative poly(A) site choice. BMC Genomics 15, 615.

9. Bell, S. A., Brown, A., Chen, S., and Hunt, A. G. (2016) Experimental genome-wide determination of RNA polyadenylation in Chlamydomonas reinhardtii.  PLoS ONE 11(1): e0146107.

10. Hunt, A. G. (2015) A rapid, simple, and inexpensive method for the preparation of strand-specific RNA-Seq libraries.  in Polyadenylation in Plants – Methods and Protocols, Methods in Molecular Biology, vol. 1255. A. G. Hunt and Q. Q. Li, eds. Springer. ISBN 978-1-4939-2174-4.  pp. 195-208.

11. Buonaccorsi, V., Peterson, M., Lamendella, G., Newman, J., Trun, N., Tobin, T., Aguilar, A., Hunt, A. G., Praul, C., Grove, D., Roney, J., and Roberts, W. (2013).  Vision and change through the Genome Consortium on Active Teaching using Next-Generation Sequencing (GCAT-SEEK).  (Letter to the editor).  CBE-Life Sciences Education 13, 1-2.

12. Chakrabarti, M., Dinkins, R. D., and Hunt, A. G. (2016) De novo transcriptome assembly and dynamic spatial gene expression analysis in red clover (Trifolium pratense). Plant Genome 9(2). doi: 10.3835/plantgenome2015.06.0048.

13. Chakrabarti, M., Dinkins, R. D., and Hunt, A. G. (2018) Genome-wide atlas of alternative polyadenylation in the forage legume red clover. Sci Rep 8(1), 11379.
