August 12, 2014

Review: Paired-End Analysis of Transcription Start Sites in Arabidopsis Reveals Plant-Specific Promoter Signatures

The Plant Cell tpc.114.125617  

Taj Morton (a), Jalean Petricka (b,c,d), David L. Corcoran (b), Song Li (b), Cara M. Winter (b,c), Alexa Carda (b), Philip N. Benfey (b,c), Uwe Ohler (b,e,f,g) and Molly Megraw (a,b,h,i,*)

(*) Corresponding author: megrawm@science.oregonstate.edu
(a) Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331
(b) Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina 27708
(c) Department of Biology, HHMI and Center for Systems Biology, Duke University, Durham, North Carolina 27708
(d) Department of Biology, Carleton College, Northfield, Minnesota 55057
(e) Department of Computer Science, Duke University, 308 Research Drive, Durham, North Carolina 27708
(f) Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina 27710
(g) Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany
(h) Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331
(i) Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon 97331

Identifying the transcriptional start site (TSS) of a gene is a notoriously difficult problem in promoter analysis. In plants, the genetic features that govern where transcriptional initiation takes place have largely been assumed to resemble those identified in more thoroughly studied animal systems. However, a report this week from Molly Megraw's group at Oregon State University (in collaboration with Duke University and Carleton College) describes a large-scale analysis of TSSs using the recently developed technique of paired-end analysis of transcriptional start sites (PEAT). This work alters our view of transcriptional initiation in plants by demonstrating that, in contrast to animal models, most plant promoters are devoid of the TATA box that is considered typical of TSSs. Instead, transcriptional initiation in plants depends on a large collection of known transcription factor binding elements.

Previous methods for determining TSSs included straightforward comparison of ESTs compiled from massive sequence collections. 5'RACE has also typically been employed to determine start sites. However, these techniques are limited by their low throughput and their reliance on producing data manually, gene by gene. 5'RACE is also a finicky technique with poor reproducibility, and so is prone to artefacts. A more reliable way of estimating the true 5' end of a transcript is primer extension, one of the more difficult techniques to master in the "old school" molecular biology skill set.

Morton et al. used paired-end analysis to generate millions of TSS tags from Arabidopsis thaliana root samples. They then analyzed these data with a machine learning model that identified TSS tag clusters with great sensitivity and accuracy. This in turn led to an analysis of the transcription factor binding sites found in promoters showing particular initiation patterns. Based on these analyses, the authors reached the rather surprising conclusion that the TSSs of plants are largely devoid of the canonical TATA box. This work extends our knowledge of transcriptional initiation in plants and provides a tool set for identifying and predicting TSSs directly from sequence. Having relied personally on 5'RACE and primer extension for years, I extend my gratitude to the authors for making these frustratingly tedious techniques no longer necessary, or at least for providing a reliable alternative.
