These are exciting times to be an RNA biologist. Next-generation sequencing revolutionized genetics, and now the RNA methodologies have caught up. For every DNA technique, we have developed an equivalent RNA method, and then some. For example, CLIP-seq and PAR-CLIP take the place of ChIP-seq in RNA studies, and there are also recently developed high-throughput methods for probing the secondary structure of RNA in vivo (Rouskin et al, 2013, Nature). Last year, the first large-scale binding information for a compendium of RNA-binding proteins (RBPs) was published (Ray et al, 2013, Nature). The computational methods are also gaining ground, from SeqFold (Ouyang et al, 2013, Genome Res) to our TEISER (Goodarzi et al, 2012, Nature). Did I mention these are exciting times?!
It is in light of these advances that making sense of the underlying post-transcriptional regulatory networks, which control different aspects of the RNA life cycle and behavior, has become ever more important. Five years ago, we embarked on a path to catalog the sequences in RNA that play substantial regulatory roles by providing linear or structural information for trans factors to recognize and act on. Given the state of technology at the time, we were limited by the diversity of the library we could generate, so we decided to focus on 3′ UTR sequences that are conserved across vertebrates. We synthesized these sequences in short spans on a custom-designed Agilent array and cloned them downstream of mCherry under a bidirectional promoter that also drives the expression of GFP as an internal control. Our goal was then to use FACS to select the sub-populations that show higher or lower relative expression of mCherry. We could then amplify the cloning site in the selected populations and hybridize the products back to our Agilent array for quantification (Figure below). It was all good on paper, but as is always the case, we ran into myriad technical problems, ranging from generating a library with enough independent cells (high coverage) to obtaining reproducible FACS measurements. By the time we were done troubleshooting these problems, a lot had changed in the field. Sequencing had become the staple of RNA biology (so we decided to use it instead of array hybridization for quantification), Agilent had started to provide custom oligo libraries directly to consumers (which means that this approach can easily be implemented in any lab) and, more importantly, the Flp-In system (Invitrogen) had appeared, which significantly improved the reproducibility of our measurements (since all clones in the library are inserted at a single unique site in the genome).
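To make the sort-and-sequence idea concrete, here is a minimal sketch of how one might score each library element once the high-mCherry and low-mCherry populations have been sequenced. It is only an illustration, not the analysis used in the paper: the function name `log2_enrichment`, the pseudocount, and the toy read counts are all assumptions made up for this example.

```python
import math

def log2_enrichment(high_counts, low_counts, pseudocount=1.0):
    """Hypothetical scoring: log2 ratio of each element's read frequency
    in the high-mCherry bin versus the low-mCherry bin.

    Counts are normalized to library size so that sequencing depth does
    not bias the ratio; the pseudocount avoids division by zero.
    """
    elements = sorted(set(high_counts) | set(low_counts))
    n = len(elements)
    high_total = sum(high_counts.values()) + pseudocount * n
    low_total = sum(low_counts.values()) + pseudocount * n
    scores = {}
    for element in elements:
        high_freq = (high_counts.get(element, 0) + pseudocount) / high_total
        low_freq = (low_counts.get(element, 0) + pseudocount) / low_total
        scores[element] = math.log2(high_freq / low_freq)
    return scores

# Toy read counts (invented for illustration only).
high = {"elem_A": 90, "elem_B": 10, "elem_C": 50}
low = {"elem_A": 10, "elem_B": 90, "elem_C": 50}
scores = log2_enrichment(high, low)
for element, score in scores.items():
    print(element, round(score, 2))
```

In this toy example, a positive score would suggest an element that raises relative mCherry expression, a negative score one that lowers it, and a score near zero an element with no detectable effect. A real analysis would also need replicates and a proper statistical test.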
As is always the case with method development, we needed to perform innumerable validation assays to evaluate the efficacy of our approach in finding known and novel regulatory elements. Our findings were published last week in Cell Reports (Oikonomou et al, 2014), which I encourage you to read. Interestingly, David Erle’s group also published a similar approach, which beat our paper by a few days (Zhao et al, 2014, Nature Biotech).
These reporter-based approaches insulate each element and study its effect in isolation; however, real transcripts carry many elements, and the fate of the RNA is decided as a cumulative consequence of all the interacting factors. Knowing the initial building blocks, however, enables us to then construct networks and modules of regulatory elements that likely interact and function in an overlapping space (which we tried to infer in our paper using our information-theoretic tools).
Systematic dissection of conserved 3′ UTR sequences in endogenous transcripts
Finally, I want to mention that the downside to all the current attention in the RNA field seems to be a fast-paced publication cycle, which results in mostly descriptive papers. There is nothing wrong with descriptive studies per se, but sometimes the downstream or underlying mechanisms are sorely missing. I think we are also guilty of this to some extent. Our goal was really to identify novel trans factors that interact with the elements we identified using our approach. This is something we are still trying to do, and hopefully we will manage to better functionally annotate the cis elements and the molecular mechanisms through which they exert their regulatory roles.