When I started my postdoc position at the Fred Hutchinson Cancer Research Center in 2004, I had the great opportunity to identify an antisense transcript at the FMR1 CGG repeat region (HMG, 2007). Better yet, I was able to interact with Diane Cho as she finished her story on antisense transcripts at the DMPK locus (Mol Cell, 2005). The only other group actively looking at antisense transcripts was the Ranum lab, which was in Minnesota at the time (Nat Gen, 2006). We had so much to do as we navigated the skeptical mRNA-centric world. As a consequence, we had to characterize the antisense transcripts using the tools and questions that we would use to analyze coding transcripts. Were the antisense transcripts polyadenylated? Capped? In the nucleus or cytoplasm? Spliced or unspliced? RNA pol II or III? There were so many preconceived notions about the extra transcription exhibited by a gene locus that we had to overcome with careful characterization. Ultimately, we defined the appropriate way to characterize antisense transcripts at repeat regions.
But publications and grants were not easy to obtain. Most reviewers hated the notion of antisense transcripts. A major assumption was that the transcripts we identified were "spurious", a term that I have come to realize is tossed around to mean: NOT REAL. My other favorite critique was the statement "you do not know if the antisense transcript causes disease". All of the genes we characterized had a repeat expansion associated with disease. I find it incredibly shortsighted to think that the antisense transcript was somehow independent of the sense "disease-causing" transcript when both harbored an expansion from the same region.
A representation of strand-specific interrogation of transcripts at a CAG repeat region
As a consequence, it is difficult to publish noncoding transcriptional activity at a locus. Basically, the transcriptome is an uncharacterized frontier, largely due to a lack of interest from the genomics field and a lack of research funding for the transcriptional regulation field. The tenuous relationship between the transcriptomics field and what seems like the rest of science is best depicted by the public disdain for the ENCODE Project. I would rather not spend time discussing the merits of either side of the argument; rather, I want to focus on what I have seen, touched, sequenced, characterized, and understood about bidirectional transcription at unstable tandem repeat loci.
To date, only a few gene regions have been characterized well enough to understand the consequence of a repeat expansion. These include the FMR1, SCA8, DMPK, FRDA, SCA7, and HTT loci, plus a few more. However, there are ~30 known unstable tandem repeat regions associated with neurodegenerative disorders, developmental disorders, and muscular dystrophies. The repeats are tri-, tetra-, penta-, hexa-, and dodecanucleotide repeats found near promoters and within introns and exons of gene loci. A quick glance at any human gene of interest on the UCSC browser reveals higher transcriptional activity near repeat regions, as demonstrated by deposited ESTs. Note that this activity is in both directions. Comparison of the human genome with those of other organisms, such as mouse, reveals that in many cases there are overlapping genes at the repeat regions. A huge question is whether the antisense transcripts we have identified are actually remnants of genes no longer in play due to the formation of a true repeat region, or whether the antisense transcripts represent regulatory elements not yet fully appreciated.
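For readers who think in code, the idea that one repeat tract reads differently on each strand can be sketched in a few lines of Python. This is only an illustration, not a method from any of the papers mentioned here: the function names, the toy sequence, and the `min_copies` threshold are all assumptions made up for the example. It scans a DNA string for tandem repeats with the unit sizes listed above and reports the motif as it would appear in a sense transcript and in an antisense transcript.

```python
import re

# Complement table for unambiguous DNA bases (assumes uppercase A/C/G/T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence as read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def is_primitive(unit):
    """True if the unit is not itself a shorter motif repeated (e.g. CAGCAG)."""
    return not any(
        len(unit) % k == 0 and unit[:k] * (len(unit) // k) == unit
        for k in range(1, len(unit))
    )


def find_tandem_repeats(seq, unit_sizes=(3, 4, 5, 6, 12), min_copies=5):
    """Hypothetical helper: list perfect tandem repeats of the given unit sizes.

    min_copies is an arbitrary illustrative threshold, not a clinical cutoff.
    """
    hits = []
    for k in unit_sizes:
        # One unit of length k, followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # skip e.g. CAGCAG, which is really a CAG repeat
            hits.append({
                "start": match.start(),
                "sense_unit": unit,
                "antisense_unit": reverse_complement(unit),
                "copies": len(match.group(0)) // k,
            })
    return hits


if __name__ == "__main__":
    # Toy sequence: a 10-copy CAG tract with short made-up flanks.
    toy = "GATTACA" + "CAG" * 10 + "TTGACC"
    for hit in find_tandem_repeats(toy):
        print(hit)
```

On the toy sequence, the CAG tract on the sense strand is reported with CTG as its antisense motif, which is the reason a strand-specific assay is needed to tell the two transcripts apart. The motif phase that gets reported (CAG rather than AGC or GCA) depends only on where the match happens to start, and real repeat tracts often contain interruptions that this perfect-match sketch would split or miss.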
Ideally, I would like to continue to explore bidirectional transcription with the goal of providing a clearer picture of the transcriptional activity at tandem repeat gene loci. This is primarily due to the number of diseases associated with repeat expansions. When a repeat region expands, additional and potentially toxic RNAs are generated. These transcripts could also be utilized as locus-specific biomarkers for the associated disease pathogenesis. With an increased focus on therapeutic agents, such as ASOs that target expanded sense transcripts for degradation, it is imperative that we understand what transcripts are generated from expanded disease-associated alleles.
Unfortunately, as a senior scientist in someone else's lab, I did not have my own funding. The years of work to break down the barriers to even recognize antisense transcripts are perhaps wasted for my career, as I could not hang on long enough to acquire funding. However, there is some light at the end of the tunnel. The Amyotrophic Lateral Sclerosis (ALS) community seems to have embraced bidirectional transcription, as well as another controversial topic, Repeat-associated Non-ATG (RAN) translation. RAN translation was first described by Laura Ranum, another pioneer in bidirectional transcription at repeat regions. In fact, the ALS community is incredibly competitive about the sense, antisense, coding, non-coding, toxic, and non-toxic RNAs and transcripts of the C9ORF72 gene. It is a thrill to read each paper, each perspective, and each interpretation. There seems to be little room for critique of antisense transcripts in their world; instead, they seem to be focused on using every bit of evidence they can find to characterize the molecular mechanisms underlying this disease.
Make no mistake, I do not support some of the underhanded competition that has occurred in this field. Even I have been taken advantage of by a few members of the C9ORF72 community, but I stand firm that the outcome will benefit the patient community, so it is worth it. I have faith that the researchers are close to having viable and realistic therapeutics for ALS, something that is truly needed. But I chuckle at how easily authors have claimed to "demonstrate" bidirectional transcription at C9ORF72, such as the Petrucelli Lab, where the evidence is basically in vitro analysis of transfected constructs, or the Edbauer Lab, where evidence of antisense translation is the proof. This is liberating for the field, where scientists with expertise in disease characterization do not have to spend 3-5 years characterizing the transcript boundaries, localization, and processing prior to addressing impact on disease.
There is time and space for researchers such as myself, who can more quickly address the RNAs generated at the locus. I hope the recent body of work at the C9ORF72 locus will make it easier to address bidirectional transcription at the remaining 20+ unstable tandem repeat genes, all associated with equally devastating disease. Nonetheless, I wish that I could have a conversation with the reviewers of my grants, papers, and research proposals and ask them why they rejected the validity of bidirectional transcription. What did they gain by blocking progress? There was never a consequence for them; instead, the ultimate burden is on the individuals who harbor repeat expansions and suffer from disease. These individuals continue to wait for someone to be interested in their gene mutation. It pains me that I came so close to being able to advance understanding of the gene loci associated with 20+ unstable repeat regions, only to hit the roadblock of grant review, which did not see these diseases as important.