Long non-coding RNAmazing: X marks the spot

April 15, 2016

We used to think that only 1% of the human genome had any use. Oh, how times change.

The more we delve into the treasure trove of genetic information present in diverse genomes, from humans and worms to Arabidopsis and wheat, the more we have come to appreciate the bounty of non-coding elements that lie within.

Among this buried treasure – which makes up just a small percentage of a genome – lie long non-coding RNAs (lncRNAs), some of which, far from being junk, can provide insights into important regulatory mechanisms across all domains of life.

Treasure trove

It’s not all about the genes. Granted, they’re important – and they provide the basic blueprint by which our bodies are assembled and maintained. However, it has become increasingly apparent that there is more than meets the eye when it comes to development and regulation.

With the dawn of faster and more accurate genome sequencing techniques has come a wealth of extra information. One surprise was that the transcriptome – the expressed genetic material within DNA – contains plenty of extras alongside the messenger RNA.

We now know that transcription does not stop with simply the genetic elements that code for a protein, but carries on with mounds of extra material which can’t so easily be ascribed to a certain trait. We might match a gene to a phenotype, such as blue eyes in humans, or bigger grain in wheat, but this isn’t so easy with material which, on the face of it, means absolutely nothing.

There is so much more to the story of DNA > RNA > Protein – the favourite of high school biology lessons – and a whole treasure trove to reveal.

RNA – shooting the messenger

RNA is essentially a single-stranded version of DNA whose sugar backbone carries an extra OH (hydroxyl) group where DNA has just an H. This is an important difference. It means that, inside a living organism, DNA and RNA can be distinguished.

RNA has a few major functions which many people already know about. For one, it is used as “messenger RNA” (mRNA), which is essentially a copy of a gene to be turned into a protein with the help of transfer RNA (tRNA) at a ribosome (which is itself made partly of RNA). At the ribosome, the mRNA is read three bases at a time, and each triplet is matched by a tRNA carrying the correct amino acid to make the proper protein – from keratin in our nails to pepsin in our digestive system.
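To make the "three bases at a time" idea concrete, here is a minimal sketch in Python. It builds the standard genetic code as a lookup table and reads a short message triplet by triplet, stopping at the first stop signal. The function name `translate` and the nine-base example message are invented for illustration. The sketch also assumes the message starts exactly on the first triplet, which a real ribosome has to work out for itself.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# Each letter is the one-letter amino acid code; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an mRNA three bases at a time, stopping at the first stop codon."""
    protein = []
    # Step through complete triplets only; trailing bases are ignored.
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# An invented message: AUG (methionine), GCC (alanine), UAA (stop).
print(translate("AUGGCCUAA"))
```

In the cell, of course, the "lookup" is physical: each tRNA recognises one triplet on the mRNA and carries the matching amino acid.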

However, there’s more. We know, for example, about small RNA (sRNA) – such as small interfering RNA (siRNA) and microRNA (miRNA) – which, instead of aiding protein manufacture in the cell, are mainly responsible for hunting down mRNA and destroying it.

These RNAs are incredibly important in regulating many processes and are vital in determining cell fate – so that, from the same set of instructions, we get eyes and teeth, livers and spleens. In plants, sRNAs have been shown to control a suite of processes and are linked with defense against viral and bacterial pathogens.

RNAs could come in handy protecting against viral and bacterial pathogens. Photo: Shutterstock / ralwel

Noise or function?

But what about longer bits of non-coding RNA, the lncRNAs? Well, in the 1990s, they began to stumble slowly into scientific awareness. One such lncRNA – HOTAIR – has been shown to act in the regulation of human development, through repressing Hox genes – the instructions determining the patterning of cells and tissues. This same lncRNA has been implicated in cancer – with an overproduction of HOTAIR perhaps causing increased tumor invasiveness.

The experimental characterisation of about a hundred lncRNAs has revealed their role in regulating gene expression through a wide range of mechanisms.

Perhaps the best-defined function of any lncRNA is that of the X-inactive specific transcript (Xist), which drives X chromosome inactivation in female mammals.

In humans, the sex chromosomes XX and XY present an imbalance of genes, with males carrying one fewer copy of the genes on the X chromosome. To address this, mammalian cells pull off a trick known as “dosage compensation,” which uses Xist to inactivate one of the two X chromosomes in each female cell, either paternal or maternal, at random. Thus, males and females have equal expression of X-linked genes.

LncRNAs are not limited to mammalian cells: they have been identified in diverse organisms, from various vertebrates to nematode worms and Arabidopsis thaliana – the classic model organism for research into plant genetics. In fact, in plants, more is being revealed about the role of lncRNAs in relation to various stresses, such as pathogen infection and drought.

A particularly interesting role for lncRNAs in plants is in the process of vernalisation – the control of flowering time in plants that undergo cold winters. A protein called Flowering Locus C (FLC) is responsible for preventing flowering in such plants by blocking genes that promote flower growth.

Winter-flowering heather. Photo: Shutterstock / Roxana Bashyrova

For flowering to occur, FLC must be repressed after a spell of cold weather. Interestingly, it appears that two lncRNAs, called COOLAIR and COLDAIR, both of which increase in abundance during cold treatment, are partly responsible for this activity. These lncRNAs are themselves derived from non-protein-coding regions of the FLC gene.

Clearly, these genetic elements initially conceived of as being mere noise are, rather, incredibly handy – and vital in development throughout eukaryotes.

A bounty of lncRNAs

It’s all well and good knowing about certain lncRNAs and their function – but how do we find the useful ones amongst tens of thousands of redundant or functionally insignificant pieces of nucleic acid?

Using the human genome as an example, we know that around 2% of all of the genetic information within us codes for proteins. We also now know that another 8% of the genome is under “selection pressure” – that is, it is important for survival and propagation.

These conserved non-coding sequences scattered across the genome are of extreme importance to an organism’s biology, as they harbour the regulatory elements for protein coding genes – dictating when and where a gene should be expressed.

In addition, recent studies found that mutations within these conserved non-coding sequences are associated with diseases and genetic disorders, which further underlines the importance of studying them.

However, searching for the coding parts, the genes, is like looking for a metal needle in a haystack: we know enough that we can find it relatively easily with a well-guided magnet. Searching the extra 8% is like looking for a wooden toothpick.

Looking at the expression of lncRNAs in fruit fly populations may give us further insight into their function. Photo: Shutterstock / Thithawat-S-compressor

A good indicator that a sequence might be functional is its preservation across species. This has been successfully applied to protein-coding sequences, for instance. However, the vast majority of lncRNAs tend to show little conservation across species – making them even more difficult to study.

One approach is to interrogate their expression across a population, prioritising those that are reliably found in multiple individuals.

This is being successfully applied to many organisms such as humans, fruit flies and worms. As expected, lncRNAs present in many individuals differ strongly in their conservation, nucleotide composition and expression from those that are seldom found and that most likely reflect transcriptional noise.
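The population approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline used in any of these studies. The transcript names, expression values and both cut-offs (`MIN_EXPRESSION`, `MIN_FRACTION`) are invented assumptions.

```python
# Invented expression values (arbitrary units) for three candidate lncRNAs,
# each measured in the same five individuals.
expression = {
    "lnc_A": [5.2, 4.8, 6.1, 5.5, 4.9],
    "lnc_B": [0.0, 3.9, 0.0, 0.1, 0.0],
    "lnc_C": [2.4, 0.0, 3.1, 2.8, 2.2],
}

MIN_EXPRESSION = 1.0  # assumed cut-off for calling a transcript "expressed"
MIN_FRACTION = 0.8    # assumed share of individuals that must express it


def fraction_expressed(values, min_expression=MIN_EXPRESSION):
    """Share of individuals in which the transcript reaches the cut-off."""
    return sum(value >= min_expression for value in values) / len(values)


def robustly_expressed(table, min_fraction=MIN_FRACTION):
    """Names of transcripts expressed in at least min_fraction of individuals."""
    return sorted(
        name
        for name, values in table.items()
        if fraction_expressed(values) >= min_fraction
    )


print(robustly_expressed(expression))
```

Here `lnc_B`, which turns up in only one individual out of five, is set aside as likely noise, while the other two would be prioritised for follow-up.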

We are now in the process of experimentally seeking the biological function of those robustly expressed lncRNAs through genetic engineering in model organisms.

Plant pirates

The recent release of the bread wheat genome assembly, in conjunction with the wealth of data generated for the completion of this flagship project at EI, now allows us to begin the search for functional non-coding RNAs in such an important crop plant.

The challenge is much greater than the one encountered in other species such as humans. While the human genome is composed of about three billion nucleotides, the wheat genome includes about 17 billion bases, of which some 80% to 90% consists of repetitive elements.
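A quick back-of-the-envelope calculation, using only the rounded figures quoted above, shows the scale of the problem. The variable names are ours, and the numbers are the approximations from the text rather than precise assembly statistics.

```python
# Rounded genome sizes quoted in the text (bases).
HUMAN_GENOME_BASES = 3_000_000_000
WHEAT_GENOME_BASES = 17_000_000_000

# How many times larger the wheat genome is than the human genome.
size_ratio = WHEAT_GENOME_BASES / HUMAN_GENOME_BASES

# Bases left over if 90% or 80% of the wheat genome is repetitive.
# Integer arithmetic keeps the results exact.
non_repetitive_low = WHEAT_GENOME_BASES * (100 - 90) // 100
non_repetitive_high = WHEAT_GENOME_BASES * (100 - 80) // 100

print(f"Wheat is about {size_ratio:.1f} times the size of the human genome")
print(f"Non-repetitive wheat sequence: {non_repetitive_low:,} to {non_repetitive_high:,} bases")
```

In other words, the non-repetitive part of the wheat genome is roughly comparable in size to an entire human genome, and it sits scattered among billions of bases of repeats.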

Wheat has a highly complex polyploid genome. Photo: Shutterstock/VAV

The identification and characterisation of lncRNAs in bread wheat and their comparison to similar genes in other related crop species (already or currently being sequenced) will allow us to answer fundamental questions with respect to lncRNA sequence evolution, their regulatory network and the impact of polyploidization during wheat domestication. Most importantly, identifying the mechanisms of action of lncRNAs, their regulatory targets, and the impact of mutations within these loci on wheat’s biology will allow us to better understand the regulatory networks underlying traits and environmental adaptation in wheat.

It is a daunting but incredibly exciting task to undertake, as finding and characterising elements that affect traits of interest such as growth or resistance to pathogens could have major impacts due to the importance of crops in our everyday life.

Identifying the mechanism of action of lncRNAs and their regulatory targets will allow us to better understand how plants respond to their environment, which is especially important when faced with a rapidly changing climate and more severe environmental pressures.

By Peter Bickerton & Wilfried Haerty

Article author

Peter Bickerton

Scientific Communications & Outreach Manager