Sat nav for bread wheat uncovers hidden genes

18 April 2017

Scientists have created the most accurate navigation system for the bread wheat genome to date – allowing academics and breeders to analyse its genes more easily than ever before.

Wheat is one of the world’s most important staple cereals but is also the most complex. Three sub-genomes together contain around five times more DNA than the human genome. Nearly 80% of this genetic material is repetitive, making it even harder to sequence and analyse. 

Now, harnessing advanced sequencing technology and computational approaches, scientists from the Earlham Institute, with colleagues at the John Innes Centre, have published the world’s most complete picture of the wheat genome. It includes the location and detailed annotation of over 100,000 wheat genes. More than a fifth (22%) of these were completely absent from earlier assemblies, or found only as fragments.

“We applied the latest sequencing and bioinformatic techniques we have developed at our institute to the huge and complex wheat genome. We were able to achieve the best results anyone had seen, including uncovering previously hidden genes,” says senior author Matthew Clark, Head of Technology Development at the Earlham Institute (EI).

“Moreover, all our methods are open, and available for anyone to use. This is critical as wheat DNA varies across the world, which is key to its success in different environments. We have already started to sequence many varieties of UK wheat using these methods, and we hope others will sequence the genomes of wheat important in their country,” he says.   

The results, published in Genome Research, focus on the variety called Chinese Spring - the standard cultivar for genomic research. The genome and annotation have been accessed more than any other resource on the genomic portal Plant Ensembl, where they have been available for over a year for thousands of researchers and breeders to use. The project was funded by Biotechnology and Biological Sciences Research Council grants to EI, John Innes Centre (JIC), European Bioinformatics Institute and Rothamsted Research, with contributions from international partners at the PSGB (Munich, Germany) and University of Western Australia.

The improved genome assembly, combined with high quality sequencing data and novel methods, allowed EI scientists to more accurately identify genes and areas of the genome with interesting functions. In previous assemblies, many genes were missing or found only as fragments. By identifying the entire DNA sequences of genes, EI scientists have made it possible to identify more complete sets of similar genes - called gene families - that are important for yield, disease resistance or other qualities important for agriculture.

EI scientists have already used the advances to explore UK varieties and they have released six wheat genomes on the EI’s open data website Grassroots Genomics. They and scientists from the John Innes Centre and The Sainsbury Laboratory have also started to use the results to provide a more accurate picture of where to find disease resistance genes and genes important for the visco-elastic properties of bread - which make it soft and spongy.

More than two billion people worldwide rely on wheat as a staple food, making it a vital crop for global food security. However, yield increases have stagnated since the mid-1990s. A better map of the wheat genome is essential for breaking the deadlock. It will help reveal the location of important traits that can be bred into elite varieties. 

Lead author Bernardo Clavijo from the Earlham Institute says: “Scientists all over the world are already using these new results. But even more importantly, our open methods allow a new level of accuracy for any wheat line, and many other complex genomes. Assembly for this complexity of genome has always been a bit of a one-off work of art. Now we have a way to do it reliably and to a standard that enables thorough analysis.”

“We are moving towards a scenario where more and more wheat lines will be sequenced and compared using these and similar techniques. This kind of detail on every wheat line will enable new discoveries and accelerate breeding. We are already working with the breeding industry as well as other researchers to enable more detailed analysis of elite varieties, which will impact the wheat breeding programs directly.”

Ksenia Krasileva, a co-author on the new study, likens the creation of an assembly to navigating using GPS: “Breeders might know there is something really useful in wheat, for example for protecting crops against disease or for improving gluten for bread-making, but without a good quality genome assembly it’s like driving through thick fog. Full genome assembly and annotating genes provides a sat nav view of wheat genes to signpost the way to useful genes in all varieties of the species.”

EI group leader David Swarbreck says: “This is the most comprehensive wheat gene annotation to date, it represents a significant advance that will assist wheat breeders and researchers in accelerating further improvements, particularly as the results are freely available for anyone to use.”

Co-author Michael Bevan from the John Innes Centre says: “The new resources we have helped develop have already broken down barriers and are providing new ways of studying wheat. They will allow breeders to more accurately predict which lines to breed from, and to directly identify the most promising progeny. This could save years when making new varieties.”

Notes to editors.

Reference to paper:

Clavijo, B.J. et al. (2017). An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations. Genome Research.

For more information, please contact:

Zoe Dunford

  • +44 (0)7786 303597

About Earlham Institute

The Earlham Institute (EI) is a world-leading research institute focusing on the development of genomics and computational biology. EI is based within the Norwich Research Park and is one of eight institutes that receive strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) - £6.45M in 2015/2016 - as well as support from other research funders. EI operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.

EI offers a state-of-the-art DNA sequencing facility, distinctive in operating multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms to provide access to computational tools and processing capacity for multiple academic and industrial users, and in promoting applications of computational bioscience. Additionally, the Institute offers a training programme through courses and workshops, and an outreach programme targeting key stakeholders and wider public audiences through dialogue and science communication activities.