Sequencing the wheat genome

Empowering sustainable agriculture by unlocking the genome of this highly complex food crop.

Project summary.

Led by: Anthony Hall Group

Start date: 1 August 2012

End date: 31 July 2017

Duration: 60 months

Wheat is grown on more land than any other crop, at over 225 million hectares. It is also generally regarded as both the most important cereal for direct human consumption and the most significant global source of vegetable protein, with an estimated yearly production of 750 million tons. This makes wheat a vital crop for the populations most exposed to current and anticipated failures in global food security.

Securing food supply on a global scale requires solutions to a complex set of unprecedented problems, including rising demand due to major population increases and social mobility, global climate change, rising energy costs and land, water and nutrient limitations. Finding and implementing these solutions is a top priority for governments and scientists worldwide, and has been articulated as a key BBSRC strategic objective.

This project is a BBSRC strategic longer and larger (sLoLa) grant award that brings together complementary expertise in wheat genetics, genomics and bioinformatics from four UK-based institutes: Earlham Institute (EI), John Innes Centre (JIC), European Bioinformatics Institute (EBI) and Rothamsted Research (RRes). Functional genomics research is carried out in collaboration with the University of California Davis.

The five-year research programme, which began on 1 August 2012, is organised into three inter-dependent themes.


Bread wheat has an exceptionally complex genome composed of three independently maintained genomes, each of which is approximately 6 Gb, more than the entire human genome. Wheat genes are found predominantly in small clusters of 1-4 genes, with an average density of between 1 gene/86 kb in proximal regions and 1 gene/180 kb in distal regions of the chromosome.

Genes and gene islands are separated by extensive tracts of nested retrotransposon repeats comprising approximately 85% of the genome. The gene content of diploid grasses is approximately 30,000-35,000 genes, suggesting that bread wheat has approximately three times this number. The scale and complexity of this genome requires a large coordinated effort and the development and application of new technologies.
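The arithmetic behind these estimates can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the constant and function names are made up for this example, and the inputs are the approximate values quoted above rather than measured results.

```python
# Approximate figures quoted in the text above; estimates, not measurements.
SUBGENOME_COUNT = 3                    # the A, B and D genomes
SUBGENOME_SIZE_GB = 6                  # each genome is approximately 6 Gb
DIPLOID_GENE_RANGE = (30_000, 35_000)  # approximate gene content of diploid grasses


def total_genome_size_gb(count=SUBGENOME_COUNT, size_gb=SUBGENOME_SIZE_GB):
    """Rough total genome size: number of genomes times the size of each."""
    return count * size_gb


def expected_gene_range(count=SUBGENOME_COUNT, diploid_range=DIPLOID_GENE_RANGE):
    """Scale the diploid gene range by the number of genomes."""
    low, high = diploid_range
    return count * low, count * high


if __name__ == "__main__":
    print(f"Approximate total size: {total_genome_size_gb()} Gb")
    low, high = expected_gene_range()
    print(f"Approximate gene count: {low:,} to {high:,}")
```

With these rounded inputs the total comes to roughly 18 Gb, of the same order as the 17 Gb quoted for the Chinese Spring reference genome, and the expected gene count falls between 90,000 and 105,000.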

The project's three inter-related research themes currently aim to:

  • Define complete sequences of all wheat genes and their accurate long-range order in the A, B and D genomes of the reference line Chinese Spring 42
  • Identify and annotate important genome features such as genes and repeats
  • Develop cost-effective sequencing methods for sequencing multiple wheat varieties
  • Generate genomic resources for understanding the functions of wheat genes
  • Create databases and bioinformatics tools to exploit the genome resources for crop improvement and research
  • Establish links with breeders and international projects aimed at securing future supplies of wheat

Access to accurate genome sequence assemblies of wheat varieties and progenitor species will unlock new sources of genetic diversity for breeding and accelerate the production of new varieties. Genome assemblies will also provide key foundations for understanding the complex evolution and domestication of wheat and the functions of wheat genes. The exceptionally large polyploid genome of wheat is a major barrier to genome sequencing and assembly because it is composed of three closely related and independently maintained genomes, each of which contains very large tracts of repetitive DNA that make it much larger than the human genome.

Our strategy has been to build on past sequencing efforts that have identified most wheat genes and their approximate chromosomal locations, and to exploit and adapt new approaches to whole genome shotgun sequencing as the most rapid and cost-effective sequencing strategy.


A searchable BLAST database of improved wheat genome sequence assemblies

A new Whole Genome Shotgun (WGS) assembly of the Chinese Spring reference wheat genome is now available for analysis on the Grassroots Genomics BLAST server at the Earlham Institute (EI) in Norwich, UK. The new assembly captures over 75% of the 17 Gb genome in very large sequence scaffolds.

Visit http://www.wheatgenome.org.uk/Theme1_Outputs.html for further information on the data release.
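To put the assembly figures above in absolute terms, the short Python sketch below converts the quoted percentage into gigabases. It assumes that "over 75%" can be treated as a lower bound of exactly 0.75, and the names used are illustrative only.

```python
# Figures quoted in the text above.
GENOME_SIZE_GB = 17            # total size of the Chinese Spring genome, in Gb
MIN_FRACTION_CAPTURED = 0.75   # "over 75%" is treated here as a lower bound


def min_captured_gb(genome_gb=GENOME_SIZE_GB, fraction=MIN_FRACTION_CAPTURED):
    """Lower bound on the amount of sequence captured in large scaffolds."""
    return genome_gb * fraction


def max_uncaptured_gb(genome_gb=GENOME_SIZE_GB, fraction=MIN_FRACTION_CAPTURED):
    """Upper bound on the amount of sequence not captured in large scaffolds."""
    return genome_gb - min_captured_gb(genome_gb, fraction)


if __name__ == "__main__":
    print(f"At least {min_captured_gb()} Gb captured in large scaffolds")
    print(f"At most {max_uncaptured_gb()} Gb outside them")
```

Under that assumption, at least 12.75 Gb of sequence sits in very large scaffolds, leaving at most 4.25 Gb outside them.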

Technology used.

Work in this Theme has focussed on applying new approaches to whole genome shotgun sequencing to the genome of variety Chinese Spring 42. Our approach has been to integrate several innovations: producing a non-amplified paired-end library and multiple nested long mate-pair libraries, generating long 250 bp reads for deep coverage of multiple libraries, rigorously cleaning data before assembly, and adopting assembly algorithms that preserve the complexity of the sequence during assembly.

These innovations have been coupled to multiple approaches to generating long-range links or scaffolds between sequence assemblies. We have also developed efficient high-throughput methods for sequencing large-insert BACs and are sequencing these to complement the whole genome assemblies. To develop resources for gene identification and annotation, we are generating sequences of full-length RNAs using PacBio SMRT sequencing and Illumina strand-specific sequencing of RNA.

Impact statement.

The transformative effect of access to a high quality genome sequence that is carefully analyzed, and directly and freely available to all users, is well known. Wheat is one of the three major crop plants of global importance, and the predicted impact of a high quality wheat genome resource on crop improvement will be profound, as genomics provides a framework for new breeding methods that are substantially faster and more effective.

The wheat genome project will have an immediate, global impact on a wide range of new research in wheat. Systematic study of protein sequence variation, global gene expression, and the systems-level analysis of biological functions will transform research into crop improvement. Consequently, progress towards increasing yield stability and sustainable production will be substantially accelerated.

A key impact will be direct and permanent improvements in the rate and scope of wheat breeding, leading to the production of new wheat varieties that can maintain high levels of productivity with reduced inputs. Wheat growers will benefit from new varieties that are more productive and have new end-uses, leading to more stable incomes and diversified production, while research into nutrient- and water-use efficiency could significantly reduce the environmental footprint of growing wheat. In turn, consumers will benefit from more stable prices and access to a staple food.