Impact story: Embracing innovation through technology

With the advent of next-generation sequencing (NGS), life sciences has been ushered into an era where genome sequencing spans all domains of life. Now, rather than struggling to generate genomic data, the challenges lie in storing, assembling and analysing it - turning data into knowledge. We harness and develop new technologies, whilst encouraging a collaborative, open-source culture to drive research into computing, genomics and biological sciences.

Challenges in abundance.

It once took thirteen years and three billion dollars to sequence the human genome. Now, for a fraction of this amount, we can do the same for a wheat genome, five times larger and vastly more complex than our own, in just under three weeks.

However, putting the pieces of the jigsaw together is still immensely challenging. The wheat genome is full of repeated elements, which confound any attempt to put them in a logical order. It is also composed of the genomes of three ancestral species: it’s like having three parents from three different species, so sometimes it’s just too difficult to identify which bit of DNA belongs to which ancestor.

This isn’t restricted to wheat. Many of the genetic elements that we want to identify, for example in order to select for disease resistance, are highly conserved, so picking these apart presents a similar challenge. Then there is non-coding DNA, the stuff between the genes, which we now know can also have an evolutionarily relevant function. So how do we identify meaning amidst the noise?

There is also the matter of the sheer amount of data that needs to be both stored and analysed. Turning the data generated by DNA sequencing laboratories worldwide into information requires huge numbers of calculations, with annual data growth predicted to increase beyond 4000% by 2020. We have supercomputers that can process millions of times more data than the average laptop, yet even this capacity is being overwhelmed, and powering and maintaining these systems presents a growing problem in terms of energy and cost efficiency.

To respond to these significant challenges, we are at the forefront of adopting and developing the latest technologies, from DNA sequencing equipment to novel supercomputers, while fostering open science through international collaborations to advance genome analysis and life sciences research.

Embracing new technologies.

We build systems at scale to amass data and have a vested interest in providing solutions to process it. Our research interests cover a vast range of organisms, many of them non-model species, presenting novel challenges as we often work with little or no reference data. To answer this, we have developed laboratory pipelines and open-source software, and built huge IT infrastructures, so that non-model organisms can be quickly brought up to speed with a high-quality reference genome. We foster collaborative science by opening up our platforms and enabling open-access data sharing.

Technology in our field evolves quickly - demanding that we embrace novel equipment and platforms. The high standards with which we process and sequence samples, using a combination of the most modern platforms, are essential for high-quality data analysis. The data we generate has provided much improved sequences for organisms from wheat to koalas. The software tools that we develop alongside our platforms to deal with this data allow us to fully characterise an organism genetically, from understanding the very building blocks of its DNA to which genes are turned on during a period of adaptation or stress.

Our synthetic biology lab allows us to stay at the cutting edge of biological sciences research. The latest advances in synthetic biology underpin many potential applications, from improving human health to making crop improvement strategies more accurate and effective. It could be that ‘writing’ DNA through synthetic biology is one day as transformative as ‘reading’ DNA through sequencing.

Similarly, we are committed to ushering in a new era in computational science. By helping to develop and test the latest supercomputing technology, including optical and quantum computers, we are enabling the scientific community to store and process ever larger biological datasets - using a fraction of the power and time required by conventional computing systems.

Our leading national and international capabilities allow us to be at the forefront of genomics research, enabling us to impact science on a global scale.

Open data.

A determined commitment to open science, open access and open data allows us to have a significant impact.

By building on existing research and adding our own expertise, we have been able to extend and enhance already cutting-edge software to produce the most accurate and complete wheat genome assembly to date. Now available to all researchers, this assembly will allow accelerated crop improvement of one of the world’s most important staples.

Another important tool is the Brassica Information Portal (BIP). This online resource is of importance to scientists, breeders and industry as a reference database linking phenotypic traits to specific cultivars of all Brassica species. Brassica is one of the most globally important genera of crops for improved nutrition, with India and China being the biggest producers of vegetables such as cauliflower, so more efficient breeding of Brassicas using BIP will have positive effects worldwide.

Other tools developed on-site allow for easier identification of introns within genome sequences and better comparison of genomic datasets. Our help with the community-driven development of the Oxford Nanopore MinION, including development of the alignment and analysis tool NanoOK, is ushering in the next wave of NGS technology.

Finally, our participation in international collaborations, including CyVerse, Collaborative Plant Omics and the Wheat Information System, highlights our open-data, open-access and open-source culture.
