What’s the power of a pangenome?

Here at the Earlham Institute a lot of people are very excited about pangenomes. But what are they? And how will they change genetics?

28 May 2025

Researchers are moving away from a single point of reference and instead comparing multiple genomes against each other. The result is unparalleled resolution on genetic diversity - promising new discoveries in evolution, agriculture, and human health.

What is a genome?

A genome is a complete set of genetic material.

Sections of DNA which carry instructions to the cells are called genes. Genes produce proteins which have a direct effect in the body. However, genomes also include non-coding sequences - areas of DNA which don’t directly have an effect, but perform indirect functions such as regulating genes and their expression.

The complete set of genetic material of one individual organism, genes and non-coding sequences alike, is what we call a genome. Studying genomes holds the answers to how we can understand, benefit from and protect life on Earth.

Okay…so what is a pangenome?

A pangenome is a collection of genomes from multiple individuals, typically of the same species. To study genomes, scientists undertake a process called DNA sequencing - this allows them to identify genes and determine functionality and variation across the genome.

For many years, sequencing DNA was time-consuming, expensive, and labour-intensive.

For example, the international effort to sequence the entire human genome - the Human Genome Project - began in 1990 and took 13 years to complete.

This remains one of the most ambitious scientific collaborations in history. It involved researchers from the United States, the United Kingdom (including some now working at the Earlham Institute), France, Germany, Japan, and China, and cost an estimated £2 billion.

Just over two decades later, everything has changed.

Today, advances in technology and refinement of technique mean sequencing has taken off at an unprecedented level. In 2025, high-quality sequencing of a complete human genome takes about a day and costs approximately £750.

This drop in cost - and rise in speed and quality - doesn’t just apply to human DNA. Sequencing of any description is faster and cheaper, opening up extraordinary possibilities for researchers.

Why is a pangenome useful?

When sequencing was time-consuming and costly, scientists had no option but to rely on a single high-quality and well-understood reference genome for each species. All DNA samples from that species would be compared to the reference.

However, this approach has pitfalls.

Genetic differences exist in all species - humans, animals, fungi, bacteria, and plants alike. These differences occur throughout the genome and influence traits including appearance (eye colour, skin colour, leaf variegation, fur colour), disease resistance, growth, size, and lifespan.

Variation means a single reference genome cannot represent all the diversity within a species.

In fact, it might not be representative at all. If the single individual sampled for the reference carries a rare gene variant, then scientists using that reference are working to a “standard” which is anything but.
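
To make this concrete, here is a toy sketch in Python. The ten individuals and their alleles at a single genome position are invented for illustration, and the helper name `count_differences` is our own. It simply counts how many individuals appear to "differ" depending on which allele is treated as the standard.

```python
from collections import Counter

# Hypothetical alleles at one genome position in ten individuals (illustrative data only).
alleles = {
    "ind01": "T", "ind02": "C", "ind03": "C", "ind04": "C", "ind05": "C",
    "ind06": "C", "ind07": "C", "ind08": "C", "ind09": "C", "ind10": "C",
}

def count_differences(alleles, reference_allele):
    """Count individuals whose allele differs from the chosen reference allele."""
    return sum(1 for allele in alleles.values() if allele != reference_allele)

# Case 1: the reference was built from ind01, who carries the rare "T" allele.
rare_reference = alleles["ind01"]
# Case 2: the reference uses the most common allele across all individuals.
common_reference = Counter(alleles.values()).most_common(1)[0][0]

diff_vs_rare = count_differences(alleles, rare_reference)
diff_vs_common = count_differences(alleles, common_reference)

print(f"Differ from rare reference ({rare_reference}): {diff_vs_rare} of {len(alleles)}")
print(f"Differ from common reference ({common_reference}): {diff_vs_common} of {len(alleles)}")
```

With the rare allele as the reference, nine of the ten individuals look like the exception, even though they carry the common version.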

To return to the Human Genome Project, the sequenced genome came from multiple anonymous donors. While identities were kept confidential, we know that more than 70 per cent of the sequence came from one man in Buffalo, New York. The rest came from 19 other individuals, mostly of European descent.

Finishing the sequence was a huge achievement - but not the end of the work. With a global population of more than 8 billion people, spread over 195 countries, that first sequenced genome does not even come close to capturing the variation in our species.

But a world where we can produce high-quality sequences at a low cost, in a short time, allows us to begin stepping away from the bias of the single reference genome.

So what does pangenomics mean for biologists?

Scientists now have the capacity to sequence complete genomes from multiple individuals, combine them into a pangenome, and compare those genomes with each other.

This can help researchers identify which regions are shared, which are unique, where variation occurs, and how frequent it is.

Basically, a pangenome captures a far more detailed picture of a species' genetic diversity. It offers biologists a powerful tool to understand variation, evolution, disease, and much more.

It also offers a spectacular opportunity to re-examine existing knowledge. DNA from previously analysed populations can be mined for variation missed the first time.

The next step for scientists following the Human Genome Project is the Human Pangenome Project, a global effort to map genomes and variation from all over the world. A draft human pangenome reference has already been produced.

How is the Earlham Institute using pangenomes?

We are developing tools to build and analyse pangenomes as part of the Institute’s Decoding Biodiversity strategic research programme. Part of this work includes applying pangenomics to existing and new datasets to uncover new genetic information.

Pangenomes present a large amount of data, which can be refined or manipulated in any number of ways. Because they are so flexible, there are many different methods for analysis. 

A pangenome can be gene-oriented, modelling the presence or absence of genes within a population, or sequence-oriented, focusing on smaller details like single-nucleotide variants, insertions, and deletions.
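
As a rough illustration of the gene-oriented view, the Python sketch below sorts genes by how many genomes carry them. The four genomes, the gene names, and the function `classify_genes` are all hypothetical. The labels "core", "accessory", and "unique" are terms commonly used in pangenomics, not ones this article defines.

```python
# Hypothetical gene content of four genomes from one species (illustrative data only).
genomes = {
    "genome_A": {"gene1", "gene2", "gene3", "gene4"},
    "genome_B": {"gene1", "gene2", "gene3"},
    "genome_C": {"gene1", "gene2", "gene5"},
    "genome_D": {"gene1", "gene2", "gene3", "gene6"},
}

def classify_genes(genomes):
    """Group genes by how many genomes carry them."""
    all_genes = set().union(*genomes.values())
    total = len(genomes)
    groups = {"core": set(), "accessory": set(), "unique": set()}
    for gene in all_genes:
        carriers = sum(1 for genes in genomes.values() if gene in genes)
        if carriers == total:
            groups["core"].add(gene)        # present in every genome
        elif carriers == 1:
            groups["unique"].add(gene)      # present in exactly one genome
        else:
            groups["accessory"].add(gene)   # present in some, but not all
    return groups

groups = classify_genes(genomes)
for name in ("core", "accessory", "unique"):
    print(name, sorted(groups[name]))
```

Real gene-oriented pangenomes first have to decide which genes in different genomes count as "the same", which is a much harder problem than this sketch suggests.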

Different strategies and tools can be used to look at various levels of detail, or organise the pangenome to see particular kinds of information. This could include a sequence-oriented pangenome graph which is constructed to show where genetic differences occur within a species. 
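
The idea behind a sequence-oriented graph can be sketched in a few lines of Python. This is a deliberately simplified toy, not how production pangenome tools work. It assumes the sequences are already aligned and of equal length, ignores insertions and deletions, and uses invented data and function names. Each node is a base at a position, and the graph "branches" wherever individuals differ.

```python
from collections import defaultdict

# Hypothetical, already-aligned sequences of equal length (illustrative data only).
sequences = {
    "ind1": "ACGTAC",
    "ind2": "ACGTAC",
    "ind3": "ACCTAC",
    "ind4": "ACGTTC",
}

def build_graph(sequences):
    """Build a toy graph: a node is (position, base); an edge joins consecutive nodes on a path."""
    nodes = defaultdict(set)   # position -> bases seen at that position
    edges = defaultdict(set)   # node -> nodes that follow it in at least one sequence
    for seq in sequences.values():
        for pos, base in enumerate(seq):
            nodes[pos].add(base)
            if pos + 1 < len(seq):
                edges[(pos, base)].add((pos + 1, seq[pos + 1]))
    return nodes, edges

def variable_positions(nodes):
    """Positions where the graph branches, i.e. more than one base was observed."""
    return sorted(pos for pos, bases in nodes.items() if len(bases) > 1)

nodes, edges = build_graph(sequences)
variant_sites = variable_positions(nodes)
print("Variable positions:", variant_sites)
for pos in variant_sites:
    print(pos, sorted(nodes[pos]))
```

Every individual's sequence is a path through the same graph, so shared stretches are stored once and differences show up as alternative routes.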

The pangenome can also be annotated, which means identifying and describing its functional elements. Annotation marks genes and other functional regions, records variation, predicts where genes start and end, and flags non-coding regions.

Pangenomes offer exciting opportunities for both scientific discovery and practical application. They allow researchers to identify genetic variants, enhance understanding of evolutionary processes, and advance crop and livestock breeding. In medicine, pangenomes could contribute to development of targeted diagnostics and personalised treatments.

By assembling data from multiple individuals, pangenome construction allows researchers to gain deeper insights into the complexity of genetic architecture. By capturing far more of the diversity within a species, it has the potential to transform genomics.

This is an introduction to pangenome research at the Earlham Institute. In later articles, we will be going into more detail, focusing on the work our scientists are doing in this exciting area. 

Article author

Amy Lyall

Scientific Communications and Outreach Officer