Memories of sequencing the human genome to mark seven decades of DNA
It’s one of the most recognisable images on the planet - two lines circling each other, joined in the middle by slim rungs.
The elegant spiralling ladder of the DNA helix has not just reshaped the face of biology, it has become a cultural icon - inspiring artists and scientists alike.
April 25 is National DNA Day - a global anniversary celebrating the publication of the paper that demonstrated this double helix structure and recognised its inherent potential to allow the copying of genetic material.
But this year is extra-special.
Not only is it the 70th anniversary of the double helix, but this month also marks the 20th anniversary of the announcement that the Human Genome Project was complete.
This international scientific research project had the goal of identifying, mapping, and sequencing every human gene, both physically and functionally. It remains the world's largest collaborative biological project.
Dr Christine Fosker, Head of the Earlham Institute’s Research Faculty Office, began her scientific career at the Sanger Institute in 1996, where she was among the people working to map out the human genome.
“I have personally finished the sequence of one per cent of the human genome – over 30 mega-bases,” she states.
“I’d graduated and was starting to think about a PhD but I didn’t know what to specialise in. I saw the post at Sanger advertised and thought it sounded like a brilliant opportunity.”
Dr Fosker was employed as a finisher, responsible for polishing the sequence post-assembly.
“We worked on chunks of up to 200,000 base pairs,” she says. “I’d say the average was around 140,000, which the team broke down into smaller chunks of around 2,000 and cloned.
“All the clones were grown up and picked by hand and it would take us around a month to generate the data.”
She says every base had to have double evidence, from two different clones, and ideally from both strands of the DNA.
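The double-evidence rule described above can be sketched in a few lines of Python. This is only an illustration, not the software the finishers used: the `Read` record, its field names, and both function names are hypothetical, and positions are assumed to be 0-based with an exclusive end.

```python
from typing import List, NamedTuple


class Read(NamedTuple):
    clone: str   # hypothetical subclone identifier
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


def under_supported(reads: List[Read], length: int, min_clones: int = 2) -> List[int]:
    """Positions backed by fewer than `min_clones` different clones."""
    clones_at = [set() for _ in range(length)]
    for read in reads:
        for pos in range(max(read.start, 0), min(read.end, length)):
            clones_at[pos].add(read.clone)
    return [pos for pos, clones in enumerate(clones_at) if len(clones) < min_clones]


def lacks_both_strands(reads: List[Read], length: int) -> List[int]:
    """Positions not yet read on both the "+" and "-" strands."""
    strands_at = [set() for _ in range(length)]
    for read in reads:
        for pos in range(max(read.start, 0), min(read.end, length)):
            strands_at[pos].add(read.strand)
    return [pos for pos, strands in enumerate(strands_at) if len(strands) < 2]
```

Any position returned by `under_supported` would still need a read from a second clone. Positions returned by `lacks_both_strands` miss the "ideally both strands" target but are not necessarily failures.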
“We used to mark milestones – for example, a small celebration if a gene was found on a segment you had sequenced. The first one found in the sequence I finished was the gene for Christmas disease (haemophilia B).”
Most frequently the finishers would find a gap in the read and would have to find a way of deducing the missing bases. Multiple smaller clones spanning the region would show what could be expected to be there, so they would design bespoke primers about 100 bases back from the gap and synthesise the missing bases.
“Sometimes you’d be lucky and it would just read straight across; on other occasions, there would be a reason for the original fail - for example, a hairpin bend in the DNA - and we would have to find a way of solving that problem.
“It was like doing a jigsaw with no picture. I really enjoyed solving those puzzles and the complexity of problems.”
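The primer placement the finishers describe, roughly 100 bases back from a gap, can be sketched as below. This is a simplified illustration under stated assumptions: the function names, the 20-base primer length, and the choice to start each primer exactly `offset` bases from the gap are all hypothetical. Real primer design also weighs melting temperature, repeats, and secondary structure such as hairpins.

```python
from typing import Tuple

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gap_primers(left_flank: str, right_flank: str,
                offset: int = 100, primer_len: int = 20) -> Tuple[str, str]:
    """Pick one primer on each side of a gap.

    `left_flank` is the known sequence ending at the gap and
    `right_flank` is the known sequence starting just after it.
    Each primer begins `offset` bases from the gap and points towards it.
    """
    if primer_len > offset:
        raise ValueError("primer would run into the gap")
    if len(left_flank) < offset or len(right_flank) < offset:
        raise ValueError("flanking sequence is shorter than the offset")
    start = len(left_flank) - offset
    forward = left_flank[start:start + primer_len]
    # The reverse primer anneals to the opposite strand, so take the
    # reverse complement of the matching window on the right flank.
    reverse = reverse_complement(right_flank[offset - primer_len:offset])
    return forward, reverse
```

Starting the primer some distance back matters because the first bases after a primer tend to read poorly, so the gap itself falls in the cleaner part of the new read.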
Sequencing the human genome just decades after the structure of DNA had been published required major scientific and technical advances. And these efforts have continued apace during the past 20 years.
The Earlham Institute - originally called The Genome Analysis Centre - was opened in July 2009 as a national centre to further the UK’s capacity in genomics. Our very existence is testament to decades of energy and innovation inspired by DNA.
Reflecting on changes over the past 20 years, Dr Fosker says the biggest surprise is the shift in speed and cost.
“The scale of the work you can do now is extraordinary,” she explains. “At the time, we were dreaming of personalised genomics and sequencing cancer cells – ideas which could become reality, and did become reality.
“I think what we couldn’t imagine was how affordable and quick it would become. The concept of the 1,000-dollar human genome seemed a laughable, impossible goal.
“With the technology platforms we have here at the Earlham Institute, we can now sequence an entire genome in a few hours for a few hundred pounds. In 1996 we would never have imagined that to be remotely possible.”
Whole genome sequencing is now available through the NHS, where it is used to inform diagnosis and treatment for a multitude of conditions.
Sequencing was also a pivotal part of the world’s response to Covid-19 - the virus’s genome was sequenced within weeks of its emergence, giving scientists the chance to identify vaccine targets and track its spread. Scientists also use whole genome sequences to identify DNA patterns associated with genetic traits or disease.
All the work done at the Earlham Institute builds on the discovery of the structure of DNA. And we are part of international projects which match the Human Genome Project in ambition and scale.
We use our expertise in single-cell genomics, bioinformatics, cutting-edge platforms, and data management to contribute to the global Earth BioGenome Project (EBP), launched in 2018, which plans to provide a complete DNA sequence catalogue of all eukaryotic life - every animal, plant, fungus and protist - on Earth.
The UK arm of the EBP - the Darwin Tree of Life project - involves partners across the UK, including the Earlham Institute. Its ambition is to sequence the genomes of the estimated 70,000 species found in Britain and Ireland, generating an open-source catalogue of data.
The Earlham Institute is also leading strategic programmes of research to generate new science and discoveries from large genome collections, as well as exploring genomic variation as a normal phenomenon in healthy systems.
From tackling the rise of antimicrobial resistance to generating genomic resources that help to improve food security, the diverse and exciting portfolio of science at the Institute has the potential to match the discoveries we are celebrating today.
Dr Fosker says: “Everyone I worked with at Sanger was incredibly invested in the Human Genome Project, and I made life-long friendships.
“I was lucky to be there, it is a great memory for me, and I’m now lucky enough to be at the Earlham Institute with similarly dedicated people who are again working on projects that are set to change the face of science.”