How did genome sequencing change so rapidly?

From sloshing around in radioactive material to high-throughput advanced genomics, Dr Daniel Swan tells us about the changing face of genome sequencing

For those of us who have been in genomics for a while, the face of sequencing has changed beyond recognition.

During my PhD, I spent a good proportion of my student days in the ‘hot lab’, a room dedicated to the radioisotopes that then dominated genomic investigation.

This meant dealing with radioactive Southern blots [1] to identify DNA fractions for size selection, radioactive probing of lambda phage libraries to identify genomic clones of interest, and then 33P Sanger sequencing [2] on polyacrylamide gels, uncovering novel sequences as you went (fully sequenced genomes being in precariously short supply at the time).

Not hot.

Even as I came to the end of my PhD lab work in the late 1990s, the way we generated sequence data was already changing.

I was outsourcing publication-quality work to be run on LI-COR 4200 automated DNA sequencers. These were still run on gels, but lasers detected fluorescent chain terminators – hot lab work no longer required.

Sanger sequencing dominated for a quarter of a century with variations on a theme, and capillary sequencing delivered the throughput required by the Human Genome Project. The Maxam-Gilbert technique [3], despite having a published automation protocol [4], was never developed as a commercial concern.

March of progress.

The Platforms & Pipelines Group at the Earlham Institute (EI) now has a mix of platforms in its labs, and we closed our capillary-based Sanger sequencing operations two years ago. There is still a demand for the technology, but it no longer supports the kind of investigations that our customers and Science Faculty need.

It’s not just the Sanger genome sequencing platforms that we have carefully mothballed, but also iterations of ‘next-generation’ sequencing machines. Our Roche 454s and Illumina GAII are museum pieces already, though we keep them around the lab as a visual reminder that shows visitors the march of progress.

What takes their place is a mix of platforms, from the Life Technologies Ion Proton and a large suite of Illumina machines to the Pacific Biosciences RSII and the Oxford Nanopore MinIONs, which aren’t currently commercially available.

EI’s advanced suite of platforms represents a mix of sequencing by synthesis and direct sensing, covering both optical and semiconductor detection.

Proliferation of platforms.

But why do we need the proliferation of platforms?

Quite simply, they all target different applications in the lab. The MiSeqs are attractive for 16S amplicon studies and small bacterial genomes. The HiSeqs support high-throughput BAC sequencing (the modern equivalent of the genome projects, generating sequence across minimal tiling paths), RNA-Seq and also exome sequencing across a range of plants.

The PacBio is another ideal platform for generating and supporting genome assemblies, and it also has exciting applications via the Iso-Seq protocol, which aids genome annotation efforts by sequencing full-length transcripts.

Scaffolding genomics.

There are also other machines sitting in the lab whose purpose is not to sequence DNA, but to provide scaffolding information for use in more complex genome assembly projects.

They look familiar in that they have optical detection and flow cells, but they take images of strands of copied and labelled DNA as the strands migrate across the field of view.

Our platforms, such as the OpGen Argus and BioNano Genomics Irys, peer into the structure of single molecules of DNA. They allow us to polish existing reference assemblies, create new ones and gather information on larger structural variability from one organism to another.

Software landscape.

One of the challenges of utilising these platforms is their differing requirements for input DNA. The DNA may be heavily trimmed (e.g. in exome sequencing), may need very gentle handling to retain high molecular weight (for optical mapping and the PacBio), or may need to be finely separated by size for mate-pair libraries.

With a profusion of read lengths across the Illumina platforms, each library needs to be tailored to both its application and machine destination. The ability to accurately generate DNA fragments of a given size, and subsequently QC them, requires investment in a number of platforms – Diagenode Megaruptors, Covaris AFA ultrasonicators, Agilent TapeStations, Sage ELF and BluePippins are all critical parts of the lab workflow.

The software application landscape is now changing to reflect this.

Another world.

Bioinformatics is transitioning out of its period of short-read introspection and into a world where Sanger-length reads (and beyond!) can be generated in huge volumes.

EI’s Platforms & Pipelines bioinformaticians are required to support an ever-expanding range of genome sequencing platforms, protocols and investigation types – all with their protocol-specific quirks.

When I walk around our lab I see automation and specialisation everywhere I go; technology and techniques that twenty years ago, at the start of my PhD, would have seemed entirely otherworldly. I’m acutely aware that the few kilobases of mouse genome that I sequenced by hand would barely be a footnote in the daily output of any of our modern genome sequencing machines at EI. It’s a great time to be a scientist in the genomics field – and without a drop of radioactivity in sight!

Where will genome sequencing be in another 20 years?

References

1. Southern EM (November 1975). “Detection of specific sequences among DNA fragments separated by gel electrophoresis”. Journal of Molecular Biology 98 (3): 503–517.
2. Sanger F, Nicklen S, Coulson AR (December 1977). “DNA sequencing with chain-terminating inhibitors”. Proc. Natl. Acad. Sci. U.S.A. 74 (12): 5463–7.
3. Maxam AM, Gilbert W (February 1977). “A new method for sequencing DNA”. Proc. Natl. Acad. Sci. U.S.A. 74 (2): 560–4.
4. Boland EJ, Pillai A, Odom MW, Jagadeeswaran P (June 1994). “Automation of the Maxam-Gilbert chemical sequencing reactions”. BioTechniques 16 (6): 1088–92, 1094–5.
