Long-read RNA sequencing innovation at Earlham Institute

Long-read RNA sequencing is revolutionising genomics. It allows scientists to explore aspects of the genome that were previously hidden, offering greater insights into biology, disease, and evolution.

The Earlham Institute has been an early adopter of this technology, as one of the first scientific institutes in Europe to use long-read sequencing in multiple research applications.

Long-read sequencing overcomes key limitations of traditional short-read sequencing, which is akin to shredding a document into countless tiny strips and then painstakingly piecing them back together - a process that can lead to misplacement or loss of important information.

RNA molecules - known as transcripts - are produced through gene expression and can vary significantly due to alternative splicing, where a single gene gives rise to multiple RNA isoforms. Transcript diversity is incredibly difficult to observe using short-read sequencing.

Long-read sequencing can overcome these challenges by capturing much larger sections of each molecule, making it significantly easier to reconstruct full-length transcripts. The result is a more complete picture, one that highlights the critical role of isoform diversity and transcript regulation in complex biological systems.

Long-read sequencing has enabled the discovery of thousands of previously unannotated transcripts. This matters because alternative splicing - a key driver of transcript diversity - can profoundly affect gene function, and its dysregulation is frequently linked to diseases such as cancer and neurological disorders.

Postdoctoral Researcher Anita Scoones delivers training at EI during the single-cell RNAseq bioinformatics course

Our impact

Our researchers work collaboratively with technical specialists in long-read RNA sequencing to pursue a diverse range of world-changing science.

New perspectives on single-cell analysis

The Earlham Institute has led the development of single-cell long-read RNA sequencing protocols over the last five years, collaborating with multiple academic and industry leaders.

Scientists in the Macaulay Group are combining single-cell methods with long-read sequencing to investigate the role of alternative splicing events in cell differentiation, in both health and disease.

The team have also worked with genetic therapy company ISOgenix in support of their mission to transform the lives of patients with serious diseases through isoform-based therapies.

Sharing their expertise in RNA sequencing and technology development, scientists helped adapt existing technology and workflows to identify protein isoforms from human tissue samples.

LongTREC PhD Fellow Francisco Cervilla Martinez (left), Postdoctoral Researcher Anita Scoones (middle) and Research Group Leader Iain Macaulay (right)

Building an international community

The Institute is also one of the partners on the LongTREC PhD programme - a Marie Skłodowska-Curie Doctoral Network collaboration developing methods and tools for the analysis of transcriptomic data using the most recent single-molecule, long-read sequencing technologies.

Advances in long-read sequencing technologies are transforming transcriptome research for gene expression studies, but challenges persist in both the technical and bioinformatic processing of the data.

LongTREC brings together 11 leading research institutes and organisations - including Oxford Nanopore Technologies - to train the next generation of computational biologists to develop novel applications from transcriptome data.

Students on the LongTREC programme in the Darwin Training Suite at the Earlham Institute during the 2025 Bioinformatics Summer School

Drug discovery

Our work in this area is funded by the international Psychiatry Consortium, a £4 million collaboration between seven global pharmaceutical companies and two leading research charities. Convened and managed by the Medicines Discovery Catapult, the Consortium supports high-value drug discovery projects in psychiatry, an area of unmet patient need.

Using long-read sequencing, it may be possible to identify the proteins that represent the most promising drug targets for the potential treatment of schizophrenia, understand how they affect the function of cells, and begin to develop drugs that alter their function.

Technology advancement

Our state-of-the-art, BBSRC-funded National Bioscience Research Infrastructure in Transformative Genomics supports the full pipeline of long-read sequencing - from sample preparation and quality control, through to sequencing using Oxford Nanopore PromethION and PacBio Revio platforms, and downstream bioinformatic analysis across plant, animal, and microbial systems.

The expertise of our scientific and technical specialists positions the Earlham Institute to continually explore and evaluate novel technologies, driving new research applications and delivering cutting-edge solutions to the wider life sciences community.

Skills development

The Earlham Institute offers comprehensive training in long-read single-cell RNA sequencing, equipping researchers with the skills and knowledge needed to design, perform, and analyse long-read single-cell RNA-seq experiments from end to end.

Delivered through a blend of conceptual lectures, methodological sessions, and hands-on bioinformatics training, the courses cover the full analytical workflow - from quality control, mapping, and annotation through to quantification, dataset integration, single-cell clustering, and data visualisation - all underpinned by best-practice guidance drawn directly from the Institute's own faculty expertise.

By combining world-class technology development, deep biological expertise, and a commitment to collaboration, the Earlham Institute has become a hub for long-read RNA sequencing innovation.

From uncovering hidden transcript diversity to mapping complex genomic regions and accelerating drug discovery, our work is helping scientists see biology in unprecedented detail. As we continue to refine techniques and train the next generation of researchers, we are opening the door to discoveries that will transform scientific understanding.

Research highlights

2015: Dr Iain Macaulay and Dr Wilfried Haerty (at the time at the Wellcome Sanger Institute and the University of Oxford) applied long-read sequencing to single-cell cDNA to identify a gene fusion event.

2020: In collaboration with the University of Oxford, Institute researchers annotated and quantified transcripts arising from a major target for neuropsychiatric disorders, revealing unexpected transcript diversity.

2021: The Institute held the long-read RNA symposium, a virtual three-day workshop for users of long-read RNA-seq, offering case studies, sample preparation tips, bioinformatics for annotation and differential expression, and discussion of advanced applications.

2022: In collaboration with the University of Oxford, Institute researchers developed pipelines to identify and quantify novel transcript isoforms using long-read RNA sequencing in cell differentiation models, discovering approximately 2,560 novel transcripts.

2023: Researchers at the Institute applied single-cell long-read RNA-seq to assess transcript expression across the haematopoiesis pathway, identifying cell-type-specific isoforms.

2025: Earlham Institute researchers, in close collaboration with the Institute's National Bioscience Research Infrastructure in Transformative Genomics and researchers at the University of Oxford, compared existing single-cell long-read protocols. They highlighted the power of applying long-read RNA-seq at the single-cell level by identifying cell-type-specific isoforms, and also identified areas where challenges remain.