Genome Annotation Workshop 2020

This 2-day course gives scientists an overview of advances in Next Generation Sequencing (NGS) technologies and transcriptome assembly, along with hands-on experience of software for transcriptome assembly and for building gene families.

Start date:

12 May 2020

End date:

13 May 2020


09h00 - 17h00


Darwin Suite, Earlham Institute

Registration deadline:

31 March 2020




About this workshop

With the fast-moving pace of sequencing technology and the plethora of options available to you, it can be difficult to know which platform is optimal for your biological question, and therefore how to design your experiment for maximum return from your samples. Once you’ve got the data, how do you know whether it is of good enough quality? We’ll take you through the latest developments in sequencing technology, identify common challenges in producing good RNA for sequencing, and help you to understand what you can expect from your data and what you can do with it.

In this 2-day course, you will:

  • Get an overview of Next Generation Sequencing technologies, including the latest advances
  • Gain a deeper understanding of the benefits of each platform and have the confidence to select the right platform or combination of platforms for your experiment
  • Understand what to look for in a sample that will pass quality control and is likely to produce viable sequencing data, including the expectations for pre-made libraries
  • Learn what to look for in your data using SAV software
  • Learn how to quality control your raw reads
  • Understand the breadth of ways to work with RNAseq data, and therefore the importance of experimental design
  • Get an overview of transcriptome assembly, both de novo and using a reference sequence
  • Gain hands-on experience with Mikado and GeneSeqToFamily with Aequatus

Target Audience

Advanced PhD students and post-doctoral researchers who are undertaking projects involving the assembly of transcriptomes and leading to annotation of those assemblies.


You are expected to have experience of using the command line and to be looking to improve your awareness of different approaches and pipelines. It will be beneficial if you are familiar with Galaxy and its workflow system; if not, please follow the tutorials here:




**Please note that this programme is tentative and may be subject to change. More details will be added over the coming months.**

Day 1 - 12 May 2020



08:30 - 09:00

Arrival and registration

09:00 - 10:00

Welcome and delegate flash presentations

10:00 - 10:30


10:30 - 11:30

Introduction to NGS technologies for Genome Annotation - Dr Karim Gharbi
Including an overview of platforms and theory of library construction

11:30 - 12:00

Practical considerations for isolating high-quality RNA - Leah Catchpole
Including an overview of sample requirements and library quality controls (QC)

12:00 - 13:00

Data QC and overview of data formats - Christian Schudoma
Including QC approaches for different platforms, and useful tools for data conversion and filtering

13:00 - 14:00


14:00 - 15:15

De novo transcriptome assembly - Swarbreck group
Including a step-by-step workflow and comparison of methods

15:15 - 15:45


15:45 - 17:00

Reference-guided assembly - Swarbreck group
Including a step-by-step workflow and comparison of methods

17:00 - 17:15

Close of Day 1

Day 2 - 13 May 2020



08:30 - 09:00

Arrival coffee

09:00 - 10:30

Hands-on: Transcriptome assembly - Swarbreck group
Alternative methods using an example region, and a walkthrough of Mikado for integrating assemblies

10:30 - 11:00


11:00 - 12:00

Long read data processing - Swarbreck and Haerty groups
Including PacBio and Nanopore data processing

12:00 - 12:30

Case Study: Long Read Data - Wilfried Haerty

12:30 - 13:30


13:30 - 15:00

Genome Annotation - Building high-quality gene models
Describing the challenges and alternative approaches

15:00 - 15:30


15:30 - 16:00

Assessing the quality of the genome annotation - Swarbreck group

16:00 - 17:00

Hands-on: Building Gene Families - Anil Thanki

17:00 - 17:15

Close of Day 2

