Alumni

Rob Davey

Visiting Scientist

Prior to joining the Earlham Institute, Robert was a post-doctoral researcher in the National Collection of Yeast Cultures (NCYC) group at the Institute of Food Research (IFR), where he provided tools to analyse the ribosomal DNA of Saccharomyces as well as general bioinformatics support to help drive this important National Capability. He completed his BSc in Microbiology (2001) and his PhD in Bioinformatics (2005), both at the University of East Anglia. His PhD work developed statistical algorithms and end-user software for assessing the gene content of bacterial organisms using Comparative Genomic Hybridisation microarrays.

Robert joined EI in February 2010 as the lead software engineer on the MISO laboratory information management system (LIMS) project, which was released to the community in 2012 as an open-source framework for tracking sequencing experiments. He subsequently became the Core Bioinformatics Project Leader, managing a team of programmers to advance MISO and to develop new projects in data infrastructure and management, including the genomic data visualisation tool TGAC Browser. Robert was appointed Data Infrastructure and Algorithms Group Leader in late 2012, and continues to lead this Faculty group in researching how best to manage, represent and analyse data for open science, as well as exploring new hardware, algorithms and methodologies to develop tools that push the boundaries of data-driven informatics in the life sciences. The team applies its research expertise to develop infrastructure platforms for data and software dissemination and publication, assembly algorithms for viral and microbial metagenomics, large-scale data visualisation, and best practice and training in bioinformatics.

Robert's main interests are in enterprise-grade software development, with over 15 years' experience in system administration, programming, and web service technologies. He enjoys researching data management and associated HPC infrastructure, sequence analysis and quality control pipelines, novel visualisation strategies for sequencing and biological data, and metadata and the Semantic Web. He is also an advocate of the open science ethos.

Publications

LotuS2: an ultrafast and highly accurate tool for amplicon sequencing analysis

ISA API: An open platform for interoperable life science experimental metadata

Norwich COVID-19 testing initiative pilot: evaluating the feasibility of asymptomatic testing on a university campus

COPO: a metadata platform for brokering FAIR data in the life sciences

CerealsDB—new tools for the analysis of the wheat genome: update 2020

Colombia's cyberinfrastructure for biodiversity: Building data infrastructure in emerging countries to foster socioeconomic growth

A Galaxy-based training resource for single-cell RNA-sequencing quality control and analyses

COPO: a metadata platform for brokering FAIR data in the life sciences

TGAC Browser: An open-source genome browser for non-model organisms

CropSight: a scalable and open-source information management system for distributed plant phenotyping and IoT-based crop management

Aequatus: an open-source homology browser

GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees pipeline

Data management and best practice for plant science

An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations

GeneSeqToFamily: the Ensembl Compara GeneTrees pipeline as a Galaxy workflow

Related reading

Five reasons why computing isn’t as scary as you think

Genetic integrity needed for Biodiversity Net Gain to flower

Light-up plants and tunable roots signal new solutions for climate crisis

How the latest platforms are scaling-up our impact in aquaculture

The fish, the fungus, the grass, the bee - and the brassica

COPO: providing context through metadata

Standout innovation contributes to knowledge exchange

Applying spatial transcriptomics in plants

Collaborating for our future

UK plant breeders to benefit from online research tools

New genome assembly finds yeast variant is distinct species

Nanosurgical tool could be key to cancer breakthrough

Science and Technology Secretary announces Engineering Biology investment

Identifying criminals from a single cell

£3m funding for project to chart cellular diversity on Earth