Coordinating bioinformatics data, tools, standards and training in the UK, for Europe and beyond.
Biology is becoming a big data science. Life scientists are generating large data sets, for example of gene sequences, gene expression and genome modification, from experimental populations of plants or animals and from microbial communities. Making the best sense of these data sets is a key objective of the Earlham Institute.
Life science big data vary greatly in type, ranging from multiple genome sequences to geo-located data from field trials. One of the key challenges in the next decade will be fusing this massive range of data to allow new knowledge to be inferred from disparate data sets. The challenges of biological big data go beyond the usual big data challenges (volume, variety, velocity) because of the wide range of tools and data sources that are available.
ELIXIR is working towards an “internet of biological data” that will allow data from these many different sources to be analysed using standard data descriptions and well-characterised tools. Ultimately this will give rise to an integrated platform upon which new rich analyses will be built.
EI has great strengths in data infrastructure for genomics and bioinformatics. The aim of the ELIXIR project is to build on these strengths to take on a leadership role within the UK community as a whole. EI acts as the lead institute for the UK ELIXIR node and as such ensures the smooth running of the node and its interactions with the ELIXIR hub in Hinxton.
A major responsibility of ELIXIR-UK at EI is to develop the activities of the UK node by widening participation of UK groups in ELIXIR. We coordinate the gathering of Expressions of Interest from groups offering data resources, analysis tools, standards and interoperability platforms, and training resources in bioinformatics and computational biology. Through our Scientific Development Group we aim to bring as many of these as possible into the ELIXIR fold.
Another important role, currently in development, is community building across the life sciences. We aim to bring together communities of shared interest, for example in bioinformatics training, interoperability and standards, agriscience-related data and human health and disease data, to develop strategies for improved connectivity and participation in ELIXIR and related initiatives.
The following UK institutions are involved in ELIXIR-UK:
- The Earlham Institute (Neil Hall)
- Cardiff University (Robert Andrews)
- The University of Dundee (Geoff Barton)
- University of Leicester (Tim Beck)
- The University of Nottingham (Tania Dottorini)
- The University of Manchester (Carole Goble)
- Heriot-Watt University (Alasdair Gray)
- The Centre for Ecology and Hydrology (Rob Griffiths)
- University College London (Christine Orengo)
- The University of Liverpool (Steve Paterson)
- University of Bradford (Krzysztof Poterlowicz)
- Rothamsted Research (Chris Rawlings)
- The University of Cambridge (Gabriella Rustici)
- The University of Oxford (Susanna-Assunta Sansone)
- Imperial College London (Michael Sternberg)
- The University of Edinburgh (tbc)
- The University of Birmingham (Ralf Weber)
- Newcastle University (Anil Wipat)
Lead contacts are given for each, although some are involved in more than one ELIXIR-UK resource or service.
ELIXIR operates across a wide range of life science data, from human genomics to large-scale agricultural and environmental data. Its impact will come from making these data more widely available and providing them in more standardised forms.
This will facilitate basic and applied academic research and industry innovation - from more efficient identification of genes and processes involved in rare diseases to improved crop traits - by linking disparate data from a wide range of sources.