Cloud Computing Infrastructure for Data-Intensive Bioscience

Working with us

Led by: Dr Rob Davey, Head of e-Infrastructure

Enquiries: If you are interested in working with Earlham Institute's e-Infrastructure team on a project or research, please contact us.

Cloud computing technologies can support research groups with their compute, storage and data management needs. They provide cost-effective and scalable data storage alongside compute services that are close to the data, alleviating data transfer costs and providing an excellent secure environment for collaborative analysis.

If you are a UK-based researcher, get in touch with us to request access to compute resources such as virtual machines. We’ll be happy to recommend suitable services and share our expertise.

We offer:

  • Resources tailored to your needs, from data storage and flexible virtual machines with access to an HPC environment for large-scale analysis, to a software-ready platform for your computing or web hosting needs.
  • Support for complex orchestration of different systems using Docker Swarm or Kubernetes, allowing scalable solutions that can adapt as your requirements change over time.
  • Close collaboration with leading life science researchers.
  • In-house expertise in genomics and bioinformatics.

Our cloud computing resources are available via CyVerse UK, hosted at Earlham Institute. We provide various access levels and services (including infrastructure, platform and software as a service), just like similar commercially available platforms.

However, unlike other service providers, we have in-depth, in-house life science expertise in genomics and bioinformatics, which is reflected in our infrastructure. For example, software tools that our own researchers use routinely are also available in our cloud, giving you easy access to more than 130 pre-installed bioinformatics applications.

Furthermore, in contrast with commercial solutions, we do not charge network ingress and egress fees, making it cheaper to transfer data in and out of our research environment. Finally, we are able to host external-facing websites and web services so that you can run project websites and tools quickly and easily with dedicated domain names.

We are dedicated to providing value-for-money expertise and support to the UK bioscience community. Our service is subsidised by strategic funding from the BBSRC, part of UK Research and Innovation (UKRI), to reduce the financial barrier to cloud data sharing for life science and to make collaboration and knowledge exchange between academia and industry easier.

Our platforms and technologies

Access to virtual machines (VMs), data storage and high performance computing (HPC) via CyVerse UK

CyVerse UK is a cloud computing infrastructure that provides access to bioinformatics services, tools, compute resources and system administration. We have detailed some of what we offer below, and you can find out more at the CyVerse UK website, or by getting in touch.

 

Virtual Machines

If you require additional computational power or a full Linux environment for development and analyses, we can provide you with a custom virtual machine hosted in our secure, private cloud. Contact us for more information, or read more on the accessing virtual machines page on the CyVerse UK website.

A typical virtual machine might include:

  • 1-8 vCPUs
  • 4-32 GB RAM
  • 5 GB local VM storage
  • Backed-up and mirrored Network Attached Storage
  • Access to a dedicated HPC pool powered by HTCondor
  • Access to a software catalogue of common bioinformatics tools

High performance computing

Virtual machines can be configured to submit larger jobs to CyVerse UK’s dedicated compute server pool, powered by HTCondor (an HPC resource with ~2 TB RAM and ~150 cores).
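As a rough illustration of what submitting a job to an HTCondor pool involves, the Python sketch below renders a minimal HTCondor submit description. The `CondorJob` class, the `render_submit_description` function, the `align.sh` script and the resource figures are hypothetical names and values chosen for the example. The way CyVerse UK virtual machines are actually configured for submission is not described here, so please contact us for the details that apply to your VM.

```python
from dataclasses import dataclass


@dataclass
class CondorJob:
    """Hypothetical container for the settings of a single HTCondor job."""
    executable: str
    arguments: str = ""
    request_cpus: int = 1
    request_memory_gb: int = 4
    name: str = "job"


def render_submit_description(job: CondorJob) -> str:
    """Render a minimal HTCondor submit description as text."""
    if job.request_cpus < 1 or job.request_memory_gb < 1:
        raise ValueError("resource requests must be positive")
    lines = [
        f"executable = {job.executable}",
        f"arguments = {job.arguments}",
        f"request_cpus = {job.request_cpus}",
        f"request_memory = {job.request_memory_gb}GB",
        # Standard output, standard error and the HTCondor event log.
        f"output = {job.name}.out",
        f"error = {job.name}.err",
        f"log = {job.name}.log",
        # Queue one instance of the job.
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only; align.sh is a placeholder script name.
    job = CondorJob(
        executable="align.sh",
        arguments="sample.fastq",
        request_cpus=8,
        request_memory_gb=32,
        name="align",
    )
    print(render_submit_description(job))
```

On a machine set up as an HTCondor submit node, the rendered text would typically be saved to a file and passed to `condor_submit`.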

Data Storage

You can store your data in our CyVerse UK Data Store, a storage resource that enables collaborative data sharing through various access mechanisms. Because it is hosted in the UK, it can offer geographical advantages for access speed, and it can help if you have legal requirements that dictate your data must remain within UK/EU jurisdiction.

Data Commons

Using the Data Store, users can store and share data publicly with a unique persistent URL, for example to reference in a publication. This allows other users of CyVerse UK to gain access to and analyse your data immediately, which aids reproducibility and open science.

Data commons

We provide a dedicated, user-friendly graphical interface that lets you access and manage your data in the Data Store; run, save and share your analyses and pipelines; and share jobs and data with collaborators. If you are a developer, CyVerse UK can help you share your software with the world as runnable tools, enabling others to gain access to your software.

Web Hosting

CyVerse UK can provide virtual environments to host websites and services within a typical web domain structure. You can learn more on our dedicated web hosting page.

Working with us

Get in touch with us today to find out how we can help you.

If you are a BBSRC-funded researcher or work in the wider UKRI/HEI sector, we may be able to offer services for free or at cost, and we are happy to assist with costing of our services into research grants. If you are an industrial or commercial customer, we offer a range of pricing for services. For all customers, we are happy to discuss access provided through collaboration agreements.

Case Study: Achieving sustainable wheat through data infrastructure

Secure data storage, sharing and collaborative access pose a great challenge when dealing with large and complex research datasets and analysis. This is especially true for Designing Future Wheat (DFW), the first flagship cross-institute programme of its kind funded by BBSRC, bringing together biologists, breeders, and informaticians spanning eight research institutes and universities.

Within DFW, the Earlham Institute’s main responsibilities are in genomics data generation and analysis, and the provision of computational infrastructure and tools. This includes the Grassroots data management platform, developed at the Institute as an interoperable data sharing platform and repository to standardise access to wheat data.

Working with colleagues at the EI as part of the DFW data management work package is proving extremely useful; their database expertise, and willingness to take on ideas and suggestions is resulting in a unique and widely applicable data management solution.