Collaborative Open Plant Omics (COPO)
COPO is a data brokering portal for plant scientists to describe, store and retrieve data more easily, using community standards and public repositories that enable the open sharing of results.
The plant science domain has seen the advent of increasingly high throughput “-omics” technologies, resulting in larger datasets being produced more quickly and cheaply than ever before. Although researchers and funding organisations are realising the benefits of data sharing, many scientists still do not use public repositories, choosing instead to store data privately on their own computers or in their organisation’s IT infrastructure. The reasons for this are many and complex, but include a lack of understanding of where and how to deposit data, a lack of common metadata, and a lack of funding to support archiving.
COPO is a platform that bridges the gap between plant scientists and public repositories. It enables aggregation and publication of research outputs, as well as providing easy access to existing data services comprising disparate sources of information via web interfaces and Application Programming Interfaces (APIs).
Despite the opportunities that data sharing offers for recognition and reuse, many scientists still do not use public repositories, choosing instead to store data in private infrastructure. This may be due to unfamiliarity with services and technology, lack of standards and common metadata, or a lack of funding to support archiving. The large number and size of datasets make them difficult to store, let alone download, making cloud-based analysis tools essential. However, submission formats to public repositories are heterogeneous, often requiring manual authoring of complex markup documents, taking scientists out of their fields of expertise.
COPO aims to streamline the process of data deposition to public repositories and data journals by hiding much of the complexity of metadata capture and data management from the end-user. The ISA (Investigation/Study/Assay) infrastructure provides the interoperability between metadata formats required for deposition to repositories. Logical groupings of artefacts (e.g. experimental metadata and results, PDFs, raw data, contextual supplementary information) relating to a body of work are stored in a COPO profile and represented by common open standards, which are publicly searchable. Bundles of data objects can be deposited directly into public repositories (such as the European Nucleotide Archive, DataVerse and Figshare) through COPO interfaces.
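To illustrate the Investigation/Study/Assay hierarchy mentioned above, here is a minimal sketch in Python. It is not the ISA-API or COPO's own data model: the class names, field names, example titles and file names are all hypothetical, chosen only to show how files from one body of work might be grouped under nested investigation, study and assay records and then serialised for exchange.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


# NOTE: all classes and fields below are illustrative assumptions,
# not the real ISA model or any COPO/ISA-API interface.

@dataclass
class Assay:
    """A measurement performed within a study, plus the files it produced."""
    measurement_type: str
    technology: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """One experiment or unit of research within an investigation."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Top-level context that groups related studies."""
    title: str
    studies: List[Study] = field(default_factory=list)


def count_data_files(investigation: Investigation) -> int:
    """Count every data file referenced anywhere in the hierarchy."""
    return sum(
        len(assay.data_files)
        for study in investigation.studies
        for assay in study.assays
    )


def to_json(investigation: Investigation) -> str:
    """Serialise the nested records to JSON (asdict recurses into lists)."""
    return json.dumps(asdict(investigation), indent=2)


def build_example() -> Investigation:
    """Build a small, made-up investigation with one study and two assays."""
    rna_seq = Assay(
        "transcription profiling",
        "nucleotide sequencing",
        ["leaf_rep1.fastq.gz", "leaf_rep2.fastq.gz"],
    )
    metabolites = Assay(
        "metabolite profiling",
        "mass spectrometry",
        ["leaf_rep1.mzML"],
    )
    study = Study("Leaf response to drought", [rna_seq, metabolites])
    return Investigation("Drought tolerance in wheat", [study])


if __name__ == "__main__":
    print(to_json(build_example()))
```

Keeping the metadata in one nested structure like this is what would let a broker translate the same description into the different submission formats that individual repositories expect.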
1. Project website
The COPO project website is the central resource that connects the different components of the project, e.g. project information and collaborators, the application, documentation and the project codebase.
2. Project code
COPO is an open source project, and the application code is made publicly available. We welcome contributions in the areas of issue reporting and code development that will help improve the platform for the research community.
https://github.com/collaborative-open-plant-omics
3. Project documentation and knowledgebase
The project documentation aims to provide easy-to-follow guidance on all aspects of the system.
Etuk A., Shaw F., Davey R., Gonzalez-Beltran A., Johnson D., Rocca-Serra P., Kersey P., Bastow R., Denby K., Sansone S. (2017). COPO: A Data Stewardship Platform for Plant Scientists. International Plant & Animal Genome XXV, January 14-18, 2017, San Diego, CA, USA
Shaw F., Etuk A., Davey R. (2017). COPO - A Web Platform for "FAIR" Data in Plant Science. The 18th Annual Bioinformatics Open Source Conference (BOSC 2017)
Davey R., Shaw F., Etuk A. (2016). Collaborative Open Plant Omics (COPO). Semantics for Harmonization and Integration of Phenotypic and Agronomic Data, 9-13 May 2016, Montpellier, France
Shaw F., Etuk A., Davey R. (2015). COPO - Bridging the Gap from Data to Publication in Plant Science. Bioinformatics Open Source Conference (BOSC) 2015
Shaw F., Etuk A., Davey R. (2015). COPO: Bridging the Gap from Data to Publication in Plant Science. Interoperable infrastructures for interdisciplinary big data sciences (IT4RIs) 2015
Shaw F., Etuk A., Schneider V., Davey R., Gonzalez-Beltran A., Rocca-Serra P., Rocca-Serra S., Kersey P., Bastow R. (2015). COPO - Linked Open Infrastructure for Plant Data. Semantic Web Applications and Tools for Life Sciences (SWAT4LS) Conference
Shaw F., Etuk A., Davey R. (2015). COPO - Linked Open Infrastructure for Plant Data. SWAT4LS, Cambridge, Clare College, 7th-9th December 2015
Oxford e-Research Centre
The Oxford team mainly contributes elements of the open source metadata tracking framework ISA Tools, which is supported by an international community (ISA Commons). The team develops the metadata representation, providing the model and configurations that enable standards-compliant reporting, metadata deposition APIs and services, as well as conversion and validation services. These functionalities are being developed within and incorporated into the wider ISA-API project.
Key Individual: Susanna Sansone
University of York
The York team deals with community liaison, provides a connection to key user groups of the COPO platform, and assists with user testing and usability via the GARNet Network and the Global Plant Council.
Key Individual: Katherine Denby
EMBL-EBI
The role of EMBL-EBI in COPO is to support the submission of data from COPO to the public archives, and to disseminate integrated data sets prepared through the COPO infrastructure through Ensembl Plants and other public resources.
Key Individual: Paul Kersey
With the renewed interest and push from all areas of bioscience to promote publicly available research, the COPO project will be a pioneering national and international effort to facilitate sharing of all aspects of plant research with the public. In particular, COPO aims to overcome the challenges of standards fragmentation by:
(i) fostering development, acceptance and implementation of reporting standards that are immediately suitable for plant research
(ii) limiting the range and variability of standards. This will have a direct impact on the development and maintenance costs for commercial and academic software developers of standards-compliant products.