New perspectives on human health and biodiversity using cell atlases

Scientists have been characterising and classifying cells in increasing detail for centuries. From the development of new disease treatments to the genetic engineering of crops designed to withstand climate change - cells could contain many of the answers to major global challenges.

09 September 2025

Most cells don’t exist in isolation - their environment and neighbouring cells often have a significant impact on function and behaviour. 

A cell atlas is designed to give a comprehensive insight into this impact - enabling scientists to map structural features of cells, tissues, and organs to build a picture that wouldn’t be visible by studying cells in isolation. 

Professor Irene Papatheodorou is Head of Data Science at the Earlham Institute. Her group develops tools and methodology to integrate data, standardise metadata, build cell atlases, and analyse these atlases across different species.

“Our team works across multiple domains - from health to agriculture and biodiversity - with a focus on developing computational methods that are needed for multiple biological questions,” explains Irene.

Spatially resolved image of lung tissue from Dr Sonia Fonseca at the Earlham Institute

Why is a cell atlas useful?

Cell atlases provide a powerful resource for a wide range of biological research. They can allow scientists to compare cell types across healthy and diseased states, or at different stages of development. Such comparisons can reveal pathogenic cell populations and highlight potential therapeutic targets.

Cell atlases can also be used to examine how the abundance of specific cell types changes under different conditions, such as environmental stress, disease progression, temperature sensitivity, or immune response.
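As a rough illustration of what comparing abundance can look like in practice, the Python sketch below computes the proportion of each cell type in two samples and reports the difference. It is a minimal sketch under stated assumptions, not the Earlham Institute's method: the function names and the toy labels are hypothetical, and a real analysis would also need replicates and a statistical test.

```python
from collections import Counter


def cell_type_proportions(labels):
    """Return the fraction of cells assigned to each cell type."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def abundance_shift(control_labels, condition_labels):
    """Proportion in the condition minus proportion in the control, per cell type."""
    control = cell_type_proportions(control_labels)
    condition = cell_type_proportions(condition_labels)
    cell_types = sorted(set(control) | set(condition))
    return {ct: condition.get(ct, 0.0) - control.get(ct, 0.0) for ct in cell_types}


# Hypothetical per-cell annotations, invented for illustration only
control = ["epithelial"] * 6 + ["immune"] * 2 + ["fibroblast"] * 2
stressed = ["epithelial"] * 4 + ["immune"] * 5 + ["fibroblast"] * 1

shift = abundance_shift(control, stressed)
for cell_type, delta in shift.items():
    print(f"{cell_type}: {delta:+.2f}")
```

A positive value means the cell type makes up a larger share of the sample under the condition than in the control.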

By identifying and characterising individual cells and their functions, cell atlases help researchers understand how organisms work, distinguish healthy from diseased states, and ultimately develop targeted treatments.

“What really fascinates me is the diversity of different cellular interactions, the robustness and plasticity of cells that is being uncovered by generating and studying these atlases.

“Our recent work in lichen (a symbiotic system) has been particularly fascinating, to see how cells from unicellular organisms interact and specialise to make up a multicellular organism,” says Irene.

How is the Earlham Institute using cell atlases?

At the Earlham Institute, we combine our computational expertise in single-cell genomics with cutting-edge data science and AI to work on global cell atlas projects.

Scientists in the Papatheodorou Group work across the whole pipeline, from atlas curation, including metadata management and study integration, to data analysis, visualisation, and inference.

We’re collaborating widely with partners in the Biodiversity Cell Atlas and the Human Cell Atlas, in particular on atlases for the gut and Crohn’s disease. We are also comparing species (human, plant, mouse, fruit fly, lichen) to identify cell-type-specific gene expression patterns.

Cross-species analysis allows scientists to map and compare how cell function and gene expression differ across evolutionary distances, helping us better understand the relationship between gene expression, phenotype, and function. This insight will aid global initiatives to conserve biodiversity by increasing our understanding of how vulnerable different species are to environmental stressors.

The Group is also contributing to the Plant Cell Atlas (PCA) initiative, which is comprehensively describing plant cell types and their spatial organisation using high-resolution data. Our contribution includes developing our own single-cell sequencing and analysis platforms.

The compiled atlases can help to answer complex biological questions about cellular function, diversity, and evolution, such as how cells differ across species and how tissues are composed within an organism.

In the long term, cell atlas analysis could help with the development of biomarkers for different diseases and the repurposing of existing drugs for multiple diseases. In agriculture, it could enable a better understanding of the complex interactions between pests and plants in order to develop targeted treatments.

“It’s really exciting to see the new experimental methods being coupled with developments in machine learning and AI, and how they are really increasing our insights in such a rapid way,” Irene adds.

Prof Irene Papatheodorou, Head of Data Science at the Earlham Institute, leads projects working with a number of cell atlas consortia, including the Plant, Human, and Biodiversity Cell Atlas projects.

Global challenges

The cell atlas community holds the potential to unlock important new insights, but bringing together these vast datasets and multiple studies from across the globe comes with inherent challenges.

One of the major challenges facing the community is the integration of datasets generated using multiple technologies, sampling methods, and annotation standards. Differences in sequencing platforms, library preparation methods, and computational pipelines can introduce unwanted variation, while inconsistent metadata standards make it difficult to compare and reproduce studies across cell types.

As part of the Earlham Institute’s Cellular Genomics research programme, data scientists are currently working on metadata standards for the single-cell cell atlas community, developing scalable and FAIR (findable, accessible, interoperable, reusable) systems for large-scale data integration, and pipelines to harmonise and analyse single-cell data.
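To make the metadata problem concrete, the Python sketch below maps differently named fields from two studies onto one shared vocabulary and flags anything required that is missing. It is a minimal sketch under stated assumptions, not a description of COPO or any Earlham Institute pipeline: the field names, the synonym table, and the required fields are all hypothetical.

```python
# Hypothetical synonym table: maps field names seen in submissions
# onto one canonical name. Real standards define far richer schemas.
FIELD_SYNONYMS = {
    "organism": "species",
    "species": "species",
    "tissue": "tissue",
    "tissue_type": "tissue",
    "tech": "platform",
    "platform": "platform",
}

# Hypothetical minimum set of fields needed before studies can be compared
REQUIRED_FIELDS = ("species", "tissue", "platform")


def harmonise(record):
    """Rename known fields to canonical names and list required fields that are absent.

    Unknown fields and empty values are dropped rather than guessed at.
    """
    clean = {}
    for key, value in record.items():
        canonical = FIELD_SYNONYMS.get(key.strip().lower())
        text = str(value).strip()
        if canonical and text:
            clean[canonical] = text
    missing = [field for field in REQUIRED_FIELDS if field not in clean]
    return clean, missing


# Two invented records describing samples with inconsistent field names
study_a = {"Organism": "Homo sapiens", "Tissue ": "Lung", "tech": ""}
study_b = {"species": "Mus musculus", "tissue_type": "Gut", "platform": "10x"}

for record in (study_a, study_b):
    clean, missing = harmonise(record)
    print(clean, "missing:", missing)
```

Reporting missing fields instead of filling them in keeps the gaps visible, so they can be chased up with the original data submitter.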

“Missing and inconsistent metadata poses a big problem in the community, impacting both the sharing of raw sequencing data and integrating atlases. Tools such as COPO are helping to address these challenges and support the community, but another challenge we face is the integration of atlases, and the wider dissemination and usability of the resulting trained AI models,” says Irene.

International collaboration is critical to addressing these challenges: you need a range of expertise to understand them from different perspectives.

“From designing and interpreting the experimental methods, and the metadata generated, to understanding the limitations and strengths of the machine learning models. It requires a range of computational expertise from bioinformatics to statistics and machine learning experts who can examine and interrogate the limits of data integration, alongside the software engineers who can develop the scalable tools for handling ever-growing datasets,” adds Irene.