240 mammal genomes open for all in groundbreaking collaboration
11 November 2020
Earlham Institute researchers have contributed to the launch of 240 mammal genomes, the first ever consortium-driven genomics project that allows anyone to access the publicly available data prior to analysis. The resource - which includes 120 completely new genome sequences - will vastly accelerate our understanding of human health and conservation genomics.
The data, published today in Nature, will allow a much more detailed understanding of the human genome - including the genes causing common and rare human diseases - that may lead to both better diagnostics and more specific treatment options.
The publicly available data has already proven useful in understanding the genetic basis of virus transmission in different species, including humans, bats and pangolins - an analysis which was done quickly in the wake of the COVID-19 pandemic.
“The objective is to provide a multitool for comparative genomics. To increase the coverage of the mammalia at the family level,” says Dr Will Nash, a postdoctoral researcher in the Haerty Group at Earlham Institute (EI). “The project covers about 80% of mammal families, which means our understanding of the mammals is more broad.
“We now have a wide, comparative dataset that allows us to understand the DNA bases that are under evolutionary conservation - that haven’t changed in a very long time, in about 110 million years of mammalian evolution. We can really zoom in on single nucleotides that have stayed at that exact same position in the DNA sequence over all this time.”
Previously, researchers could only confidently look at preserved regions of about 12 DNA bases, 12 letters of the DNA code. With this new dataset - because there is such a wealth of data to compare from across this highly interrelated group of animals - researchers can drill down to a single letter. The genome can now be viewed in much greater resolution: it is like comparing images from a new 12-megapixel iPhone camera with the pixelated ones generated by an early smartphone.
To emphasise the usefulness of this, Dr Wilfried Haerty and collaborators at the University of Oxford are already using the new data to investigate changes in the DNA sequence of the regulatory regions in a type of calcium channel, mutations in which have been linked to schizophrenia and bipolar disorder. Understanding precisely where the changes are conserved, or specific to a certain condition, can help to guide therapeutic interventions.
“The main driving force for the 200+ mammal genomes project was to get that resolution from 12 to 1 nucleotide,” says Dr Wilfried Haerty, group leader at EI, who says that while the resource is of great use for studying human disease, we can learn a whole lot more besides by shifting our perspective. “But, as is very much highlighted in the paper, instead of focusing on humans this allows us to stop thinking in such an anthropocentric manner.”
Among the 240 mammal genomes are 34 species of bat, 16 species of cetacean and more than 60 rodents. The strength of the new resource is that it lets researchers compare any species of interest with any of the others, not just with humans. That aspect will be of particular benefit to researchers looking to save species on the brink of extinction, which are undergoing severe genetic bottlenecks.
“Look at every genome project, every alignment available. It’s a human reference based alignment, or a mouse reference based alignment. It’s not based on anything else,” Haerty explains. “With the release of the 240 mammal genomes, it’s a different way of thinking. We can put the focus on other species.
“This is a dataset we’ll be using for years that will very much revolutionise the way we are thinking. It’s very exciting, both in terms of human health and for conservation genomics.”
Notes to editors
You can see more information about the 240 mammal genomes project, led by the Broad Institute as part of an international consortium of 28 institutes, at the Zoonomia website: https://zoonomiaproject.org/
Over half the samples, many of which include rare and threatened mammals, came from the San Diego Frozen Zoo - a unique collection of blood and tissue samples that includes animals that are now extinct.
For more information, please contact:
Scientific Communications and Outreach Manager, Earlham Institute (EI)
- +44 (0)1603 450 994
The Earlham Institute (EI) is a world-leading research institute focusing on the development of genomics and computational biology. EI is based within the Norwich Research Park and is one of eight institutes that receive strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) - £5.43m in 2017/18 - as well as support from other research funders. EI operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.
EI offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms to provide access to computational tools and processing capacity for multiple academic and industrial users, and in promoting applications of computational bioscience. Additionally, the Institute offers a training programme through courses and workshops, and an outreach programme targeting key stakeholders and wider public audiences through dialogue and science communication activities.