
Plotting proportional data as nested circles in R

A guide demonstrating how to make a static nested circle plot using the R packages packcircles and ggplot2

23 November 2023

This article is part of our technical series, designed to provide the bioscience community with in-depth knowledge and insight from experts working at the Earlham Institute.


For better or worse, you see proportional data represented with nested circles a fair bit in the media.

Plotting proportional data is almost always best done with some kind of bar chart but, occasionally, I do come across a case where I think nested circles convey a message well (as well as being visually attractive).

In this article I’ll demonstrate how to make a static nested circle plot using the R packages packcircles and ggplot2. If you want interactivity, you can also check out circlepackeR, which creates snazzy HTML widgets.

Dr Rowena Hill is a Postdoctoral Research Scientist at the Earlham Institute, studying the genome of the wheat root pathogen take-all, as part of the Delivering Sustainable Wheat (DSW) programme.

Rowena gained a PhD at the Royal Botanic Gardens, Kew, and Queen Mary University of London, focusing on the diversity and evolution of fungi and plant-fungal interactions. 

Packing circles

First, you need a dataframe with the total and subset values you want to plot.

Here are some data I scraped off MycoCosm showing the number of genome assemblies available for different fungal lifestyles (the num column), which will be my larger circles. I then want the area of each nested circle to represent the subset of that total which has already been published (the num.pub column). This is actually similar to a figure I created for my PhD thesis introduction!

head(mycocosm.lifestyles.df)

 

##       lifestyle num num.pub  colour
## 1     endophyte 142      85 #009E73
## 2    lichenised  16       8 #E69F00
## 3  mycoparasite 105      32 #F0E442
## 4   mycorrhizal 199     123 #56B4E9
## 5 phytopathogen 263     155 dimgrey
## 6    saprotroph 258     178 #0072B2
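If you want to follow along without scraping MycoCosm yourself, the rows printed above can be typed in directly. This is a minimal sketch that assumes the six rows shown by head() are the whole dataframe, which matches the six circles plotted later.

```r
#Recreate the example data from the values printed above
mycocosm.lifestyles.df <- data.frame(
  lifestyle=c("endophyte", "lichenised", "mycoparasite",
              "mycorrhizal", "phytopathogen", "saprotroph"),
  num=c(142, 16, 105, 199, 263, 258),
  num.pub=c(85, 8, 32, 123, 155, 178),
  colour=c("#009E73", "#E69F00", "#F0E442",
           "#56B4E9", "dimgrey", "#0072B2")
)
#Sanity check: a nested value should never exceed its total
stopifnot(all(mycocosm.lifestyles.df$num.pub <= mycocosm.lifestyles.df$num))
```

The colour column can hold hex codes or named R colours, as both are accepted by ggplot2.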

packcircles handles the creation of circles with area proportional to the numbers we give it.

This involves first generating a dataframe with the central point and radius of each circle.

library(packcircles)
#Get radius and x and y coordinates for centre of larger circles
circle.layout <- circleProgressiveLayout(mycocosm.lifestyles.df$num,
                                         sizetype="area")
#Optionally add a small gap between circles so they're not touching
circle.layout$radius <- circle.layout$radius * 0.95
head(circle.layout)
##           x          y   radius
## 1 -6.723095   0.000000 6.386940
## 2  2.256758   0.000000 2.143920
## 3  2.875424  -8.014137 5.492162
## 4  3.958529  10.072890 7.560930
## 5 16.394459  -1.676497 8.692137
## 6 -9.972836 -15.447186 8.609115
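To see where the radius column comes from, it can help to work it out by hand. With sizetype="area" each value is treated as a circle area, so the radius follows from area = pi * r^2. The sketch below uses only base R and the num values from the example data; it illustrates the arithmetic and is not the packcircles code itself.

```r
#Totals from the num column of the example data
num <- c(142, 16, 105, 199, 263, 258)
#Area = pi * r^2, so r = sqrt(area / pi); then apply the optional
#0.95 scaling used to leave a gap between circles
radius <- sqrt(num / pi) * 0.95
round(radius, 6)
```

The result should agree with the radius column printed above to rounding.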

 

We can then generate a dataframe with enough vertices to plot a polygon that looks like a circle.

#Create a dataframe of vertices to draw each 'circle'
circle.vertices <- circleLayoutVertices(circle.layout, npoints=50)
head(circle.vertices)
##            x         y id
## 1 -0.3361547 0.0000000  1
## 2 -0.3865177 0.8004959  1
## 3 -0.5368122 1.5883674  1
## 4 -0.7846681 2.3511895  1
## 5 -1.1261766 3.0769318  1
## 6 -1.5559518 3.7541492  1
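The vertices are simply points spaced evenly around each circle. As a rough illustration of the geometry (a base R sketch, not the actual packcircles implementation), here is the first circle worked out from its centre and radius in the layout above. The variable names cx, cy, r and theta are my own.

```r
#Centre and radius of the first circle, from the layout printed earlier
cx <- -6.723095
cy <- 0
r <- 6.386940
npoints <- 50
#Evenly spaced angles around the circle; the final angle repeats the
#first so that the polygon closes
theta <- seq(0, 2 * pi, length.out=npoints + 1)
vertices <- data.frame(x=cx + r * cos(theta),
                       y=cy + r * sin(theta),
                       id=1)
head(vertices, 3)
```

The first few rows should match the circleLayoutVertices output above to rounding. Increasing npoints gives a smoother outline at the cost of a larger dataframe.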

 

We can now plot the larger circles. We’ll use alpha to make them translucent so that they can be distinguished from the nested circles we add later.

library(ggplot2)
library(tgutil)
#Plot circles
gg.circles <- ggplot() + 
  geom_polygon(data=circle.vertices,
               aes(x, y, group=id, fill=as.factor(id)),
               colour=NA,
               alpha=0.3) +
  scale_fill_manual(values=mycocosm.lifestyles.df$colour) +
  coord_equal() +
  theme_void() + 
  theme(legend.position="none") +
  ggpreview(width=4, height=3, unit="in")

 

6 circles plotted using the above R code
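ggpreview comes from the tgutil package. If you would rather write the plot to a file using ggplot2 alone, ggsave is one alternative. The sketch below uses a toy polygon in place of the circle plot so that it runs on its own; toy.df, gg.toy and out.file are hypothetical names used only for illustration.

```r
library(ggplot2)
#Toy polygon standing in for the circle plot (hypothetical data)
toy.df <- data.frame(x=c(0, 1, 1, 0), y=c(0, 0, 1, 1), id=1)
gg.toy <- ggplot() +
  geom_polygon(data=toy.df,
               aes(x, y, group=id, fill=as.factor(id)),
               colour=NA,
               alpha=0.3) +
  coord_equal() +
  theme_void() +
  theme(legend.position="none")
#Write to a temporary PNG at the same size passed to ggpreview above
out.file <- tempfile(fileext=".png")
ggsave(out.file, plot=gg.toy, width=4, height=3, units="in")
file.exists(out.file)
```

Swapping gg.toy for gg.circles and out.file for a path of your choice should save the real plot in the same way.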

Add nested circles

Now we want to create the polygons for the nested circles, which essentially means repeating the above steps with the nested data.

#Get radius and x and y coordinates for centre of nested circles
circle.layout.pub <- 
  circleProgressiveLayout(mycocosm.lifestyles.df$num.pub,
                          sizetype="area")
#If you previously added a small gap between circles, make sure to
#do so again
circle.layout.pub$radius <- circle.layout.pub$radius * 0.95

 

However, before creating the polygon vertices for these nested circles, we first need to replace their central points with those of the larger circles so that each nested circle is drawn inside its larger circle.

#Replace x and y with those of the larger circles, but keep the
#radius of the nested circles
circle.layout.pub <- data.frame(x=circle.layout$x,
                                y=circle.layout$y,
                                radius=circle.layout.pub$radius)
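It is worth checking that the nested circles really do show the right proportion. Because both sets of radii are scaled by the same 0.95, the ratio of nested to outer area should equal num.pub/num. The sketch below checks this in base R, assuming the example values and the same area-to-radius arithmetic that sizetype="area" implies.

```r
num <- c(142, 16, 105, 199, 263, 258)
num.pub <- c(85, 8, 32, 123, 155, 178)
#Radii for the outer and nested circles, both with the 0.95 gap
radius <- sqrt(num / pi) * 0.95
radius.pub <- sqrt(num.pub / pi) * 0.95
#Ratio of nested circle area to outer circle area
area.ratio <- (pi * radius.pub^2) / (pi * radius^2)
#Should equal the proportion published, up to floating point error
all.equal(area.ratio, num.pub / num)
```

If the 0.95 scaling were applied to only one of the two layouts, every ratio would be off by a factor of 0.95^2 (about 0.9), which is why the gap needs to be added to both.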

 

Now we can generate the vertices and add the nested circles to the plot.

#Create a dataframe of vertices to draw each nested 'circle'
circle.vertices.pub <- circleLayoutVertices(circle.layout.pub,
                                            npoints=50)
#Add to plot
gg.circles.nested <- gg.circles +
  geom_polygon(data=circle.vertices.pub,
               aes(x, y, group=id, fill=as.factor(id)),
               colour=NA) +
  ggpreview(width=4, height=3, unit="in")

6 circles, each with a smaller nested circle, plotted using the above R code

Add labels

Finally, we can make another dataframe with the information needed to label the circles.

#Combine original dataframe with the layout dataframe
circle.labels <- cbind(mycocosm.lifestyles.df, circle.layout)
#Add lifestyle labels to centre of circles
gg.circles.nested +
  geom_text(data=circle.labels,
            aes(x, y, size=num, label=lifestyle),
            fontface="bold") +
  scale_size_continuous(range=c(1.5, 3.5))

The same nested circles, now with labels on

Alternatively, we could label the circles with the original values or the percentage published, and add a colour legend for the lifestyles.

#Add new column with percentage of published genomes
circle.labels$percent <- round(
  circle.labels$num.pub/circle.labels$num * 100
  )
#Add percentage labels
gg.circles.nested +
  geom_text(data=circle.labels,
            aes(x, y, size=num, label=paste0(percent, "%")),
            fontface="bold",
            show.legend=FALSE) +
  scale_size_continuous(range=c(2, 6)) +
  scale_fill_manual(values=circle.labels$colour,
                    labels=circle.labels$lifestyle) +
  guides(fill=guide_legend(
    nrow=3,
    direction="horizontal",
    title=NULL,
    label.theme=element_text(size=7, margin=margin(l=-3)),
    keywidth=unit(7, "pt"),
    keyheight=unit(7, "pt"))
  ) +
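  #A numeric legend.position works in ggplot2 3.4.2 (used here); from
  #3.5.0, use legend.position="inside" with legend.position.inside
  theme(legend.position=c(0.7, 0.15))

The same nested circles, now with percentages on and the labels described in a key

sessionInfo()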
## R version 4.2.2 (2022-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8 
## [2] LC_CTYPE=English_United Kingdom.utf8   
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tgutil_0.1.14     ggplot2_3.4.2     packcircles_0.3.5
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.9        highr_0.10        pillar_1.9.0      compiler_4.2.2   
##  [5] tools_4.2.2       digest_0.6.31     evaluate_0.21     lifecycle_1.0.3  
##  [9] tibble_3.2.1      gtable_0.3.3      png_0.1-8         pkgconfig_2.0.3  
## [13] rlang_1.1.1       cli_3.6.0         rstudioapi_0.14   yaml_2.3.6       
## [17] xfun_0.36         fastmap_1.1.0     withr_2.5.0       dplyr_1.1.2      
## [21] knitr_1.42        generics_0.1.3    vctrs_0.6.2       systemfonts_1.0.4
## [25] grid_4.2.2        tidyselect_1.2.0  glue_1.6.2        R6_2.5.1         
## [29] textshaping_0.3.6 fansi_1.0.3       rmarkdown_2.21    farver_2.1.1     
## [33] magrittr_2.0.3    scales_1.2.1      htmltools_0.5.4   colorspace_2.0-3 
## [37] labeling_0.4.2    ragg_1.2.5        utf8_1.2.2        munsell_0.5.0
Article author

Rowena Hill

Postdoctoral Research Scientist