Students need to up their bioinformatics game: why I am learning to code in Python

UEA undergraduate in Molecular Biology and Genetics Georgia Whitton believes it’s paramount for students to learn the importance of bioinformatics, and has some top tips on the best ways to learn to code.

April 27, 2020

When Georgia Whitton was unsuccessful in getting a summer placement at EI last year, she wasn’t disheartened. Instead, she spent the summer learning how to code. Now, for her third-year undergraduate research placement, she’s working as a bioinformatician in the very same lab that said no, and contributing to work set to be published in the near future.

EI spoke to Georgia to find out why bioinformatics has rekindled her hopes to pursue a career in science, how she learned Python, and why she thinks it’s paramount that students switch on and realise how important it is to be doing bioinformatics and learning how to code.

How did your journey with Earlham Institute begin, to get you to where you are now?

Last summer, when I was looking around for internships, I got chatting to an EI alumnus at the cocktail bar I work at on weekends. He told me about the research assistant experience placement at EI and helped me craft an application (thanks Harbs!). I got through to the interview stage, and as much as I thought I aced the interview, it was clear I lacked the technical skills to take on such a data-heavy project.

Whilst relaxing on the Norfolk beaches, wondering what on earth I was going to do with my summer, I requested some post-interview feedback from the PI, Wilfried Haerty, who I had applied to, and he suggested I learn to code… And that was that!

I spent my summer completing various online coding courses, attending software workshops, and reading around on tools and languages. I had realised that not being able to code was a barrier and I was determined to break it down.

When returning to UEA for my final year, I was delighted when the list of potential supervisors for our Research Projects included various supervisors from the Earlham Institute and other sites across the Norwich Research Park (NRP), but I was even more delighted to see Wilfried’s name on that list. Perfect! I’m back and I’m better.

Before learning to code I would not have been ready to do a bioinformatics project, let alone even consider one. But now, equipped with my new knowledge and interest in coding, I was eager to apply this biologically at EI.

What are you working on?

I’m currently in the Haerty Group working on characterising alternative splicing patterns in voltage-gated calcium channels in the human prefrontal cortex across development, between genders and ethnicities, and between controls and cases of schizophrenia. CACNA1C is a psychiatric risk marker for schizophrenia and codes for the alpha 1 subunit of voltage-gated calcium channels. It is expressed in multiple locations in the body, including the cardiac system, but particularly in the brain. With alternative splicing activity particularly prevalent in the brain, investigating the effects of alternative splicing on CACNA1C in the human prefrontal cortex is incredibly exciting.

In collaboration with Prof Elizabeth Tunbridge at the University of Oxford, Wilfried and his team recently used Oxford Nanopore’s MinION to accurately annotate the transcript of CACNA1C, revealing 38 novel exons. I’m using this novel transcript to mine through a large RNA-seq dataset from the Lieber Institute for Brain Development in Baltimore, spanning pre-birth up to old age, to investigate whether the different transcripts produced by alternative splicing are differentially expressed.

The treatment options for people with schizophrenia are currently limited, with varying side effects from different antipsychotics. Many of these people will be trialling multiple medications with little benefit for months on end. That’s something I hope my work can help to change. Not only is it exciting to be working on an area of science contributing to improving human health but it is also a great honour to be a part of the legacy of those individuals in Baltimore who donated their brains to research.

Georgia presenting her research project at the UEA WISE Conference

How did you learn to code?

There are so many online learning platforms out there that it was a struggle to know where to go. I ended up starting on Codecademy, which offered a free trial. I actually signed up six times with different email addresses for free accounts before committing to a yearly membership, but I wanted to be sure that my hard-earned money was well spent.

Codecademy was so useful to me (and still is!). The website GUI is really helpful, there’s lots of help on YouTube, and various exercises to download and complete on your own machine. They also have an app, where you can revise concepts on flash cards and have mini quizzes. After completing Learn Python 3, Learn the Command Line and Learn R I was feeling much more comfortable with it all.

After I had essentially “learned my coding alphabet” I just threw myself at any extra-curricular coding experiences I could find. I signed up for a free week-long workshop on SLiM genetic simulation software, which Ben Haller from Cornell University hosted at UEA. Although extremely out of my depth and surrounded by postdocs and PhD students from the UK and abroad, throwing myself in at the deep end was an invaluable experience and got me hooked on the idea of using computational tools in biology.

To further increase my exposure to coding in general, I also completed a Code First: Girls coding course in HTML and JavaScript, where I learnt the basics of web development and, importantly, was first introduced to GitHub, Git and version control.

Over the Christmas break, I worked on some bioinformatics exercises available on the internet from technical interview questions in preparation for my time at EI. The Institute hosts lots of workshops and training events I wasn’t aware of before starting here. I attended the Software Carpentry workshop in January 2020 and was surprised by how much I had taught myself, but the professional training was brilliant and helped me understand concepts in greater detail.

Georgia (pictured front-right) networking with other participants at the SLiM genetics software workshop

How easy was it to learn to code at first?

At the beginning it felt relatively easy as all the introductions start off very basic.

x = 1

print(x)

> 1

I’m a pro! Sort of…

It’s like learning any new skill, like learning to bartend or learning to paint; at the start you are ignorantly incompetent and this stage of learning is great! It’s all very new and exciting, but as you begin to understand more of what you are doing, you gradually become fully aware of how little you know, and the breadth of what there is to learn seems immense.

I must admit, I found learning to code quite addictive and that definitely makes it easier to develop.

What is the hardest thing about coding?

For someone who’s very determined, I paradoxically suffer from low self-confidence, so for me the hardest part about coding was feeling as if I was not progressing. My supervisor suggested I keep a weekly log of my personal and technical development, which has been a great idea. Some weeks I feel like I’ve not progressed very far, but after looking back at my log, I can see that the problem I’m trying to solve wouldn’t have even made sense to me the week before.

Perhaps the hardest part of coding for me is being able to visualise in my mind what I want to write but lacking the knowledge of how to build it. Thanks to Google, however, everything you could need to know is out there somewhere, you just need to figure out where to find it.

How are you applying what you learned to your current project?

I use Bash scripting at the command line to organise, manipulate and interact with my files. My RNA-seq files are so large that high-performance computing is essential. I’m using Python, particularly the pandas library, to manipulate my data, and R to run downstream analyses on my splice counts.

As a Molecular Biology and Genetics student, it’s really great to use these computational tools on human data to answer biological questions that I’m interested in.
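
For readers curious what that kind of pandas step can look like, here is a minimal sketch. It is not the code used in the project: the table layout, the column names (`sample`, `group`, `transcript`, `reads`) and the numbers are all invented for illustration. It turns raw read counts into per-sample transcript proportions and then averages them by group.

```python
import pandas as pd

# Toy, made-up counts; column names and values are hypothetical.
counts = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s2", "s2"],
        "group": ["control", "control", "case", "case"],
        "transcript": ["t1", "t2", "t1", "t2"],
        "reads": [30, 10, 20, 20],
    }
)

# Total reads per sample, aligned back to each row of the table.
sample_totals = counts.groupby("sample")["reads"].transform("sum")

# Share of each sample's reads assigned to each transcript.
counts["proportion"] = counts["reads"] / sample_totals

# Mean proportion per group (rows) and transcript (columns).
summary = counts.pivot_table(
    index="group", columns="transcript", values="proportion", aggfunc="mean"
)
print(summary)
```

A summary table like this could then be written out with `summary.to_csv(...)` and picked up in R for the statistical testing, which is one common way to split work between the two languages.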

How would you describe your time at EI?

I’m thoroughly loving my time here! Being one of a few undergraduates in an institute is very exciting. I have my own desk which is pretty cool, and you’ll only see me away from it when I’m grabbing a coffee from the much-loved EI coffee machine.

My lab group is great, supplying not only great quality (I’m being polite here) jokes, but freshly baked goodies every week! The support, encouragement and enthusiasm here have been fantastic and I feel very lucky to have been able to do a project here and build invaluable technical skills.

Are you surprised that more students aren’t taking up this sort of opportunity?

Doing my project here might well be the best part of my degree thus far. Working inside Earlham Institute has allowed me to develop skills professionally, technically and personally, as well as giving me the ability to work with guided independence. It was a real shock when our research project co-ordinator at the university had to urge our cohort to utilise supervisors at the research park. Unfortunately, not enough students have discovered the wonders of bioinformatics just yet.

Bioinformatics is brilliant and I would urge any students with hesitations to just throw themselves in! I think it’s important to be able to analyse and understand your own data, and using bioinformatics to do that allows you to really deeply investigate your biological questions.

It’s also inherently collaborative. So many awesome tools and software packages are open source, and scientists are constantly editing and improving them. Bioinformaticians are eager to help each other, which is why there are so many forums online with questions and answers to most of your problems. It is such an exciting field to be a part of.

What should universities be doing to promote coding and bioinformatics to students?

From my experience, universities need to go further than just making students aware of bioinformatics and actually showcase it as a discipline in its own right.

We had so little bioinformatics in our curriculum that, as we progressed further into our studies, it seemed more and more daunting and increasingly difficult to attempt. Some students would actively avoid it as a research project choice. Rather than training biologists to think “a bioinformatician will deal with my data”, universities should be training biologists to think “I will learn how to deal with my data using bioinformatics tools”.

In terms of coding, you can appreciate that some people still see this as even less relevant to biology! Coding is not only fun, addictive, and challenging but it’s also extremely transferable to many other industries and disciplines. In the long run, coding is a tool to make your life much easier when using a computer.

I’ve learnt so much over this past year, which has been so exciting, and I hope that other students can experience the fun of bioinformatics like I have.

Coding is a valuable and useful transferable skill that can be applied across many different industries

What are your plans for the future?

Personally, I really didn’t fancy a wet-lab career and I was searching around for other science-related jobs but nothing was really calling out to me. Now I’ve come to learn that I can contribute to ground-breaking research in areas I’m interested in all whilst sitting at a computer.

Having only recently discovered the world of bioinformatics, I don’t have any plans set in stone. I am, however, certain that I want to continue learning to code and building on the fundamental skills I have developed during my project at EI.

If I can keep developing, I’ll give myself the best chance in the future of being able to operate in that exciting space between computing and biology in order to improve our understanding of human health and disease.

Article author

Peter Bickerton

Scientific Communications & Outreach Manager