Today sees the publication of the paper by Zhihao Ding, YunYun Ni, Sander Timmer and colleagues (including myself) on local sequence effects and different modes of X-chromosome association as revealed by the quantitative genetics of CTCF binding. This paper represents the joint work of three main groups: Richard Durbin’s at the Sanger Institute, Vishy Iyer’s at U. Texas, Austin and my own at EMBL-EBI. I’m delighted that this work from Zhihao, YunYun and Sander (the three co-first authors) has finally come out, and want to share some aspects of the work that were particularly interesting to me.
RNA is now a first class bioinformatics molecule.
RNA research is expanding very quickly, and a public resource for these extremely valuable datasets has been long overdue.
A cheat’s guide to histone modifications
I was recently having lunch with Sandro, a charming Neapolitan computer science graduate doing a postdoc in my research group, who has a passion for great food and clean C code. We were discussing some recent aggregation results of histone modifications, and Sandro was bemoaning (verbally and non-verbally) the fact that all the histone modifications sounded “just the same”. I could relate to the sentiment, recalling my own journey into this world some seven years ago during the start of the ENCODE project when I first faced this bamboozling list of modifications.
CRAM goes mainline
Two weeks ago John Marshall from Sanger announced SAMtools 1.0 – one of the two most widely used Next Generation Sequencing (NGS) variant-calling tools, embedded in hundreds if not thousands of bioinformatics pipelines worldwide. (The majority of germline variant calling happens either through SAMtools or the Broad’s GATK toolkit.) SAMtools was started at the Sanger Institute by Li Heng when he was in Richard Durbin’s group, and it has stayed at Sanger, now under the watchful eye of Thomas Keane.
Scaling up bioinformatics training online
Bioinformatics has grown very quickly since the EBI opened 20 years ago, and I think it’s fair to say that it will grow even faster over the next 20 years. Biology is being transformed into a fundamentally information-centric science, and a key part of this has been the aggregation of knowledge in large-scale databases. When you put all the hard-won information about living systems together – their genome sequences, variation, proteins, interactions with small molecules – it is, potentially, incredibly useful. I say “potentially” because even the most pristine, large, interconnected data collection in the world isn’t worth much if people don’t know how to use it.
“DNA” as a cultural icon
Over the years I’ve been fascinated to see where the word “DNA” and the iconic double helix turn up in everyday life. It’s become so commonplace that phrases like “corporate DNA” are in common usage and the double helix has pride of place on beauty cream adverts and many other places. While it’s interesting to see genomics enter the cultural lexicon, I think the general understanding of what DNA does and does not do has rushed way beyond the scientific case. But can we contain the spread of the idea of DNA as a cop-out?
‘Big Data’, genetics and translation
Today sees the announcement of a new public–private partnership in science – the Centre for Therapeutic Target Validation – between EMBL-EBI, the Sanger Institute and GlaxoSmithKline (GSK). The collaboration is dedicated to developing a framework for biological target validation so that we can reduce the amount of time it takes to discover new therapies.
This is a really exciting initiative for me personally, both because the science is challenging and because I have been appointed as interim Head of the CTTV over the next year whilst we look for a long-term Head to steer the collaboration. It has already been a fascinating journey for me to understand the pharmaceutical industry in more depth, and really get to grips with an important scientific problem.
New media for science: 3 years in.
Around three years ago, I decided to use social media in order to engage – as a scientist – with a broader group of people. Since then, I’ve come to see platforms like Twitter and blogs as a way to reshape public scientific discourse – mainly for the better but with considerable adjustments. Recently, the 5000th person started following me on Twitter, which I think represents a kind of turning point and perhaps a good opportunity to reflect on the pros and cons of using these new channels to communicate with a wider audience.
The Start of a Journey
Last week a new paper, “Policy challenge of clinical genome sequencing,” led by Caroline Wright and Helen Firth and on which I am a co-author, was published in the British Medical Journal. It lays out the challenges of making more widespread use of genetic information in clinical practice, in particular around ‘incidental findings’. Caroline and I have a joint blog on this paper on Genome Unzipped.
This paper also marks an important watershed in my own career, as it is my first paper in an outright clinical journal. Like many other genomicists and bioinformaticians I have started to interweave my work more tightly with clinical research, as the previously mainly basic research world of molecular biology begins to gravitate towards clinical practice.
Making decisions: metrics and judgement
The conversation around impact factors and the assessment of research outputs, amplified by the recent ‘splash’ boycott by Randy Schekman, is turning my mind to a different aspect of science – and indeed society – and that is the use of metrics.
We are becoming better and better at producing metrics: more of the things we do are digitised, and by coordinating what we do more carefully we can ‘instrument’ our lives better. Familiar examples might be monitoring household electricity meters to improve energy consumption, analysing traffic patterns to control traffic flow, or even tracking the movement of people in stores to improve sales.
At the workplace it’s more about how many citations we have, how much grant funding we obtain, how many conferences we participate in, how much disk space we use… even how often we tweet. All these things usually have fairly ‘low friction’ instrumentation (with notable exceptions).