If you’re into data visualizations at all, then you’re going to be familiar with Simon Rogers of the Guardian Datablog. The Datablog team tells incredible stories using data, and it is a leader in the industry for data journalism. I was elated when Simon agreed to be interviewed for a Data Story to talk about his work in data journalism.
1. How does your department find its data and choose which sources to use?
It really varies. Sometimes it’s breaking news, such as Hurricane Sandy. That led us to do this map showing every verified event as we felt the raw information was too difficult for most people to find. Sometimes there’s a dataset that’s been released that we feel really needs questioning and investigation. Other times it’s down to a hunch that it might be interesting. After the Denver shootings we did this post looking at gun ownership and homicide rates around the world; so it can be something as serious as that – or as weird as a list of Doctor Who villains (personal obsessions can come into the process…).
2. What do you find first? The stories you want to tell or data that can tell stories?
It’s all about the stories, so I would say it goes that way round. It’s good to start examining a datasource with an idea in mind of what you’re looking for. Otherwise the whole thing just gets too unmanageable.
3. You majored in journalism. How did you end up pursuing data journalism, and what skills did you need to learn along the way?
After 9/11 (which was my second day on the newsdesk) I was told to work with the graphics team to help tell those stories. And I found myself coming back to that role in between editing the science section. During that process, I started getting better at working with spreadsheets and just collecting data, often just to make my job easier. You don’t want to keep having to search for carbon emissions data anew each time you’re doing a story on climate change. In around 2006 or 2007, Adrian Holovaty came and gave a talk at the Guardian to staff and I thought ‘that sounds like a job – and it’s not a long way away from what I’m doing already’. So, when we launched the Datablog in early 2009, it was just a matter of surfacing data we already had. In the meantime, I’ve learnt a load of tools, but mainly work with Excel, Google Refine, Fusion Tables and free viz tools like Tableau and Datawrapper.
4. The Guardian has incredible visualizations. As a data journalist, how do you work with your graphic artists to tell a story?
We can’t each do everything. I can make a map, but it will be miles better if a designer does it, and the Guardian has some great graphic designers and a brilliant graphics team. But what I can do is get the information they need, get it in the right format, and really help with telling the story.
5. The Guardian is trying to be a repository for all open government data. What data do you wish was more readily available?
Basic spending data. It should be easy to find but just getting a total amount that each UK government department spends is always a nightmare involving PDFs. It’s not good having ultra-granular spending figures if we can’t get the totals.
6. You recently told a story about how people are using homophobic language on Twitter (The No Homophobes guide to language on Twitter). What other stories would you like to tell around social data?
That was actually showcasing amazing work on the web – which we do a lot now. I’m fascinated by the way that people use social media and how they use it in conjunction with other media – people tweeting while they’re watching TV, for instance. The way that people use Twitter in a crisis is fascinating – and how we share images too.