An Open Data Camp 7 session on data visualisation, led by Ian Makgill. These are live-blogged notes.
There is a lot of temptation to use really exciting visualisations. But 90% of the time, you end up with bar or line charts – because they work. If you have more than 20 data points along the x axis, you probably want a line chart, not a bar chart.
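As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name `suggest_chart` and the `max_bars` parameter are hypothetical, and the threshold of 20 is only the guideline mentioned in the session, not a hard rule.

```python
def suggest_chart(x_values, max_bars=20):
    """Suggest 'bar' or 'line' from the number of distinct x-axis points.

    max_bars=20 encodes the session's rule of thumb; treat it as a
    guideline to adjust, not a hard limit.
    """
    n_points = len(set(x_values))
    return "line" if n_points > max_bars else "bar"


print(suggest_chart(["Q1", "Q2", "Q3", "Q4"]))  # 4 points  -> bar
print(suggest_chart(range(2000, 2030)))         # 30 points -> line
```

In practice the decision also depends on whether the x axis is continuous (time, say) or categorical, so a helper like this is only a starting point.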
We’re celebrating the tenth anniversary of @instituteforgov this week, which provides a great opportunity to look back on everything we’ve done with data down the years.
Yes, thread.
(A long one, forgive me this once…!)
(1/38)#IfG10 #dataviz pic.twitter.com/aCmz1RfbLS
— Gavin Freeguard (@GavinFreeguard) June 12, 2019
Most people don’t get on with violin plots, but because there are only a few variables here, it makes sense. If the data becomes too complex, it stops working. Sometimes you won’t know if it is going to work until you try it. The technical piece is less difficult than the feedback loop with the users: is it answering their questions? Do they understand it?
Data science is about storytelling. What is going to leave an impression in your viewer’s mind? The Institute for Government has built a set of templates – 99.5% of the charts they build are in Microsoft Excel.
Mosaic charts have been used by the Institute for Government to show the size of the parties in the House of Commons. This works well. There are 650 boxes, each representing a seat. You pull out those who don’t vote – the Speaker and Deputy Speakers, and Sinn Féin – and then you can see the composition of parliament really clearly.
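The layout behind a chart like this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `seat_grid`, the 26-column grid and the party labels are hypothetical, and the seat counts are placeholders that happen to sum to 650, not real election results.

```python
def seat_grid(seat_counts, columns=26):
    """Lay seats out as (row, col, party) cells, filling row by row.

    seat_counts: dict of party label -> number of seats, drawn in dict order.
    columns=26 is an arbitrary choice that gives 25 full rows for 650 seats.
    """
    cells = []
    for party, seats in seat_counts.items():
        for _ in range(seats):
            index = len(cells)
            cells.append((index // columns, index % columns, party))
    return cells


# Placeholder counts only: they sum to 650 but are NOT real results.
all_seats = {"Party A": 330, "Party B": 250, "Party C": 50, "Non-voting": 20}

# Pull out the non-voting seats to show the voting composition on its own.
voting = {party: n for party, n in all_seats.items() if party != "Non-voting"}

cells = seat_grid(voting)
print(len(cells))  # number of voting seats to draw
print(cells[-1])   # last cell as (row, col, party)
```

Each cell can then be drawn as one small square, coloured by party, in whatever tool you prefer.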
That said, sometimes line charts are more powerful, like the prison violence chart they produced.
A few people have started sonifying data. Here’s an example:
Is it possible to listen to a chart? Tune into Inside Briefing to hear @GavinFreeguard on data sonification.
Also this week, our experts discuss Parliament’s wake-up call for govt, the future of taxation and we speak to former Cabinet minister @DLidington https://t.co/6vO9HTzGdo pic.twitter.com/SPb3aaBGFX
— Institute for Gov (@instituteforgov) November 3, 2019
They get feedback via user testing and sometimes vigorous feedback from colleagues. Even things like numbers of retweets can be an indicator of success.
They’ve been surprised how much they can do with Excel. Between their own experiments, and suggestions they found via Google and YouTube, it’s been a surprisingly useful tool, especially for encouraging less-specialist colleagues to use visualisation. It’s not perfect, but it can stop you from getting over-complicated. You are trying to convey a message. But please, move away from the standard styles – people will treat your data as less authoritative if you use templates they recognise.
Star Wars Episode III: Revenge of the Sith is an example of bad data visualisation: all that CGI obscures the story and the characters.
Star Wars: Revenge of the Sith is a compelling example of bad data visualisation… #ODCamp pic.twitter.com/dAoUpCio01
— Adam Tinworth (@adders) November 3, 2019
Tools
Start with pen and paper. Really. Think on paper, because it’s quick and cheap and a great way of testing ideas. Raw Data: Infographic Designers’ Sketchbooks is worth a read.
Sometimes, though, throwing data into a visualisation tool is a necessary approach – scatter plots can allow you to spot things you wouldn’t have otherwise, for example.
Flourish is a tool worth looking at.
Power BI is in use by many – but if you’re going to use it, play and play and play as soon as you get the tool, explore all the options, get it out of your system, because you will end up using bar charts most of the time. There’s a Venn diagram of statisticians and coders, and the people in the overlap love very complicated charts. Nobody else does.
Dashboards – think carefully about how you design them. They MUST work on phones. They often work well if you give every visualisation its own, identical, discrete space.
Word clouds can be interesting – but you have to choose the right texts. Beware of pie charts, though: they’re great for a limited number of data points and a clear story, but that combination is rare.
Another thing to be treated with care: excluding data to make the story stand out more clearly. You’re making choices for people. They might not appreciate that, or might see it as manipulation.
Visualising uncertainty isn’t straightforward – but it can be done. The ONS used “fuzzy” lines on migration figures to show the degree of uncertainty in the figures.
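One possible way to get a "fuzzy" effect, sketched here in Python, is to draw several faint copies of a line spread across the uncertainty interval. This is an assumption for illustration only and is not a description of how the ONS built its charts; the function name `fuzzy_layers`, the linear fade and the example numbers are all hypothetical.

```python
def fuzzy_layers(estimate, margin, layers=5):
    """Return (y, opacity) pairs spread evenly across estimate +/- margin.

    Opacity is 1.0 at the central estimate and falls linearly to 0.2 at
    the edges of the interval, so overlaid lines fade out and look fuzzy.
    """
    if layers < 2 or margin <= 0:
        return [(estimate, 1.0)]
    result = []
    for i in range(layers):
        offset = -margin + 2 * margin * i / (layers - 1)
        opacity = 1.0 - 0.8 * abs(offset) / margin
        result.append((estimate + offset, round(opacity, 2)))
    return result


# Placeholder numbers only: an estimate of 100 with a margin of +/- 10.
for y, opacity in fuzzy_layers(100, 10):
    print(y, opacity)
```

Running this for every point on the x axis gives one faint line per layer, with the strongest line on the central estimate.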
- The Urban Institute Data Visualisation Style Guide is incredibly useful.
- There’s a Python library called Altair – it integrates well with Jupyter Notebooks. It’s great for testing visualisations before they go on the website.
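For anyone who has not tried Altair, a minimal sketch of a test chart might look like the following. The helper names `to_rows` and `line_chart`, the field names `period` and `value`, and the numbers are hypothetical placeholders; the chart itself needs the third-party `altair` package to be installed.

```python
def to_rows(labels, values):
    """Turn parallel lists into the list-of-dicts records Altair accepts."""
    return [{"period": p, "value": v} for p, v in zip(labels, values)]


def line_chart(rows):
    """Build a simple Altair line chart from row dicts.

    Requires the third-party `altair` package (pip install altair).
    """
    import altair as alt  # imported lazily so the data prep works without it

    return (
        alt.Chart(alt.Data(values=rows))
        .mark_line()
        .encode(x="period:O", y="value:Q")
    )


# Placeholder values for illustration only, not real data.
rows = to_rows(["2016", "2017", "2018"], [3, 5, 4])

# In a Jupyter notebook, leave the chart as the last expression in a cell
# to render it inline:
# line_chart(rows)
```

Swapping `mark_line()` for `mark_bar()` is enough to compare the two chart types on the same data before anything goes on the website.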