Data Visualisation: making it work

An Open Data Camp 7 session on data visualisation, led by Ian Makgill. These are live-blogged notes.

Drawnalism: data visualisation

There is a lot of temptation to use really exciting visualisations. But 90% of the time, you end up with bar or line charts – because they work. If you have more than 20 data points along the x axis, you probably want a line chart, not a bar chart.
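That rule of thumb is simple enough to write down. A minimal sketch in Python, assuming the cut-off of 20 points from the session; the function name `suggest_chart` and the `max_bars` parameter are made up for illustration, and the threshold is a guideline rather than a hard limit.

```python
def suggest_chart(x_values, max_bars=20):
    """Suggest 'bar' or 'line' from the number of distinct x-axis points.

    max_bars is the rule-of-thumb cut-off mentioned in the session;
    treat it as an adjustable assumption, not a fixed rule.
    """
    n_points = len(set(x_values))
    return "line" if n_points > max_bars else "bar"


# Twelve monthly figures: few enough points for bars
print(suggest_chart(range(12)))

# Sixty monthly figures: more than the cut-off, so a line
print(suggest_chart(range(60)))
```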

Most people don’t get on with violin plots, but when there are only a few variables they can make sense. Once the data becomes too complex, they stop working. Sometimes you won’t know if a chart is going to work until you try it. The technical piece is less difficult than the feedback loop with the users: is it answering their questions? Do they understand it?

Data science is about storytelling. What is going to leave an impression in your viewer’s mind? The Institute for Government has built a set of templates – 99.5% of the charts they build are in Microsoft Excel.

Ian leads the data visualisation session

Mosaic charts have been used by the Institute for Government to show the size of the parties in the House of Commons. They work well. There are 650 boxes, each representing a seat. You pull out those who don’t vote – the Speaker and deputy speakers, and Sinn Féin – and then you can see the composition of parliament really clearly.
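The one-box-per-seat idea can be sketched in a few lines of Python. This is not the Institute for Government's method (they work in Excel), just an illustration of the layout logic. The function `seat_grid`, the party labels and the seat counts are all made up; the counts are chosen only so that they sum to 650.

```python
def seat_grid(seats, exclude=(), columns=25):
    """Lay out one cell per seat, largest party first, in rows of `columns`.

    seats: dict of label -> number of seats.
    exclude: labels to pull out of the chart (e.g. members who don't vote).
    """
    cells = []
    for party, count in sorted(seats.items(), key=lambda kv: -kv[1]):
        if party not in exclude:
            cells.extend([party] * count)
    # Slice the flat list of cells into rows of equal width
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


# Hypothetical figures for illustration only; they sum to 650
example = {"A": 340, "B": 220, "C": 50, "D": 29, "Abstaining": 7, "Speakers": 4}
grid = seat_grid(example, exclude={"Abstaining", "Speakers"})

# Print a rough text version: one letter per voting seat
for row in grid:
    print("".join(label[0] for label in row))
```

Each inner list is one row of boxes, so the same structure could be handed to a spreadsheet or plotting tool and coloured by party.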

That said, sometimes line charts are more powerful, like the prison violence chart they produced:

[Chart: prison assaults]

A few people have started sonifying data: turning it into sound rather than pictures.

They get feedback via user testing and sometimes vigorous feedback from colleagues. Even things like numbers of retweets can be an indicator of success.

They’ve been surprised how much they can do with Excel. Between their own experiments, and suggestions they found via Google and YouTube, it’s been a surprisingly useful tool, especially for encouraging less specialist colleagues to use visualisation. It’s not perfect, but it can stop you from getting over-complicated. You are trying to get a message across. But please, move away from the standard styles – people will treat your data as less authoritative if you use templates they recognise.

Star Wars Episode III: Revenge of the Sith is an example of bad data visualisation: all that CGI obscures story and character.

Tools

Leela holds forth at Open Data Camp

Start with pen and paper. Really. Think on paper, because it’s quick and cheap and a great way of testing ideas. Raw Data: Infographic Designers’ Sketchbooks is worth a read.

Sometimes, though, throwing data into a visualisation tool is a necessary approach – scatter plots can allow you to spot things you wouldn’t have otherwise, for example.

Flourish is a tool worth looking at.

Power BI is in use by many – but if you’re going to use it, play and play and play as soon as you get the tool, explore all the options, get it out of your system, because you will end up using bar charts most of the time. There’s a Venn diagram of statisticians and coders who love very complicated charts. Nobody else does.

Dashboards – think carefully about how you design them. They MUST work on phones. They often work well if you give every visualisation its own discrete space, identical in size to the others.

Word clouds can be interesting – but you have to choose the right texts. Beware of pie charts, though: they’re great for a limited number of data points and a clear story, but that’s rare.

Another thing to be treated with care: excluding data to make the story stand out more clearly. You’re making choices for people. They might not appreciate that, or might see it as manipulation.

Visualising uncertainty isn’t straightforward – but it can be done. The ONS used “fuzzy” lines on migration figures to show the degree of uncertainty in the figures.
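One way to get a similar fuzzy effect, sketched in Python. This is an assumption about how such a line could be built, not a description of how the ONS produced theirs. The idea is to stack several translucent bands of increasing width around the estimate, so the colour is densest near the central figure and fades towards the edge of the margin of error. The function `fuzzy_bands` and the numbers are invented for illustration.

```python
def fuzzy_bands(estimates, margins, layers=4):
    """Return nested (lower, upper) bands around a series of estimates.

    estimates: central figures, one per time point.
    margins: margin of error for each estimate (same length).
    layers: how many nested bands to produce; the widest spans the full margin.
    """
    bands = []
    for layer in range(1, layers + 1):
        fraction = layer / layers  # share of the full margin this band covers
        lower = [e - m * fraction for e, m in zip(estimates, margins)]
        upper = [e + m * fraction for e, m in zip(estimates, margins)]
        bands.append((lower, upper))
    return bands


# Made-up figures, purely to show the shape of the output
estimates = [200, 220, 250]
margins = [40, 40, 60]
for lower, upper in fuzzy_bands(estimates, margins, layers=2):
    print(lower, upper)
```

Drawn with a plotting library, each band would be filled in the same colour at low opacity (in matplotlib, for instance, with `fill_between(x, lower, upper, alpha=0.15)`), so the overlaps build up the fuzziness.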
