Responsible data visualisation in our storytelling

Visualisation is a huge part of making data accessible – but do we run the risk of distorting the data by imposing a visual narrative on it? How do we visualise responsibly in our storytelling?

 

How do we communicate ambiguities in data visualisation?

By necessity, a story is a narrative you have chosen. Your data can support any number of stories, so you select the data that supports yours. Yes, you watch out for bias, but the bias may be as simple as that selection. How do you distinguish between your prerogative as a storyteller and distorting the story?

Here’s an idea: the Twitter test. If someone retweeted the image without context, would you be comfortable with that? If you have taken steps to label the ambiguities and sources, that’s probably all you can do.

Also, you can give people the original data – that’s the joy of open data! It allows them to interrogate and check your story.

One attendee who had worked on visualisation for a major newspaper suggested that tunnel vision can become a problem, and that doing user testing on the visualisation early helps deal with that problem.

Where data is missing, you can make the story about what isn’t being disclosed or shared.

How do we keep our own bias out?

There are two issues: mistakes and bias. And that’s true through the whole lifecycle of the data. Data itself isn’t “biased”, but its collectors could have been. The purpose the original data set was collected for may be very different from what it’s eventually used for. So, while the data might be neutral, an algorithm looking for specific outcomes can find them simply because of the collection bias.

Weapons of Math Destruction is a great book on subjects like this.

There are two worlds of storytelling here:

  1. I have a story – let’s find some data to back it up
  2. I have some data – let’s find some stories in it

It’s inevitable that we start forming hypotheses about the data we look at. And then we look for things that support them – it’s human nature. You can counter that to some degree by working with others. But it is difficult – exhausting, even. We’re fighting our own biases.

We find it difficult to relate to people with very different narratives about the same basic information. We don’t get enough training in counterfactual thinking. Can you put yourself in the mind of somebody with a completely different view of the data? That training needs to start at school.

Debating the data

“Bring the art of debate to a data set.”

Part of the beauty of open data is that two groups of people with very different views can work on the same data.

Visualisations can be an analytic tool as well as a storytelling one – sometimes, just by visualising the data, the story leaps out at you.

The science community is good at this: you present your research, as well as your data and methodology. And then your work can be replicated, or disproved.

True. From a certain point of view.

But are we getting hung up on truth? Because there are many different versions of truth, and open data often supports that. There is a line to be walked between telling a story and entertainment.

It’s easier to reach truth with small datasets – a fact. That’s not necessarily good or useful, though. You can make choices from complicated datasets that facilitate a balance between truth and storytelling. And, philosophically, there are multiple truths, and we need to respect that in our storytelling – don’t make more claims than the data truly supports, and note complexity. Visualisations are just a perspective on the data.

An obvious example of distorted visualisations: make a 3D pie chart and tilt it away from you, and the bit nearest you looks disproportionally big. Pie charts are fundamentally wrong – the geometry of the circle creates problems. People are really bad at judging area.
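To put a rough number on the tilted 3D pie effect, here is a minimal sketch. It is not something shown in the session, and every parameter is an assumption made for illustration: the pie is drawn without perspective (orthographic view), `squash` is how much the tilt flattens the top face, `depth` is the on-screen height of the side wall, and 270 degrees points towards the viewer. Under those assumptions the top faces keep their true proportions, because flattening shrinks every slice equally; the distortion comes from the side wall, which is only visible along the front edge and so adds "ink" to the nearest slices only.

```python
import math


def front_width(start_deg, span_deg, radius):
    """Horizontal extent of the part of an arc that lies on the front
    half of the pie (180 to 360 degrees), where the side wall is visible."""
    start = start_deg % 360
    end = start + span_deg
    width = 0.0
    # The arc may wrap past 360, so check the front half twice.
    for lo_f, hi_f in ((180, 360), (540, 720)):
        lo, hi = max(start, lo_f), min(end, hi_f)
        if hi > lo:
            # cos is increasing on the front half, so this is positive.
            width += radius * (math.cos(math.radians(hi))
                               - math.cos(math.radians(lo)))
    return width


def visible_shares(boundaries_deg, squash=0.5, depth=0.3, radius=1.0):
    """Share of visible area per slice in a tilted, extruded pie.

    Slice i runs counter-clockwise from boundaries_deg[i] to the next
    boundary (wrapping around). All defaults are illustrative assumptions.
    """
    n = len(boundaries_deg)
    areas = []
    for i in range(n):
        start = boundaries_deg[i]
        end = boundaries_deg[(i + 1) % n]
        span = (end - start) % 360 or 360
        # Top face: an ellipse sector, scaled equally for every slice.
        top = math.pi * radius ** 2 * squash * span / 360
        # Side wall: a strip of height `depth` under the front part of the arc.
        wall = depth * front_width(start, span, radius)
        areas.append(top + wall)
    total = sum(areas)
    return [a / total for a in areas]


# Four equal slices (25% each): back, left, front, right.
shares = visible_shares([45, 135, 225, 315])
for name, share in zip(["back", "left", "front", "right"], shares):
    print(f"{name:>5}: {share:.1%} of the visible area (true value 25.0%)")
```

With these made-up settings, the slice nearest the viewer takes up roughly 38% of the visible area and the one at the back roughly 18%, even though all four represent the same 25%. Adding perspective would exaggerate the gap further.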

Session notes