Evaluating open data: how do you prove the value?

Open Data - proof of value

WARNING – liveblogging. Prone to error, inaccuracy and howling affronts to grammar and syntax. Posts will be improved over the next 48 hours

Google doc of this session

How do we evaluate the impact of open data – and prove its worth? A debate at Open Data Camp 3 dived deep into the issues – and came up with a few solutions.

Firstly, getting feedback on data sets seems to be a real problem. It’s really hard to get feedback on data other than “that address is wrong”.

Sian from the Food Standards Agency would love to know what people are doing – and building – with their data. And it’s not just about proving commercial value, it’s also about persuading other departments and building the case for open data. Can we build up an armoury of cases to persuade people?

Levels of data impact

Impact can be on many levels:

  • social
  • commercial
  • economic

Public health has measures of the impact of doing something – or not. Can we do the same for Open Data? We’re in danger of building cases based on anecdotal feedback, so can we make it more systematic? It’s easier to get the public sector to report back on their use of open data, because you can mandate it. How do you apply that to the wider world? The big consultants multiply up from a small, representative sample of users to get the headline figure for impact. Should we take that approach? It’s still a lot of effort – but it gets you some measure of ROI.
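
As a rough illustration of that "multiply up" approach, here is a minimal Python sketch. Nothing in it comes from the session: the function name, the survey figures and the size of the user base are all hypothetical, and it assumes the sample really is representative, which is the hard part in practice.

```python
from statistics import mean


def extrapolate_impact(sample_values, total_users):
    """Scale the average value reported by a sample of users up to all users.

    sample_values: yearly value (e.g. in pounds) each surveyed user attributes
                   to the data set. Hypothetical survey answers.
    total_users:   estimated size of the whole user base. Also an assumption.
    """
    if not sample_values:
        raise ValueError("sample_values must not be empty")
    if total_users < 0:
        raise ValueError("total_users must not be negative")
    # Headline figure = average value per sampled user x estimated number of users
    return mean(sample_values) * total_users


# Made-up numbers purely for illustration
survey_answers = [0, 500, 1200, 300]
headline = extrapolate_impact(survey_answers, total_users=1000)
print(f"Estimated yearly impact: {headline}")
```

The headline figure is only as good as the sample and the user count behind it, so it probably works best alongside case studies rather than instead of them.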

Should we challenge the idea that a business model is needed? OpenStreetMap has probably saved more lives than any other data set. That makes it inherently worthwhile. However – saving lives can be given a value, and built into an argument. What we need is a common set of metrics – if that’s possible. Is it sensible to do that across the public, private and commercial sectors?

One measure is asking people whether they would mind if you stopped publishing it. That’s a pretty good measure of how engaged people are with open data.

Start with problems, not data sets

There’s a vast difference between a massive spreadsheet you can download, and a RESTful API you can talk to – can you measure the impact of those different formats?

We need to be more glib and superficial than that, says one attendee. You want to start with a problem where one or two data sets contribute value, and then prove that they do. This is just marketing – the halo effect of one or two good stories. A statistical system of measuring impact won’t get you as much – and will tie you up in management. You’ve got to prove the value – and that can just be feel-good stories.

So few open data providers are even providing a way for people to tell them how they’re using the data. Give them a channel to give you feedback on their use – and a reason, like a showcase. That’ll give you good stories. You need multiple things:

  • Narrative case studies
  • Hard numbers
  • Images and graphics that look good on slides

These are all tools for engagement.

There is a paradoxical situation where sometimes the people who are getting the most value out of open data are the least likely to want to talk about it – because it’s a valuable competitive advantage.

Data quality

How would we value the index of deprivation? Well, by the quality of the dataset. We talk about formats all the time, but that’s nothing to do with quality. So part of valuing data is about its quality.

Experience design and open data

Could we use experience design techniques around open data? Macmillan Cancer have communities around certain cancers that assess the quality of the information provided. We should use that model. If the Food Standards Agency linked to all the sites that had mapped their data, they’d be providing better value – and making the information more robust in times of traffic floods. And then they’d free up resources to produce more open data, rather than mapping what they have.

The fear of obsolescence is an incredibly powerful force in keeping the status quo. So you need the innovators – the people who want progress. But you need to reframe the discussion in a way that allows more people to engage with it. If people think their jobs are at risk they won’t engage. So you need to persuade them it allows their jobs to get better, not disappear. But it’s also worth thinking about the other end – service users. They don’t have a voice much of the time. Sometimes the users of open data can give them a voice.

Commercial data providers know who their customers are – and they still work to provide case studies. Open data needs to learn from this.

A question to ponder

Is data infrastructure, product or service?