An Open Data Camp 7 session on data visualisation, led by Ian Makgill. These are live-blogged notes.
There is a lot of temptation to use really exciting visualisations. But 90% of the time, you end up with bar or line charts – because they work. If you have more than 20 data points along the x axis, you probably want a line chart, not a bar chart.
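The rule of thumb above can be sketched as a tiny helper (the function name and 20-point threshold are illustrative, taken straight from the session's heuristic):

```python
def suggest_chart(num_points: int, threshold: int = 20) -> str:
    """Suggest a chart type using the session's rule of thumb:
    beyond ~20 points along the x-axis, a line chart reads better
    than a bar chart."""
    return "line" if num_points > threshold else "bar"

# A 12-category comparison is fine as bars; a 52-week series wants a line.
print(suggest_chart(12))  # bar
print(suggest_chart(52))  # line
```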
Continue reading Data Visualisation: making it work →
Just a few days after Halloween, and with pumpkins adorning the refreshment tables at Open Data Camp 7, campers gathered at the end of day two to swap open data horror stories. Or, as leader Dan Barrett put it, to learn from their experiences and mistakes. Because that can be cathartic — and helpful for others.
A reflection on working at [a large public institution] and spending six years trying to improve its open data division. “I recognised that there was a division between its work and public understanding of what it did. And I thought open data could help to bridge that.” Things were going fairly well. “And then they went spectacularly badly, and the work stopped.”
What did the teller learn? “That it is important to own the story of your own work, and to think about how you tell it to other people,” particularly in an environment in which others are seeking to benefit from telling a counter-narrative, “discounting the work you do, playing down the benefits of what you do”, and diverting resources to other priorities. “So that is the lesson I am taking into a new role: Tell stories that resonate with everybody about data.”
Continue reading ODCamp 7: Horror stories… →
A session on rescuing usable data supplied in PDFs, led by Martin.
A client of one of the session participants needed an automated process to check which PDFs had changed data in them – and which didn’t. They had been doing it manually. However, a computational solution isn’t as easy as it looks. For example, software often finds it hard to spot a table. It’s relatively easy to extract data from a table in a PDF if it clearly looks like a table – borders around “cells”. However, many tables in PDFs are clear to humans – but not to computers. Extracting those sorts of tables is much trickier.
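A minimal sketch of the change-detection half of that problem, using only the standard library – the file names and contents here are made up. Hashing each PDF's bytes lets you flag which files changed since the last run without opening them by hand; actual table extraction would need a library such as pdfplumber or Camelot on top of this:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Fingerprint a PDF's raw bytes with SHA-256."""
    return hashlib.sha256(data).hexdigest()

def changed_files(previous: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return names of PDFs whose content differs from the stored hashes.
    New files count as changed, since there is no previous hash for them."""
    return [name for name, data in current.items()
            if previous.get(name) != content_hash(data)]

# Example: one file unchanged, one edited, one newly added.
old = {"report-q1.pdf": content_hash(b"q1 figures"),
       "report-q2.pdf": content_hash(b"q2 figures")}
new = {"report-q1.pdf": b"q1 figures",
       "report-q2.pdf": b"q2 figures, revised",
       "report-q3.pdf": b"q3 figures"}
print(sorted(changed_files(old, new)))  # ['report-q2.pdf', 'report-q3.pdf']
```

One caveat: hashing raw bytes flags any regeneration of a PDF as a change, even if only metadata differs. Hashing the extracted text instead would be closer to "changed data", at the cost of needing a text extractor.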
Continue reading Extracting Open Data from PDFs in usable formats →
An Open Data Camp 7 session on registers, led by Andy Bennet of registers.app.
At the end of 2015, there was a project in the Government Digital Service about the structure of data. There was open.gov.uk, where the data was quite unstructured. The consumer had to wrangle it into the form they needed. In the legislation, there were hundreds of thousands of mentions of registers – datasets that different departments and ministers needed to keep. The idea was to publish these registers of things government knows.
One core principle: these are owned and maintained registers. This makes them about governance – about making sure that there are people in positions of power with responsibility for them. You can’t spread the decision-making around – it has to be a named individual. There’s been some work done by the Open Data Institute in the last year about collaborative ownership models.
Continue reading Registers: why they matter and how to save them →
A session on using open data in artistic works of various sorts, led by Leela Collins.
Traditionally, we have infographics, where we take data and visualise it so people can understand it. And then there’s conceptual art, which gains some of its meaning from the original data source. Does that create a new work, or does it owe something to the data producer?
Data is becoming a tool, in the same way that brushes are.
And then there’s protest art, where the whole of the data is used to create the art. But if the data is licensed non-commercially, can the artist make money from the work? A full open data licence is free for reuse. However, a non-commercial licence on some data is somewhat ambiguous – is it just restricting resale of the data itself, or does it prevent it being used for anything commercial?
Continue reading Data Art: what are the limits and opportunities in data licensing for artists? →
Day two of Open Data Camp 7 at Geovation in London started with a session on public sector procurement data, and how it could be used to encourage green initiatives. Ian Makgill introduced the session. His company has a site that captures public tender information and makes it “freely available to everyone” and then analyses the data to say “oh look, this is how much work this company has got” or “here’s a trend in a particular kind of spending.”
However, he said, while this was interesting, it wasn’t having a big impact on organisational behaviour. But: “What we realised is that suppliers are very interested in when contracts are coming to an end. That’s understandable, but it’s also a massive leverage point at which the public could encourage procurement that reduces carbon.”
After all, government spends around £12.9 billion a year on things, and those things are responsible for about 17% of carbon output, because they are things like roads, and airports. So there should be an opportunity for experts and the public to get in and argue that setting a contract in a different way will induce change.
Continue reading ODCamp 7: Going green(er) with open procurement data →
The day has dawned bright and sunny on Open Data Camp 7’s final day. There’s a great bunch of people present, the coffee is flowing, and it’s time to pitch. Here’s what’s on today’s menu:
Continue reading Open Data Camp 7: Day Two pitches →
An Open Data Camp session on helping charities and other low-tech bodies build a data ecosystem and improve their impact, led by Pauline Roche.
Liveblogging: prone to error, inaccuracy and howling crimes against grammar and syntax. Post will be updated in the coming days.
Over 80% of charities in this country operate on tiny budgets – often under £10,000 per annum. There are some similarities with, say, libraries, or arts bodies. There are resources out there for them – like 360giving – but they may not know about them, or have the confidence to use them.
Datakind offers a number of resources. They recently worked with the GLA to help understand the number of refugees and migrants in London. There isn’t good data out there on that. But charities tend to know where they are – so could they provide that information? So they asked – and it would be fair to say that the charities weren’t keen on the idea. They said that, if they were going to do this, they needed support in working out what to collect, and how. And the GLA was willing to help take that on.
Many of the charities had no idea of the data already available that they could use, nor how data could help their own work. They paired up data experts with subject experts to figure out what was needed, and how to deliver that data.
Continue reading Building a data ecosystem in a low tech environment →
One of the final sessions of the first day of Open Data Camp 7 in London was led by Anneka France from The Rivers Trust. She had wanted to run the session because she needed the National Soil Map for an EU-funded project to restore peatland for climate mitigation and flood prevention.
The National Soil Map is covered by a commercial licence, and the charity was quoted £25,000 to get the data it needed – which it couldn’t afford. But then Anneka heard about the ‘pillars of power approach’ “which has been used to overthrow governments” and wondered if it could help. Continue reading ODCamp 7: Using a pillars of power approach to opening data →
How do you get started with SPARQL, the language for querying linked data? An Open Data Camp 7 session, led by Jen, aimed to help newbies get going.
More and more open data platforms are either becoming linked data at their core, or they have offshoots that add it. The data underneath linked data is RDF – and SPARQL is the query language for RDF. Most SPARQL endpoints look like a query box filled with gobbledegook – where you are expected to write your own gobbledegook. It’s somewhat intimidating.
In most cases, they also provide an API so you can programmatically query the information – but somebody needs to develop that. SPARQL endpoints give you direct access to all the data. The structure of RDF — the triples — creates a very standardised data format that you can query for whatever you like.
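To illustrate why that triple structure is so queryable, here is a toy pattern matcher in plain Python – not real SPARQL, and the data is made up – that mimics how a SPARQL basic graph pattern binds variables against (subject, predicate, object) triples:

```python
# Toy RDF-style triples: (subject, predicate, object).
TRIPLES = [
    ("london", "type", "city"),
    ("london", "population", "8900000"),
    ("leeds", "type", "city"),
    ("leeds", "population", "790000"),
]

def match(pattern, triples=TRIPLES):
    """Match a triple pattern against the data; None plays the role
    of a SPARQL variable and matches anything in that position."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Roughly "SELECT ?s WHERE { ?s type city }":
cities = [s for (s, _, _) in match((None, "type", "city"))]
print(cities)  # ['london', 'leeds']
```

Because every fact is the same three-part shape, one generic matching operation covers any question you can phrase as a pattern – which is what makes a SPARQL endpoint "direct access to all the data" rather than a fixed API.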
There’s a SPARQL playground where you can experiment with queries. There’s more than one of them, in fact.
You can use the query interface to home in on the data you want, and then download it as a CSV, or reuse that query programmatically. The playgrounds help you figure out how to construct queries by showing you the results on a sample dataset. Continue reading SPARQL 101: how to get started with the linked data search query language →
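Taking a query out of the playground and into code can be as simple as an HTTP GET. A standard-library sketch – the endpoint URL below is hypothetical, and real endpoints vary in whether they want an `Accept: text/csv` header or a `format` parameter for CSV results:

```python
from urllib.parse import urlencode
from urllib.request import Request

def sparql_csv_request(endpoint: str, query: str) -> Request:
    """Build a GET request asking a SPARQL endpoint for CSV results."""
    url = endpoint + "?" + urlencode({"query": query})
    # Many endpoints honour content negotiation for the result format.
    return Request(url, headers={"Accept": "text/csv"})

query = "SELECT ?s WHERE { ?s a ?type } LIMIT 10"
req = sparql_csv_request("https://example.org/sparql", query)
print(req.full_url.startswith("https://example.org/sparql?query="))  # True
# urllib.request.urlopen(req) would then fetch the CSV.
```

The same request, pointed at a real endpoint with a query honed in a playground, is all a small script needs to pull down fresh data on a schedule.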