How do you get started with SPARQL, the language for querying linked data? An Open Data Camp 7 session, led by Jen, aimed to help newbies get going.
Liveblogging: prone to error, inaccuracy, and howling crimes against grammar and syntax. Post will be updated in the coming days.
More and more open data platforms are either becoming linked data at their core, or they have offshoots that add it. The data underneath linked data is RDF – and SPARQL is the query language for RDF. Most SPARQL endpoints look like a query box with gobbledegook in it – where you are expected to write your own gobbledegook. It’s somewhat intimidating.
In most cases, they also provide an API so you can programmatically query the information – but somebody needs to develop that. SPARQL endpoints give you direct access to all the data. The structure of RDF — the triples — creates a very standardised data format that you can query for whatever you like.
You can use the query interface to home in on the data you want, and then download it as a CSV, or reuse the same query programmatically. Query playgrounds help you figure out how to construct queries by showing you the results on a sample dataset.
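As a rough sketch of what "using the query programmatically" can look like, the Python below builds (but does not send) a request that asks an endpoint for CSV results. The Wikidata endpoint URL, the sample query and the helper name `build_csv_request` are illustrative choices rather than anything shown in the session, and other endpoints may expect different settings.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: Wikidata's public endpoint. Swap in whichever endpoint you are using.
ENDPOINT = "https://query.wikidata.org/sparql"

# An illustrative query: ten arbitrary triples from the dataset.
QUERY = """
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }
LIMIT 10
"""


def build_csv_request(endpoint: str, query: str) -> Request:
    """Build a GET request that passes the query in the `query` parameter
    and asks for the results as CSV. Nothing is sent over the network here."""
    url = endpoint + "?" + urlencode({"query": query})
    headers = {
        "Accept": "text/csv",  # ask for CSV rather than the default format
        "User-Agent": "sparql-beginner-example/0.1",  # hypothetical client name
    }
    return Request(url, headers=headers)


request = build_csv_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` should then return the CSV text, assuming the endpoint is reachable and supports that format.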
The ONS, Wikidata and the Scottish Government all offer SPARQL endpoints to their linked data. The structure of the triples allows very small, discrete ways of describing the characteristics of something, while still making large datasets feasible. You can merge datasets easily if they both respect the fundamental structure.
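To make the triple idea concrete, here is a minimal sketch that models triples as plain Python tuples rather than using a real RDF library. Every identifier and value in it (`ex:Exampleton` and friends) is invented for illustration.

```python
# Each triple is (subject, predicate, object). All names and values are made up.
council_data = {
    ("ex:Exampleton", "rdf:type", "ex:City"),
    ("ex:Exampleton", "ex:population", "12345"),
}

transport_data = {
    ("ex:Exampleton", "rdf:type", "ex:City"),
    ("ex:Exampleton", "ex:hasStation", "ex:ExampletonStation"),
}

# Because both datasets share the same three-part shape, merging them is
# just a set union. The triple that appears in both is kept only once.
merged = council_data | transport_data

for subject, predicate, obj in sorted(merged):
    print(subject, predicate, obj)

print(len(merged))  # 3
```

The merge only works this neatly because both sources use the same identifier for the same thing, which is the "fundamental structure" being respected.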
It is tricky to dive straight into SPARQL unless you understand some of the structure of RDF.
— 𝙰𝚍𝚊𝚖 𝚎𝚕 ⁉️ (@chairlord) November 2, 2019
SPARQL mainly acts as a pattern-matching language. You can call variables whatever you want. You use predicates in your pattern to return the identifiers that match the query you constructed. In and of themselves, those identifiers don’t give much information, so you can then ask for all the predicates and objects connected to them. That gives you (say) the people and the information the dataset holds about them.
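The pattern-matching idea can be sketched without a SPARQL engine at all. The toy matcher below is not how a real endpoint works internally, and the data, the `ex:` names and the `match` helper are all invented, but it follows the same two steps: find the identifiers first, then ask what is attached to one of them.

```python
# Toy data: (subject, predicate, object) triples with made-up identifiers.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:exampleton", "rdf:type", "ex:City"),
]


def match(pattern, triples):
    """Return one dict of variable bindings per triple that fits the pattern.

    Terms starting with '?' are variables and match anything; every other
    term must match the triple exactly.
    """
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings)
    return results


# Step 1, roughly: SELECT ?person WHERE { ?person rdf:type ex:Person }
people = [b["?person"] for b in match(("?person", "rdf:type", "ex:Person"), TRIPLES)]
print(people)  # ['ex:alice', 'ex:bob']

# Step 2, roughly: SELECT ?predicate ?object WHERE { ex:alice ?predicate ?object }
about_alice = match(("ex:alice", "?predicate", "?object"), TRIPLES)
print(about_alice)
```

The variable names (`?person`, `?predicate`, `?object`) are arbitrary, just as they are in a real query.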
You can even construct federated queries, which interrogate more than one endpoint across the web. For example, you could query both a Japanese government database of public art and Wikidata to return information about a particular artist’s works.
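A federated query of that kind might look something like the sketch below, which only assembles the query text in Python. The `SERVICE` keyword is SPARQL's standard way of handing part of a pattern to another endpoint. The art endpoint URL, the `ex:` vocabulary and the artist label are placeholders, not the real database mentioned in the session, and the join assumes both datasets use the same identifier for the artist.

```python
# Assumption: a placeholder URL standing in for the second (public art) endpoint.
ART_ENDPOINT = "https://example.org/public-art/sparql"

# The outer pattern runs on the endpoint you send the query to;
# the SERVICE block is forwarded to the second endpoint.
FEDERATED_QUERY = f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <https://example.org/vocab/>

SELECT ?artist ?work ?title
WHERE {{
  ?artist rdfs:label "Example Artist"@en .
  SERVICE <{ART_ENDPOINT}> {{
    ?work ex:creator ?artist ;
          ex:title ?title .
  }}
}}
"""

print(FEDERATED_QUERY)
```

In practice the two datasets often use different identifiers for the same artist, so a real query would usually need an extra pattern linking them.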
Terence Eden has written an absolute beginner’s guide to SPARQL.