
Building an Open Addresses database – and opening its APIs

Warning: Liveblogging – prone to error, inaccuracy, and howling affronts to grammar. This post will be improved over the course of a few days.

Gianfranco Cecconi & James Smith

Open Addresses are trying to build a huge addressing dataset from scratch, fighting the monsters and competitors that involves. They believe that addresses are a key asset of the national information infrastructure, and that we need to liberate them – or that was the pitch to the Cabinet Office.

The problem is huge.

They started with the assumption that they could build their dataset from existing open data sets that (by chance) have associated address information and no intellectual property issues – and that a volunteer workforce would then develop it from there. The Royal Mail suggests that there are 60m addresses in the UK – but that figure counts only delivery places. This project takes a wider view of what an address is. Your electricity meter or your drone delivery spot might be an address.

Surviving as a non-profit

They also need to survive financially. They try to be frugal – so they try not to get sued – but they also try to build services that can fund what they do. The early money from the Cabinet Office will not last for ever. They have APIs that you can use in your products and services – for free. But there will be value-added services on top of that. For example, “give me a likelihood of how real an address is”. It’s not a trivial problem – but it could be very useful for delivery services.

There is no UK master list of addresses – no gold standard. Everyone is working to build their own database, and all of them have errors, though some are further ahead than others. Addresses need to be confirmed, and Open Addresses is built to deal with this doubt and uncertainty as they go.

While they do need money to survive, many of their basic services are free, because they need to be there.

Working with the Open Addresses API

The obvious thing: search the data. And that you can do via the API. Just three lines! But the completeness is limited right now – they only have 1.2m of those 60m addresses. You can submit addresses through an API called Sorting Office. Again, free for now. They’ll normalise the address for you – and you can donate it to them, but you don’t have to.
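To give a feel for what "normalising" an address might involve, here is a minimal Python sketch. It is not the Sorting Office algorithm, which the talk did not describe: the function name `normalise_address`, the loose postcode pattern, and the output shape are all assumptions made for illustration.

```python
import re

# Loose UK postcode pattern (an assumption, not an official validator):
# outward code such as "SW1A", optional space, inward code such as "2AA".
POSTCODE_RE = re.compile(
    r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE
)


def normalise_address(raw: str) -> dict:
    """Toy normaliser (hypothetical): split out the postcode and address lines."""
    # Collapse runs of whitespace and trim stray commas and spaces.
    text = re.sub(r"\s+", " ", raw).strip(" ,")

    postcode = None
    match = POSTCODE_RE.search(text)
    if match:
        # Upper-case the postcode and put one space between its two halves.
        postcode = f"{match.group(1).upper()} {match.group(2).upper()}"
        # Remove the postcode from the remaining address text.
        text = (text[:match.start()] + text[match.end():]).strip(" ,")

    # Treat each comma-separated part as one address line.
    lines = [part.strip() for part in text.split(",") if part.strip()]
    return {"lines": lines, "postcode": postcode}
```

Calling `normalise_address("10  downing street,  london sw1a2aa")` separates the street and town lines from a tidied postcode. A real service would do much more, such as matching against known streets and towns.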

With informed consent from your clients, you can hand over addresses to them on a day-to-day basis – through Turbot. It’s a platform for managing scrapers, and is descended from ScraperWiki. (It went live last night – 20th February 2015.)

Want to do more sophisticated analysis on a block of text with addresses in it? The address building blocks API allows you to perform detailed analysis and processing on that sort of data. That is likely to be the main source of revenue in the battle to survive. The confidence API will also be made available, giving a confidence score on any address.
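The talk did not say how a confidence score would be calculated. Purely as an illustration of the idea, the sketch below assumes each independent source that reports an address has some probability of being right, and combines them as "at least one source is correct". The function name `combined_confidence` and the independence assumption are hypothetical, not Open Addresses' method.

```python
from math import prod


def combined_confidence(source_reliabilities):
    """Toy score (hypothetical): chance that at least one source is right.

    Each value is the assumed probability, between 0 and 1, that a single
    independent source reporting the address is correct.
    """
    for p in source_reliabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each reliability must be between 0 and 1")
    # The address is wrong only if every source is wrong, so multiply the
    # per-source failure probabilities and subtract the result from 1.
    return 1.0 - prod(1.0 - p for p in source_reliabilities)
```

Under these assumptions, an address seen in no sources scores 0, and every extra confirming source pushes the score closer to 1 without ever making it certain.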

Building the database

The biggest challenge ahead of them is building up the addresses. There’s a privacy issue: they have to persuade people that sharing an address is not the same as sharing personal information about yourself. The existence of an address is not personal information, it’s just a fact. You can walk down streets and write addresses down. But it feels private.

There’s also a corporate approach: working with companies that use addresses. But those companies need explicit permission from their clients to share their addresses.

Further notes and links.