Using open data to support Welsh speakers

The Welsh government has set ambitious targets to increase the number of Welsh speakers. At the moment, there are perhaps half a million, but by 2050 the government wants to see 1 million.


Session leader Ben Proctor said this presented an interesting open data challenge. “One of the things we have been kicking around in ODI Cardiff is that there might be some useful things to do from a data point of view to inform this [target],” he said.

“We have been looking at whether there are existing models for language growth – there probably are, but we can’t find them – and, if not, whether we can take some standard growth models and use them.

“We tend to assume that language growth is a straight line, so if we could identify where there are some places where growth is very low, we could say that might be a better place to direct effort than an area where it is very high [and already at the top of the growth chart line].”
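As a rough illustration of what "taking a standard growth model" could look like, the sketch below uses a logistic (S-shaped) curve, one common textbook growth model. It is not a model proposed in the session, and every number in it (the ceiling, the rate, the midpoint year) is a hypothetical placeholder rather than real data about Welsh speakers. The point is only that, on such a curve, yearly growth differs depending on where a community sits on it, which is the kind of difference that could inform where to direct effort.

```python
import math


def logistic(year, ceiling, rate, midpoint):
    """Number of speakers in a given year under a logistic growth curve.

    ceiling  -- hypothetical maximum number of speakers
    rate     -- hypothetical growth-rate parameter (per year)
    midpoint -- hypothetical year in which half the ceiling is reached
    """
    return ceiling / (1.0 + math.exp(-rate * (year - midpoint)))


def yearly_growth(year, ceiling, rate, midpoint):
    """Slope of the logistic curve: speakers gained per year at that point."""
    speakers = logistic(year, ceiling, rate, midpoint)
    return rate * speakers * (1.0 - speakers / ceiling)


if __name__ == "__main__":
    # Placeholder parameters, chosen only to make the shape visible.
    ceiling, rate, midpoint = 1_000_000, 0.1, 2030
    for year in (2000, 2030, 2060):
        print(year,
              round(logistic(year, ceiling, rate, midpoint)),
              round(yearly_growth(year, ceiling, rate, midpoint)))
```

Under these assumptions, growth is slow at both ends of the curve and fastest at the midpoint, so a low growth figure can mean either "barely started" or "close to the ceiling". Fitting such a curve to real census figures would be a separate exercise.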

The idea immediately prompted a number of questions. Do these Welsh speakers have to be in Wales? What does Welsh speaking mean – being able to work as a translator, or being able to use it day to day? Is Welsh speaking assessed or self-reported?

Learning from the Basque experience

Once the debate turned to data, participants had some interesting ideas for models. One suggested looking at the experience of the Basque country, and several suggested looking at the experience of ‘expat’ communities, such as the Welsh-speaking community in London.

An expert from the Welsh Assembly pointed out that while the Welsh census asked about Welsh speaking, the English census didn’t – removing a potentially valuable source of data about its prevalence, decline or growth. Participants suggested reintroducing this or running a survey – perhaps “the rugby” would be a good place to start?!

There were other ideas for new data points. For example, one suggested that information might be gleaned from parish and burial records, to take the history of Welsh speaking back beyond the data provided by the more recent Census.

Analysing Welsh Twitter

Another suggested Twitter analysis of Welsh use, or asking the language teaching app Duolingo for information about its Welsh learners. In a similar vein, further participants suggested asking other large businesses, such as Microsoft or British Gas, for information about when their applications or service centres were accessed in Welsh.

It was also noted that there is a push to get Welsh into translation apps and to make it an option when using modern, digital devices – for example via Welsh-supporting keyboards. A further suggestion to support the spread and use of Welsh was to map existing data about Welsh speaking, to identify communities in which it is common, and then to target efforts on neighbouring or “edge” communities.
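The "edge" community idea can be sketched in a few lines, assuming two inputs that the session did not specify: a share of Welsh speakers per area and a list of which areas border each other. The area names, shares and the 50% threshold below are all hypothetical placeholders, not real figures.

```python
def edge_communities(share, neighbours, threshold=0.5):
    """Return areas below the threshold that border an area at or above it.

    share      -- dict mapping area name to share of Welsh speakers (0 to 1)
    neighbours -- dict mapping area name to a list of bordering area names
    threshold  -- hypothetical cut-off for "Welsh is common here"
    """
    strong = {area for area, value in share.items() if value >= threshold}
    edges = set()
    for area in strong:
        for other in neighbours.get(area, ()):
            # Only count neighbours we have data for and that are not
            # already strong Welsh-speaking areas themselves.
            if other in share and other not in strong:
                edges.add(other)
    return sorted(edges)


if __name__ == "__main__":
    # Made-up areas and shares, for illustration only.
    share = {"A": 0.7, "B": 0.3, "C": 0.1, "D": 0.6}
    neighbours = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}
    print(edge_communities(share, neighbours))
```

In practice the shares might come from census tables and the neighbour lists from boundary data, but both of those sources are assumptions here.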

Further ideas can be submitted via an ODI Cardiff Slack channel, details of which can be found via @likeaword.

Diolch yn fawr!