And that’s a wrap. Another Open Data Camp has come to an end. Many thanks to the University of Wolverhampton, all the sponsors, the fabulous camp makers and, of course, the participants, for a fun and informative couple of days. Will there be an Open Data Camp 9? Where will it be? Keep your eyes peeled for the answers. And don’t forget that Pauline Roche has a brand new Open Data Camp LinkedIn page going that you can follow to keep the conversation flowing.
Pauline Roche, a librarian and journalist, who has been coming to Open Data Camp since 2015, suggested this session. “For the first time, we have opened a company page on LinkedIn,” she said. “That’s because Twitter has become… less reliable… and we need another place to gather.
“As of this morning, we have 106 people following. So I wondered how people are using LinkedIn and what we could use it for.”
One participant said they had been on LinkedIn for a long time, but that was because they had a background in financial services. “I always see it as part of that formal, business culture, whereas I see Open Data Camp as more part of the counter-culture. So, I am interested in whether we can get over that.”
Lisa Allen said MenopauseX came out of Women in Data, which was set up to address barriers to women taking up employment.
A high proportion of women drop out in their 50s, as they go through perimenopause or menopause. So, MenopauseX is looking for data that can explain this. “We are looking for a smoking gun,” Lisa said.
“We publish stats for population, and aging, but there are other data sets that might be useful, like how many women go part-time, or drop out? How can we get data out of companies to help them tackle these issues? What I want to get out of this session is help – how do we do this?”
One participant suggested one challenge is that it can be hard to disentangle issues connected with the menopause from issues associated with mid-life in general – weariness, caring for elderly relatives, careers stalling.
“Ok, so, I have called this session: how do we grow the workforce we need,” said Adam Locker from National Highways. “Because I am tired of the public sector saying we have no money, we can’t pay enough, and we’ll never get the people we need.
“We are all here. So how do we find bright people, and encourage professionalism, and grow the workforce we need?”
Participants felt one place to start would be to look outside traditional sources of recruitment. One suggested that customer service departments are a great source of engaged people who want to solve problems, and who can be taught specific skills, like how to code.
A challenge to this is the DDAT – the Digital Data and Technology Professional Capability Framework – which sets out what competencies people should have to be recruited into certain roles. However, this only addresses some aspects of digital work. And the session argued that if it needs to be changed “we need to own it.”
Alex Ivanov, a data scientist from Faculty, wanted to talk about some of the technology that has been making waves in the press recently.
Usefully, he started by defining a few terms. “LLMs are a subset of AI models,” he said. “They are trained on vast amounts of text data and they can learn the intricacies of human language to do things like answer questions or search databases. At heart, they are trained to predict the next piece of text.
“Generative AI is a wider thing that can create things that are new, including text, and images, and even drugs: they are very broad. So, in any AI, we are talking about a machine learning from data. And the main difference between normal AI and generative AI is the output.
“In traditional AI, we focus on data and classification, to predict things like whether someone will develop diabetes, or even house prices. Whereas with generative AI we create data that was not there already.
“Where open data comes in is that these models are often trained on big datasets, so it can provide the raw material. However, there are certain challenges. One is data quality. If you just pick up lots of data without thinking about its quality that can cause problems.
“Then, there is privacy. Most open data doesn’t identify individuals, but there are some cases where that can happen. You need standardisation to bring all these sources together. Scalability can be an issue. There are legal issues.
“And we need to think about transparency: some of these AIs are like black boxes, their outputs are almost like magic, so we need to understand what kind of output they are likely to have, and what impact that is likely to make.
“So, I’d like to think about how open data works in this context, and how we address some of these issues around transparency and bias.”
Day one of Open Data Camp 8 finished with drinks at the very fine Great Western Railway pub. And now we’re back at the University of Wolverhampton’s Springfield Campus for day two. Take your seats for another round of pitching and grid development: this unconference will be arriving shortly.
So, following a small incident with a deer leaping over a car parked near the local canal, here comes the outcome of the pitching session.
The climate crisis is the big issue of our times. But how can data be used to get to net zero in time?
One issue is who has datasets that might be useful. The UK and EU are more likely to have useful datasets than countries in the developing world, which makes comparisons between them difficult. Even in the EU, a lot of data will be restricted. Only a few countries, like France, are pushing ahead with open datasets that enable communities to push for change.
Even then, no one dataset will do everything. It might provide an answer to a simple question, like what air quality is like in Wolverhampton. But it won’t provide an answer to a complex question. And climate change is the ultimate complex question. Plus, communities need to know what is available if they are going to use data to apply pressure for change.
Julian Tait, the chief executive of the Open Data Manchester CIC, opened this discussion by saying that “a lot of data comes from a very top-down, managerial perspective.” It “tries to put people in boxes” that “don’t fit their lived experience” and that leads to “poor decision making.”
So, he asked, “how can we as data practitioners make sure data better represents the people we want to serve” and “we don’t get so much sh*t policy.”
Specifically, he said, his organisation is working on a project to tackle violence against women and girls. Often, initiatives in this space focus on better lighting, or more police, when the feedback is that this doesn’t work – and those affected might have completely different ideas.
Lea Gorgulu Webb, a data consultant, wanted to talk about water.
Why? Because Ofwat is asking water companies to publish data as open data. So, she asked, what kind of data would be useful? How raw should that data be? How much curation would be useful? What, in short, should Ofwat, as a regulator, be asking water companies to do?
Participants had more useful background.
The ODI has been working with Ofwat on its H2Open strategy. Some companies are on side, and already publishing information on, for example, storm overflows.
However, there are problems.
This session was led by John Kellas, an expert in community development in healthcare, and tackled the “complicated” subject of healthcare, AI and licensing. He asked people to share anything they felt was important, with a view to making recommendations to policy makers.
“In 2017, I helped run a series of webinars on AI in healthcare,” he said, “and on the back of that I was asked to be part of the Academic Health Science Network core AI advisory group and support the development of a national survey on AI in healthcare.
“I was already interested in open data and open source, so I asked for a small question on licensing to be included in this survey. What we found was that about 38% was proprietary, and much less was open source, although there was a lot of ‘don’t want to say’ or ‘don’t know.’
“Since then, we’ve had a £250 million pot for AI in the NHS, and some vague talk about a value return. But I think there is room for something stronger. Because it’s clear that the data for AI is very valuable, and it’s reasonable to think that patients should get some return for it.
“And at the moment, there seems to me to be an issue around whether the NHS is going to procure AI, or develop it, and how we are going to secure that value is not really clear.”