UK's grand plan to fuel AI with public data faces uphill battle
- Reference: 1775637013
- News link: https://www.theregister.co.uk/2026/04/08/national_data_library_plan/
- Source link:
With misleading titles and non-existent metadata, the data currently available cannot support any meaningful analysis, a study from the Open Data Institute (ODI) found.
In the Autumn Budget of 2024, the government confirmed plans for the National Data Library (NDL), promising researchers and businesses "powerful insights that will drive growth and transform people's quality of life through better public services and cutting‑edge innovation, including AI." In January, it published [1]an update, saying the plan was backed by a £100 million investment as part of £1.9 billion being provided to the Department for Science, Innovation and Technology (DSIT) through 2028/29.
DSIT said it had completed an extensive discovery phase to map out "the biggest opportunities and priorities" and "test approaches to systemic reform" across the public sector.
However, the ODI has published an "NDL-Lite" prototype, with access to more than 100,000 public datasets. It found some of the datasets – particularly on data.gov.uk – are badly labelled, out of date, or effectively invisible to AI tools. When authoritative data is hard to access, AI systems turn to other sources, such as news reports or commercial data, which do not always give accurate information, the ODI warned.
The prototype gathered 38 GB of data from six public sector sources, processing and standardizing more than 100,000 files into a single resource. While the study showed the NDL could be built at relatively low cost, it also highlighted the work needed to make the data AI-ready.
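To give a rough sense of what "AI-ready" checks might involve, here is a minimal sketch of a metadata audit that flags records with empty fields or old update dates. The record shape, field names, two-year staleness threshold and sample entries are all invented for illustration; none of them come from the ODI prototype or from data.gov.uk.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


# Hypothetical record shape, NOT the ODI's or data.gov.uk's actual schema.
@dataclass
class DatasetRecord:
    title: str
    description: Optional[str]
    last_updated: Optional[date]


def audit(record: DatasetRecord, today: date, max_age_days: int = 730) -> list[str]:
    """Return a list of problems found in one metadata record."""
    problems = []
    if not record.title.strip():
        problems.append("missing title")
    if not (record.description or "").strip():
        problems.append("missing description")
    if record.last_updated is None:
        problems.append("missing last_updated")
    elif (today - record.last_updated).days > max_age_days:
        # The 730-day threshold is an arbitrary assumption for this sketch.
        problems.append("stale")
    return problems


# Invented sample records, for illustration only.
records = [
    DatasetRecord("Recorded crime by area", "Quarterly counts", date(2025, 11, 1)),
    DatasetRecord("Crime data", None, date(2018, 6, 30)),
    DatasetRecord("", "", None),
]

today = date(2026, 4, 8)  # fixed date so the result is reproducible
report = {i: audit(r, today) for i, r in enumerate(records)}
```

A record that returns an empty list passes every check; anything else would need fixing at source before an AI tool could reasonably rely on it.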
The study found that even broad terms such as "crime" were difficult to analyze or track properly. Some datasets with that label were local authority statistical releases that could not be combined because of a lack of shared standards. National datasets were also outdated or inaccessible. One major Home Office crime dataset has not been updated since 2018. Although there is an updated version, it cannot be accessed via the API provided by the Office for National Statistics (ONS).
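The shared-standards problem can be illustrated with a small sketch: two releases describing the same thing under different column names cannot be summed until both are mapped onto one schema. The column names, alias table and figures below are hypothetical, made up for this example rather than taken from any real local authority release.

```python
# Hypothetical mapping from publishers' column names to one shared schema.
COLUMN_ALIASES = {
    "area": "area_name",
    "local_authority": "area_name",
    "offences": "offence_count",
    "total_crimes": "offence_count",
    "year": "year",
    "period": "year",
}


def harmonize(row: dict) -> dict:
    """Rename known columns to the shared schema; fail loudly on unknown ones."""
    out = {}
    for key, value in row.items():
        canonical = COLUMN_ALIASES.get(key.strip().lower())
        if canonical is None:
            raise KeyError(f"no mapping for column {key!r}")
        out[canonical] = value
    return out


# Invented rows standing in for two differently labelled releases.
release_a = [{"Area": "Council A", "Offences": 120, "Year": 2024}]
release_b = [{"local_authority": "Council B", "total_crimes": 95, "period": 2024}]

combined = [harmonize(r) for r in release_a + release_b]
total = sum(r["offence_count"] for r in combined)
```

Raising an error on an unmapped column is a deliberate choice here: silently dropping it would hide exactly the kind of inconsistency the study describes.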
[6]GOV.UK chatbot gets smarter but slower as LLMs improve
[7]AI chatbots waffle on GOV.UK queries, then get facts wrong when told to zip it
[8]Irony alert: Anthropic helps UK.gov to build chatbot for job seekers
[9]How the ONS data-sharing dream ended in budget cuts and three rival platforms
Professor Elena Simperl, director of research at the ODI, told The Register that the findings highlight a growing gap between the volume of public data available and its practical usability.
"For crime statistics, the AI agents then went and tried to find crime statistics from somewhere else. If you don't update your data, if your metadata is not good quality and has lots of missing values, we could see from our experiments with the AI agent we built that they would just circumvent the available data. It would go elsewhere on social media and other places to try to find that information in a report somewhere, because it's much easier for them," she said.
"The government's National Data Library has huge potential, but much of the data it would rely on is not yet usable by modern AI systems. If that doesn't change, there is a risk that AI tools will increasingly rely on sources that are easier to access, rather than those that are most reliable."
A government spokesperson told us it wants to "maximise the benefits of public sector data" in a bid to make services "more efficient and grow the economy."
"Reflecting these findings, we're already overhauling the UK's digital public infrastructure through our [11]Roadmap for Modern Digital Government.
"That includes building new infrastructure like the National Data Library in a way that ensures public sector data is shared and used more easily, upgrades to outdated systems and putting new guidance in place for the safe and ethical use of public data."
The National Data Library is the latest project designed to help researchers and data scientists find all the publicly held data they need. Launched in 2004, the Secure Research Service (SRS) offers curated, research-ready datasets to accredited researchers.
In 2020, the government planned to replace this system with the Integrated Data Service (IDS) from the ONS. However, some of its budget of £240.8 million was used – with approval from His Majesty's Treasury – to fund more general tech and data costs as the ONS struggled to get off legacy IT systems. Funding for the IDS [12]was effectively cut in March, although existing services will continue to be available, largely within the ONS, which misses one of the project's major objectives.
The NDL is the new plan for national data sharing to support research, machine learning, and AI. The ODI's study shows the work needed if it is to avoid becoming another missed opportunity. ®
[1] https://www.gov.uk/government/publications/national-data-library-progress-update-january-2026/national-data-library-progress-update-january-2026
[6] https://www.theregister.com/2026/03/19/govuk_chatbot_accuracy/
[7] https://www.theregister.com/2026/02/19/chatbots_too_chatty_government/
[8] https://www.theregister.com/2026/01/29/irony_alert_anthropic_helps_ukgov/
[9] https://www.theregister.com/2025/10/03/ons_data_sharing_mess/
[11] https://roadmap-for-modern-digital-government.campaign.gov.uk/digital-and-data-infrastructure/
[12] https://www.theregister.com/2025/10/03/ons_data_sharing_mess/
Shock horror the AI won't fix
AI really isn't as smart as the smack talk and hype would lead you to believe.
It's got some uses I'll concede, but not the silver bullet that would justify the power and resources it's consuming.
Definitely shock horror, given that “misleading titles and non-existent metadata, the data currently available cannot support any meaningful analysis” is a good description of unstructured data, which is exactly what all the current mass market LLMs have been trained on…
The shock horror is how so many believe: garbage in, intelligent insights out…
National Data Library?
Hell No.... [see Icon]
Shock horror, the AI won't fix the decades of mess, so it can't get accurate info. Starting to feel like it was a good idea not making things work well. The only thing that'll save us from the AI is the poor quality of data accessibility... Thanks, years of bureaucratic misalignment.
I often find the latest benefits always assume you've implemented the previous tower of fads. I have yet to see this anywhere. So wonder who ever sees these benefits they tout...