A few weeks ago, we brought together a group of Deputy Directors, Service Owners and digital leaders from across the civil service for a roundtable on AI in public services, held under the Chatham House Rule. The conversation ran for over an hour and covered everything from governance to climate impact to workforce confidence. Despite coming from different departments and very different delivery environments, everyone kept returning to the same theme.
Whether you have a genuine use case or have been told to ‘just use an LLM’, building something with AI is the easiest part of the whole journey. Everything around it is where the difficulty lies.
This is the first of two pieces capturing the shared insights from that discussion.
Pilots are simple. Reality is not
Almost every team described how quickly they had got a pilot up and running. Some had stood things up in days. One person put it plainly:
“We asked forgiveness, not permission. Once we had something working, we moved through the risk process more quickly.”
The technology came together quickly. The complications arrived afterwards.
People spoke about the moment a prototype touches the operational world: policy expectations, legal obligations, unions, staff confidence, accessibility, climate impact, and the details of how real services work. Those things don’t show up in the demo, but they appear immediately when you try to use something with real users.
Another participant reflected on the structures wrapped around delivery:
“A lot of governance feels like stage gates to get to the next thing.”
Others didn’t have that option at all. One department said:
“We didn’t have a choice. We had to go with a low-burden governance approach.”
In both cases, the pattern was the same. The easy bit was building the thing. The difficult bit was understanding what it meant for the organisation.
The work that actually ends up slowing people down
1. Business change
AI tools shift how work gets done, and that can unsettle teams who have built their expertise over years.
People want to know what it means for them, not just what the tool does. Several participants mentioned the emotional side of adoption, especially for staff whose work is closely tied to professional judgement.
2. Workforce confidence
A topic that came up repeatedly was fear of “getting it wrong”. As one person said:
“There’s a huge proportion of fearful or reticent people.”
Another added that people are far more relaxed experimenting personally than professionally:
“They’re not scared of using AI at home. They’re scared when it affects professional judgement.”
That shift from personal curiosity to professional confidence is a significant hurdle.
3. Governance that can’t keep pace
The group had tried very different approaches. Some embedded governance from the start. Others tested quickly and tried to bring governance along afterwards.
But everyone agreed that existing frameworks struggle with the pace and behaviour of AI systems.
One comment captured this well:
“Governance is driven by future regret minimisation.”
That instinct is understandable, but it can also slow things down in ways that don’t map neatly to how AI services are built.
4. Content, structure and accessibility
Teams discovered unexpected issues once real users got involved. A participant working with face capture described it clearly:
“Taking a photo had loads of accessibility issues. Some of them you just can’t fix.”
Their team documented every issue in the service’s accessibility statement, to be transparent about what couldn’t be fixed.
Several other teams had similar stories about content structure, information architecture and the role they play in how AI behaves. The more complex the environment, the more important the basics become.
5. Support and lifecycle
One comment raised an important question many teams hadn’t yet faced:
“Once something is out there, how do you turn it off?”
The group talked about pilots that grew faster than expected, and the sudden need for support models, monitoring, ownership and ways of managing things that had moved beyond the “experiment” stage.
6. Sustainability
A smaller but important thread focused on climate impact. One team said:
“We want users to see the footprint when they interact with generative AI.”
Others admitted they didn’t yet have the data to quantify usage, but knew it would matter in future.
The pattern behind all of this
Nobody at the roundtable struggled with the models themselves. They struggled with the organisational, behavioural and structural realities that sit around the models.
The things that make services work. The things that pilots are allowed to ignore.
Part two of this series will summarise the practical approaches teams are taking to build more stable, responsible and usable AI-enabled services.