The Seduction of the Demo (3 of 4)
Why impressive prototypes rarely translate into lasting scale.
This post is part of a four-part series: Thinking Clearly About AI. The series doesn’t try to predict the future. It looks at patterns from past disruptions and asks better questions about the present. Each post explores one idea.
Demos are seductive.
I once saw a demo of an AI voice agent called Maya. It spoke so naturally that executives in the room thought it was a person. They were stunned. Everyone wanted it deployed immediately.
Six months later, the project was stuck. Compliance demanded more testing. IT couldn’t integrate it with legacy systems. Employees didn’t trust it. The demo was flawless. The scale was impossible.
This is the trap: demos show possibility without friction. Reality is friction.
Case Study: Microsoft Copilot
In 2023, Microsoft rolled out Copilot, an AI assistant embedded in Word, Excel, Outlook and Teams. The demos were dazzling: instant slide decks, emails drafted in seconds, pivot tables generated from natural language. Executives saw productivity gains everywhere.
The early results are mixed. Microsoft’s own research (2024) shows:
90% of users who tried Copilot reported productivity gains.
But only 15% of enterprise employees used it weekly.
In some firms, licenses were bought at scale but usage remained under 10%.
Why? Three reasons:
Cultural inertia: Employees default to old habits, even when new tools are better.
Governance delays: Legal and compliance departments hold back deployment until risk frameworks are clear.
Integration gaps: Copilot is powerful, but if your data is siloed, the AI can’t deliver useful insights.
When Copilot works, it changes workflows. Teams can move from manual reporting to real-time decision support. But most organisations haven’t redesigned how work gets done. They’ve layered Copilot onto existing processes.
The risk is the same as we saw with earlier tools: technology outpaces adoption. Data lakes promised “the segment of one” in marketing. Billions were spent. Most firms still blast generic campaigns. Why? The operating model didn’t change.
The lesson: Productivity demos are easy. Redesigning organisational models to capture the value is hard.
Case Study: NHS
Healthcare AI has delivered some of the most impressive demos. Algorithms can read X-rays and MRIs with accuracy on par with radiologists. Google's DeepMind published results in Nature (2020) showing AI could outperform radiologists in detecting breast cancer on mammograms. The NHS partnered with multiple firms to test these systems.
Five years later, most UK hospitals still rely primarily on human radiologists. A 2023 report from the UK House of Lords noted that despite promising pilots, scaling AI in the NHS faced hurdles:
Integration challenges with hospital IT systems.
Regulatory delays around liability and patient safety.
Workforce resistance: radiologists feared replacement, so adoption slowed.
Some pilots, like Babylon Health’s AI triage chatbot, ended in collapse, with the company going bankrupt in 2023 after overpromising and underdelivering.
The NHS’s difficulty highlights the core problem: layering AI onto broken systems doesn’t work. Diagnostic AI isn’t valuable if workflows, liability frameworks and staffing models don’t adapt.
Mayo Clinic’s infection-detection AI (see post 2) works better because they redesigned the pathway: patients submit photos, AI triages, and only at-risk cases reach clinicians. That’s a model shift.
The NHS mostly tried to drop AI into existing workflows. That’s why demos remained demos.
Case Study: Smart Cities and IoT
A decade ago, “smart city” demos were everywhere. IoT sensors showed traffic lights optimising flow, bins signalling when they were full and energy systems balancing demand in real time. Consultants promised urban utopias.
Billions were spent on pilots in Barcelona, Singapore and Songdo in South Korea. The results? Fragmented at best.
Barcelona’s projects delivered marginal efficiency gains but stalled at scale because of funding gaps and political turnover.
Songdo, a $40 billion “built-from-scratch” smart city, ended up half empty, its smart systems underused.
In Canada, Sidewalk Labs (Google’s smart city arm) abandoned its Toronto project in 2020 after regulatory backlash and citizen privacy concerns.
The technology worked. The governance and funding didn’t. IoT sensors and platforms could do amazing things in demos, but cities couldn’t align incentives, budgets and accountability.
Sound familiar? That’s where many AI enterprise projects are headed if leaders focus on tools without redesigning systems.
Why Scale Fails
The Copilot, NHS and smart city stories point to the same dynamics:
Culture: Employees don’t trust the tools, don’t change their habits, or don’t see incentives to adopt.
Structure: Legacy processes, compliance barriers, siloed data.
Governance: No clarity on liability, ethics or accountability.
Ecosystem: Demos exist in isolation; scale requires integration across messy, real systems.
This is why scale is harder than demos. Technology is the easy part. Organisations are the hard part.
AI is following the same script. The danger is not that it won’t work. The danger is that leaders confuse demos with transformation, then lose patience when adoption lags.
The Question for You
So when you see the next demo (whether it’s a Copilot, a triage bot or an agent workflow) pause and ask:
What percentage of our workforce will actually use this weekly?
What workflows must change for this to deliver value?
What governance is required before regulators intervene?
Who is accountable if it fails?
Until you have answers to those, you don’t have transformation. You have a prototype.