
The AI Bill Is Coming Due: What UK Small Businesses Should Know About Rising AI Costs

For the past few years, many AI tools have felt surprisingly affordable.


Pay a monthly fee. Ask questions. Generate ideas. Summarise documents. Write code. Try again. Try again differently. Try again with “make it punchier”. No visible meter ticking away in the corner.


That era may not be ending overnight, but it is changing.


Some AI tools are moving away from simple flat-rate access and toward usage-based, token-based, or credit-based pricing. That matters because the way businesses use AI is also changing. We are no longer just asking chatbots one-off questions. We are starting to use AI agents, coding assistants, long-document analysis, workflow automation, and tools that can run multi-step tasks in the background.


In plain English: the cleverer the AI workflow becomes, the easier it is to lose track of what it costs.



The pricing shift is already visible


A clear example is GitHub Copilot. GitHub has announced that from 1 June 2026, all Copilot plans will move to usage-based billing using GitHub AI Credits. Usage will be calculated from token consumption, including input, output, and cached tokens. GitHub says this change is needed because Copilot has evolved from a simple coding assistant into an agentic platform capable of long, multi-step coding sessions, which creates much higher compute and inference demand.


That wording is important. GitHub is not saying AI has become useless or unaffordable. It is saying the old pricing model no longer fits the way people now use the tool.


This is the wider pattern small businesses should notice. AI pricing is becoming more closely linked to actual usage. A quick question and a long automated workflow are not the same thing, even if they used to feel similar from the customer’s point of view.


Anthropic’s Claude pricing is also token-based, with different prices depending on the model used. More powerful models cost more than smaller, faster models. That is normal for AI APIs, but it means model choice becomes a business decision, not just a technical detail.


OpenAI’s API pricing follows the same broad logic: usage is priced by model and by tokens, with different costs for input, cached input, and output.


Google Gemini shows the same picture from a slightly different angle. Google’s Gemini API still includes a free tier for developers and small projects, while paid users get higher rate limits, access to context caching, batch API options, and Google’s more advanced models. Google’s own pricing page makes clear that paid Gemini usage is priced by tokens, with different rates depending on the model and mode used.


The direction of travel is clear: if your business starts using AI heavily, the question is no longer just “Which tool should we use?” It is also “What are we actually asking it to do, how often, and with how much data?”
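To make token-based pricing concrete, here is a minimal Python sketch of how a bill for one request might be estimated when input, cached input, and output tokens are priced separately. The rates, the `TokenRates` class, and the `request_cost` function are illustrative assumptions, not any provider's real prices or API. Check the current pricing page for the model you actually use.

```python
from dataclasses import dataclass


@dataclass
class TokenRates:
    """Prices in pounds per million tokens (made-up placeholder values)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def request_cost(rates, input_tokens, cached_input_tokens, output_tokens):
    """Estimate the cost of one request from its three token counts."""
    total = (
        input_tokens * rates.input_per_m
        + cached_input_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    )
    return total / 1_000_000


# Hypothetical rates for a smaller model and a more powerful one.
small_model = TokenRates(input_per_m=0.50, cached_input_per_m=0.05, output_per_m=2.00)
large_model = TokenRates(input_per_m=5.00, cached_input_per_m=0.50, output_per_m=20.00)

# The same request sent to both models:
# 20,000 fresh input tokens, 80,000 cached input tokens, 2,000 output tokens.
small = request_cost(small_model, 20_000, 80_000, 2_000)
large = request_cost(large_model, 20_000, 80_000, 2_000)

print(f"Small model: £{small:.3f} per request")
print(f"Large model: £{large:.3f} per request")
```

With these placeholder numbers the larger model costs ten times as much for an identical request. The exact ratio is invented, but the shape of the calculation is why model choice is a budgeting decision once requests run into the thousands.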




Why token costs can creep up


A token is a small chunk of text. It might be part of a word, a full word, or a bit of punctuation. Every prompt you send to an AI model uses tokens. Every answer it gives back uses tokens. If the model is given a long document, previous chat history, tool instructions, or background context, those count too.


For a normal user asking occasional questions, this is usually not a problem.


The cost issue appears when AI is used repeatedly, automatically, or with large amounts of context. For example, an AI agent working through a coding task may not just answer once. It may inspect files, generate a plan, write code, run checks, read errors, retry, update its approach, and continue.


Each step adds more context. The agent has to keep re-reading what has already happened so it can decide what to do next. That creates a compounding effect.
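A rough way to see the compounding effect is to count the tokens an agent reads if every step re-reads everything before it. The sketch below is a simplified model, not a measurement of any real tool. The function name and all the numbers are assumptions chosen for illustration, and real products may cache or trim context, which would lower the totals.

```python
def agent_input_tokens(steps, base_context, tokens_added_per_step):
    """Total input tokens when each step re-reads the full history so far."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += context                   # the model reads the whole context on this step
        context += tokens_added_per_step   # new plans, code, and error logs are appended
    return total


# A single question with 2,000 tokens of context.
one_shot = agent_input_tokens(steps=1, base_context=2_000, tokens_added_per_step=1_500)

# A 30-step agent run that starts from the same context
# and adds 1,500 tokens of history at every step.
agent_run = agent_input_tokens(steps=30, base_context=2_000, tokens_added_per_step=1_500)

print(f"One-off question: {one_shot:,} input tokens")
print(f"30-step agent run: {agent_run:,} input tokens")
print(f"Ratio: about {agent_run / one_shot:.0f}x")
```

Thirty steps do not cost thirty times as much as one. Because the history keeps growing, the total rises much faster than the step count, which is the pattern to watch for in long agent sessions.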


A recent paper on agentic coding tasks found that AI agents can consume around 1,000 times more tokens than simpler code chat or code reasoning tasks. It also found that token usage can vary widely between runs of the same task, meaning cost is not always easy to predict in advance.


That does not mean AI agents are bad. It means they need boundaries.


Without those boundaries, a business can end up with the AI equivalent of leaving the heating on, the windows open, and then wondering why the bill looks like a minor plot twist.



This does not mean cloud AI is suddenly wrong


It would be easy to turn this into a simple argument: cloud AI is expensive, so everyone should run AI locally.


That would be too neat. And wrong.


Cloud AI still has major advantages. It gives access to the most capable models, regular updates, strong integrations, and tools that work without needing to manage hardware. For many small businesses, cloud AI will remain the best default option.


If you use AI occasionally, or mainly for writing, brainstorming, customer service drafts, marketing ideas, spreadsheet explanations, or general productivity support, a cloud tool may still be the simplest and most cost-effective route.


The risk comes when cloud AI quietly becomes part of core operations without anyone checking the usage model.


That is especially true for AI coding assistants, automated research workflows, customer support bots, document processing, or anything that runs repeatedly in the background.



Where local AI comes back into the conversation


Local AI means running AI models on your own device, rather than sending every prompt to a cloud provider.


It has obvious limitations. You need suitable hardware. Setup can be fiddly. Local models may not match the best cloud models for complex reasoning, creative work, or difficult analysis. Someone has to manage updates and troubleshooting.


But local AI also has real advantages. Once the hardware is in place, there is no per-token charge for every prompt. Sensitive data can stay on your machine. You are less exposed to sudden pricing changes, outages, or usage restrictions from a provider.


That makes local AI particularly interesting for predictable, repetitive, or privacy-sensitive work.


For example, a small business might use local AI for first-draft summaries, internal document triage, simple classification, data-cleaning notes, or private experimentation. It might still use cloud AI for final-quality writing, complex strategy, client-facing outputs, or tasks that need the most capable model available.


That is where the hybrid model starts to make sense.
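The trade-off in this section between an upfront hardware cost and an ongoing per-token bill can be sanity-checked with a simple break-even sum. The sketch below is a back-of-envelope calculation only. The function name and every figure are hypothetical, and it ignores setup time, maintenance effort, and any difference in output quality between local and cloud models.

```python
import math


def breakeven_months(hardware_cost, monthly_running_cost, monthly_cloud_cost):
    """Months until local hardware pays for itself, or None if it never does."""
    monthly_saving = monthly_cloud_cost - monthly_running_cost
    if monthly_saving <= 0:
        return None  # cloud is already cheaper each month, so there is no payback
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical figures in pounds: a £2,000 machine costing £15 a month to run.
light_use = breakeven_months(hardware_cost=2_000, monthly_running_cost=15, monthly_cloud_cost=10)
heavy_use = breakeven_months(hardware_cost=2_000, monthly_running_cost=15, monthly_cloud_cost=215)

print("Light cloud use:", light_use)              # None: the hardware never pays back
print("Heavy cloud use:", heavy_use, "months")
```

With these invented numbers, a business spending £10 a month on cloud AI never recovers the hardware cost, while one spending £215 a month does so in under a year. The point is to run the sum with your own figures before buying anything.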



The smarter answer is hybrid AI


The best question is not “cloud or local?”


It is “which work belongs where?”


A practical small business AI setup might look like this:


Cloud AI for complex reasoning, polished outputs, collaboration, web-connected work, and tasks where quality matters more than cost.


Local AI for private drafts, repeatable internal tasks, sensitive documents, offline use, and experimentation where good-enough output is enough.


Human review for decisions, judgement, compliance, client communication, and anything where a wrong answer could cause harm.


That last part matters. AI can support decisions, but it does not verify them. Cost control is only one side of responsible AI use. Quality, privacy, accountability, and review still matter.
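For a business that wants to write the split down as a rule, the three-way division above could be sketched like this in Python. The `Task` fields and the `route` function are assumptions made for illustration. In particular, sending sensitive work to local AI even when top quality is wanted is one possible policy, not the only sensible one.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    sensitive: bool = False          # contains client or confidential data
    needs_top_quality: bool = False  # polished or complex work
    high_stakes: bool = False        # a wrong answer could cause harm


def route(task):
    """Return (where the AI work runs, whether a person must review it)."""
    if task.needs_top_quality and not task.sensitive:
        where = "cloud"
    else:
        where = "local"  # sensitive or routine work stays on your own machine
    return where, task.high_stakes


examples = [
    Task("Client proposal", needs_top_quality=True, high_stakes=True),
    Task("HR file summary", sensitive=True, needs_top_quality=True),
    Task("Inbox triage"),
]

for task in examples:
    where, needs_review = route(task)
    review = "human review required" if needs_review else "spot checks only"
    print(f"{task.name}: {where}, {review}")
```

Human review is kept as a separate flag rather than a third destination, because a high-stakes task still needs a person to sign it off whichever kind of AI produced the draft.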


Infographic: The AI Bill Is Coming Due. Cloud AI and local AI are weighed on a scale, with hybrid AI shown as the way for small businesses to balance usage costs against hardware costs.


What small businesses should do now


The first step is not to panic. The second step is not to buy a powerful AI workstation because someone on the internet said “local AI is the future” while standing next to a graph and a suspiciously expensive GPU.


The practical step is to audit your AI use.


Ask:


  • What AI tools are we using?

  • Are they flat-rate, usage-based, token-based, or credit-based?

  • Which features consume usage credits?

  • Are we using agents, automation, or long-document workflows?

  • Do we have spending caps or admin controls switched on?

  • Are we sending sensitive business or client data into cloud tools?

  • Could some routine tasks be handled locally instead?


Even a simple monthly review can help. AI costs should be treated like any other business subscription or software cost. If it is becoming part of your operations, it needs ownership.
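A monthly review does not need special software. The sketch below shows one way it could work: compare what each tool cost this month with the budget you set for it, and flag anything over, close to the limit, or with no budget at all. The tool names, amounts, and the 80% warning threshold are all hypothetical, and the spend figures would come from your own invoices or admin dashboards.

```python
def monthly_review(spend_by_tool, budget_by_tool, warn_at=0.8):
    """Flag tools that are over budget, close to it, or have no budget set."""
    flags = {}
    for tool, spend in spend_by_tool.items():
        budget = budget_by_tool.get(tool)
        if budget is None:
            flags[tool] = "no budget set"
        elif spend > budget:
            flags[tool] = "over budget"
        elif spend >= warn_at * budget:
            flags[tool] = "close to budget"
    return flags


# Hypothetical monthly figures in pounds.
spend = {
    "chat assistant": 18.0,
    "coding agent": 96.0,
    "support bot": 61.0,
    "api experiments": 12.5,
}
budget = {
    "chat assistant": 30.0,
    "coding agent": 80.0,
    "support bot": 75.0,
}

flags = monthly_review(spend, budget)
for tool, status in flags.items():
    print(f"{tool}: {status}")
```

The "no budget set" flag matters as much as the overspend. It catches the tool somebody started using without anyone deciding what it should cost, which is the ownership gap described above.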



The real lesson: AI strategy now includes cost strategy


AI adoption is no longer just about finding clever tools.


It is about deciding where AI genuinely helps, where it adds risk, and where the running costs make sense.


For small businesses, the risk is not that AI becomes unusable. The risk is sleepwalking into workflows whose costs are harder to predict and control than expected.


A £20 or £30 monthly tool is easy to approve. A growing stack of subscriptions, usage credits, API calls, agent runs, and hidden workflow costs is a different matter.


That does not mean businesses should avoid AI. It means they should use it deliberately.


The winners will not be the businesses that use the most AI. They will be the ones that understand which AI work is worth paying cloud prices for, which work can be done locally, and which work still needs a human in the loop.



Need help making sense of your AI setup?

If your business is starting to use AI tools but you are unsure what should stay in the cloud, what could run locally, and where the risks are, Mercia AI can help.


Our AI Readiness Consultation helps you assess your current position, identify practical opportunities, and understand the cost, privacy, and workflow implications before you scale your AI use.


AI should make your business clearer, not more complicated.



