Inference costs for large language models have dropped by more than half over the past twelve months. You would expect AI budgets to follow. They have not. Most agencies and product teams are spending more on AI than they were a year ago, not less. That gap is worth examining closely, because it tells you something useful about how organisations actually adopt new technology versus how they plan to.
The short version: cheaper inference does not mean cheaper AI. It means more AI. And more AI, run without a clear architecture, compounds your costs in ways that a per-token price cut does not fix.



