Inference budgets are overrunning by "orders of magnitude" - what now?

Goldman Sachs: "Industry is countering by redistributing value [via] open source distilled models and the focus on roadmaps weighted to proprietary SLMs..."

By Edward Targett

Updated Apr 21, 2026 2:20 PM / Published Apr 20, 2026 4:49 PM

Image credit: https://unsplash.com/@towfiqu999999

Many companies are overrunning their initial AI inference budgets “by orders of magnitude,” according to equity analysts at Goldman Sachs.

That will come as little surprise to many CTOs and engineering leaders, but the scale at which costs are running away from organisations is significant.

Inference costs were approaching 10% of engineering headcount cost in one software company the investment bank surveyed – and the firm said it expects them to reach parity with total engineering headcount cost soon.

That’s according to a 42-page research paper for clients, seen by The Stack.

Goldman’s team met with 30 private companies and 10 public companies, including Autodesk, CrowdStrike, Rubrik, Workday, and Zscaler, as well as leading VCs, to discuss the trends they are seeing around inference use and the extent to which software firms have a “moat” against AI dominance.

“The industry is countering”


Edward Targett

Ed is a co-founder of The Stack. He previously edited Computer Business Review. He has also covered energy markets. He started his journalism career on local papers. He left school at 15 and has made a living asking "but why?" ever since.