Yesterday I tried the obvious cost-saving move: run Ollama on the Mac mini and see how far local models could take OpenClaw.
The idea was simple. Frontier models are getting too expensive to use as the default engine for an assistant that reads files, browses, writes, routes tasks, and stays active for long stretches. So the question was whether local inference could carry more of the load.
Today I pushed the same experiment a bit further on a slightly more powerful Mac. Same goal, same question: can we get something cheap enough to run every day without the quality dropping off a cliff?
The answer, at least for now, is: not quite.
Local models are real. They are useful. They can absolutely handle part of the workload. But once the assistant stops being a toy and starts doing real work, the bar rises fast. Long context, tool use, reliability, writing quality, coding quality, and session stability all matter at once.
At the same time, Anthropic finally did the thing everyone knew was coming: Claude subscriptions no longer cover OpenClaw and other third-party harnesses. That was inevitable. A subscription is a flat monthly product for human usage. An autonomous agent is infrastructure. Those are not the same economics.
So after trying the Mac mini yesterday and a stronger Mac today, we ended up in the most pragmatic place: Ollama Cloud.
Not because it is the purest option. Because it works.
Right now, that looks like the right compromise: use a cheaper cloud model through Ollama for the bulk of the work, and stop pretending that a consumer subscription can power a serious agent system forever.
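To make that compromise concrete, here is a minimal sketch of what "cheap model for the bulk of the work" could look like as a routing rule. Everything in it is an assumption for illustration: the model names are placeholders, the task kinds and the context threshold are made up, and none of it is OpenClaw's or Ollama's actual configuration.

```python
from dataclasses import dataclass

# Placeholder names, not real model identifiers.
BULK_MODEL = "cheap-cloud-model"
FRONTIER_MODEL = "frontier-model"

# Hypothetical task kinds that might justify the expensive model.
HARD_KINDS = {"coding", "long_form_writing"}


@dataclass
class Task:
    kind: str                # e.g. "browse", "read_file", "coding"
    context_tokens: int = 0  # rough size of the context the task needs


def pick_model(task: Task, long_context_threshold: int = 100_000) -> str:
    """Send most tasks to the cheap model; escalate only the hard ones.

    The threshold is an illustrative number, not a measured limit.
    """
    if task.kind in HARD_KINDS or task.context_tokens > long_context_threshold:
        return FRONTIER_MODEL
    return BULK_MODEL
```

The point of a rule like this is that the default is the cheap path, and the expensive model has to be justified per task rather than assumed.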
That seems to be the real lesson here.
The subscription was never infrastructure.