Restaurant / Marketing · $500K-$2M
How a Restaurant Marketing Studio Runs Its Entire Content Stack Offline
Fully offline content engine on a single iMac. No OpenAI bill, no Claude bill, no cloud dependency, same-quality outputs.
The problem
A restaurant marketing studio was spending around $1,200/month on cloud AI for content generation. Margins in hospitality-adjacent marketing are already thin. The owner asked a simple question: can we run this locally and keep the money?
The stack
Self-hosted GPT-OSS-20B (with Microsoft's Phi as a fallback) running on a single iMac. Qdrant as the vector store. n8n orchestrating the content pipeline. Docker for everything. Cloudflare Tunnel for secure remote access.
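A stack like this is typically wired together with a single Compose file. The sketch below is illustrative only: the image choices (Ollama as the model server), ports, and volume paths are assumptions, not the studio's actual configuration.

```yaml
# Illustrative compose file -- images, ports, and volumes are assumptions.
services:
  llm:
    image: ollama/ollama            # serves GPT-OSS-20B (or Phi as fallback)
    volumes:
      - ./models:/root/.ollama      # keep model weights on the host
    ports:
      - "11434:11434"
  qdrant:
    image: qdrant/qdrant            # vector store for content embeddings
    ports:
      - "6333:6333"
    volumes:
      - ./qdrant:/qdrant/storage
  n8n:
    image: n8nio/n8n                # orchestrates the content pipeline
    ports:
      - "5678:5678"
  cloudflared:
    image: cloudflare/cloudflared   # remote access without opening ports
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${TUNNEL_TOKEN}
```

Everything runs on the one iMac; the tunnel is the only path in from outside, so no inbound ports need to be exposed.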
The architecture
The decision everyone gets wrong on self-hosted LLMs is model size. Teams pick the biggest model they can run, which eats RAM, slows iteration, and makes the developer experience miserable. We went the other way: we tested four models under 30B parameters, picked the one that best reproduced restaurant-voice copy, accepted that we'd re-prompt more often, and got roughly 10x the throughput.
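The "re-prompt more often" trade-off boils down to a retry loop: generate with the small model, run a cheap quality gate, and regenerate if the draft misses the voice. A minimal sketch follows; `generate` and `passes_voice_check` are hypothetical stand-ins for the local model call and the studio's actual quality check.

```python
def generate(prompt: str, attempt: int) -> str:
    # Stand-in for a call to the local model server.
    return f"draft {attempt} for: {prompt}"

def passes_voice_check(draft: str) -> bool:
    # Stand-in for a cheap heuristic gate (banned phrases, length, tone words).
    return "draft 2" in draft  # pretend the second attempt lands

def generate_with_retries(prompt: str, max_attempts: int = 4) -> str:
    """Re-prompt until the draft passes the voice check or attempts run out."""
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(prompt, attempt)
        if passes_voice_check(draft):
            return draft
    return draft  # best effort: hand the last draft to a human editor

print(generate_with_retries("weekend brunch special post"))
```

With a 10x-faster model, even three or four retries per post is cheaper in wall-clock time than a single pass through a 70B-class model.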
The result
$0/month in cloud AI cost. Same output quality for the content tasks that matter (social posts, SEO briefs, email subject lines). Owner now has an asset that generates content forever without ever hitting an external API rate limit.
The cost math
Hardware: existing iMac. Monthly run cost: electricity. Replaced: around $1,200/mo cloud AI bill plus around $200/mo in just-in-case tier upgrades. Payback period: under 3 months.
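The payback arithmetic is simple enough to show. The monthly savings come straight from the figures above; the one-time setup cost is a hypothetical placeholder (migration labor, since the iMac itself was already owned), chosen only to illustrate how "under 3 months" falls out.

```python
# Back-of-the-envelope payback math. The setup cost is a hypothetical
# assumption (migration/setup labor); hardware capex was $0 (existing iMac).
cloud_ai_bill = 1200      # $/month cloud AI bill, replaced
tier_upgrades = 200       # $/month "just in case" tier upgrades, replaced
monthly_savings = cloud_ai_bill + tier_upgrades  # $1,400/month

setup_cost = 4000         # hypothetical one-time setup cost, $
payback_months = setup_cost / monthly_savings
print(f"payback: {payback_months:.1f} months")
```

At $1,400/month in avoided spend, any setup cost under $4,200 pays back inside three months.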