The Real Unit Economics of AI Apps

Virtual Minds Engineering

The default assumption in consumer SaaS is that variable cost per user is roughly zero — once the platform is built, each marginal user is free to serve. AI applications break that assumption hard. Every active user costs real money in inference. Here is what we actually see across seven products at Virtual Minds.

The cost shape of a typical AI app

For a representative product like Room AI, the per-user monthly cost during active usage breaks down roughly:

  • Image generation — by far the largest line item. Each generation costs a few cents at provider rates; an engaged user runs 10-30 generations per month.
  • Language model inference — prompt construction, structured output extraction, style suggestions. With Claude caching enabled, this is meaningful but not dominant.
  • Storage + bandwidth — Cloud Storage for generated images, Firestore reads for user library, CDN egress. Real but contained.
  • Auth + platform infrastructure — close to zero per user.
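The breakdown above can be sketched as a simple cost model. All rates and usage figures below are hypothetical placeholders chosen to match the qualitative shape described (image generation dominant, platform infrastructure near zero), not actual provider pricing:

```python
# Illustrative per-user monthly cost model for an app like Room AI.
# Every number here is a placeholder, not a real rate.

def monthly_cost_per_user(
    generations: int = 20,             # engaged user: 10-30 generations/month
    cost_per_generation: float = 0.04, # "a few cents" at provider rates
    llm_cost: float = 0.15,            # prompts + extraction, with caching
    storage_bandwidth: float = 0.05,   # storage, Firestore reads, CDN egress
    platform_infra: float = 0.01,      # auth + shared infrastructure
) -> dict:
    image_cost = generations * cost_per_generation
    total = image_cost + llm_cost + storage_bandwidth + platform_infra
    return {
        "image_generation": image_cost,
        "llm_inference": llm_cost,
        "storage_bandwidth": storage_bandwidth,
        "platform_infra": platform_infra,
        "total": round(total, 2),
    }
```

With these placeholder numbers, image generation is roughly 80% of the per-user cost, which is why the optimizations below focus there first.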

For a free user who converts to paid, the math has to work within a roughly 6-month average subscriber lifetime, net of App Store or Play Store platform fees.
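Concretely, that constraint can be written down as a lifetime-margin check. The 6-month lifetime comes from the post; the subscription price and serving cost are hypothetical, and the 30% store fee is the standard headline rate (15% under the stores' small-business programs):

```python
# Does a converted subscriber cover their own serving cost over their lifetime?
# monthly_price and monthly_serving_cost are hypothetical examples.

def lifetime_margin(
    monthly_price: float,
    store_fee_rate: float = 0.30,       # App Store / Play Store headline cut
    lifetime_months: float = 6.0,       # average subscriber lifetime
    monthly_serving_cost: float = 1.01, # fully-loaded cost per active month
) -> float:
    net_revenue = monthly_price * (1 - store_fee_rate) * lifetime_months
    serving_cost = monthly_serving_cost * lifetime_months
    return net_revenue - serving_cost
```

This lifetime margin still has to absorb free-tier burn and acquisition spend, which is why the free-tier and cascade decisions below matter so much.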

Why the cheap model cascade matters

The single highest-leverage cost optimization we have made is the model cascade described in our Claude post. Routing the majority of traffic to Claude Haiku and escalating to Sonnet only when Haiku's output fails validation cuts language model cost by roughly 70 percent versus defaulting to Sonnet.
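The control flow is simple enough to sketch. The `call_model` and `validate` helpers below are stand-ins for a real provider client and a real schema check, not an actual API:

```python
# Cheap-model cascade: try the small model first, escalate only when its
# output fails validation. call_model and validate are placeholder stubs.

def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real provider call; tags output with the model name.
    return f"{model}:{prompt}"

def validate(output: str) -> bool:
    # Stand-in for real validation, e.g. a schema check on structured output.
    return "hard" not in output

def cascade(prompt: str) -> str:
    draft = call_model("claude-haiku", prompt)
    if validate(draft):
        return draft                            # common, cheap path
    return call_model("claude-sonnet", prompt)  # rare, expensive escalation
```

The economics depend on the escalation rate staying low: if most requests fall through to the expensive model, you pay for both calls.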

For image generation, the same principle applies — different models for different fidelity requirements. A preview thumbnail does not need the highest-quality model.

The free tier trap

The hardest unit-economics decision for any AI app is how generous the free tier should be. Too generous and you bleed money on users who never convert. Too stingy and you lose the demo-driven conversion mechanic that AI apps depend on.

Our current approach: free tier covers the first session generously, then ratchets down rapidly. Free users see enough quality to want to subscribe; they do not get a permanent free product disguised as a trial.
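A ratcheting free tier reduces to a quota schedule keyed on session number. The specific numbers here are invented for illustration; the shape (generous first session, rapid decline, then zero) is what the text describes:

```python
# Hypothetical free-tier schedule: generous first session, rapid ratchet.
# The quota numbers are placeholders, not our actual limits.

def free_generations_allowed(session_number: int) -> int:
    schedule = {1: 10, 2: 3, 3: 1}
    return schedule.get(session_number, 0)  # nothing after session 3
```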

What we measure weekly

Every Monday morning the internal dashboard reports for each product:

  • Cost per active user (CPAU) — total infra + AI cost / monthly active users
  • Cost per converting user (CPCU) — same total cost, but with the denominator restricted to users who subscribed
  • Gross margin per active subscriber — average revenue per paid user minus their fully-loaded cost
  • Free-tier burn — total cost attributable to users who have never paid
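The four metrics can be computed from per-user records. The `UserMonth` record type is a hypothetical shape for illustration; the formulas follow the definitions above:

```python
# Computing the four weekly metrics from hypothetical per-user records.
from dataclasses import dataclass

@dataclass
class UserMonth:
    cost: float         # fully-loaded infra + AI cost this month
    revenue: float      # 0.0 for free users
    has_ever_paid: bool

def weekly_metrics(users: list[UserMonth]) -> dict:
    total_cost = sum(u.cost for u in users)
    payers = [u for u in users if u.revenue > 0]
    never_paid = [u for u in users if not u.has_ever_paid]
    return {
        # total cost spread over all active users
        "cpau": total_cost / len(users),
        # same total cost, denominator restricted to subscribers
        "cpcu": total_cost / len(payers) if payers else 0.0,
        # average revenue per paid user minus their fully-loaded cost
        "gross_margin_per_subscriber":
            sum(u.revenue - u.cost for u in payers) / len(payers)
            if payers else 0.0,
        # total cost attributable to users who have never paid
        "free_tier_burn": sum(u.cost for u in never_paid),
    }
```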

These four numbers do more to drive engineering decisions than any roadmap document. If CPAU is climbing faster than ARPU, something is wrong — usually a model cascade that has drifted toward escalating too often, or a feature that quietly added expensive inference paths.
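The "CPAU climbing faster than ARPU" check can be encoded as a simple month-over-month guardrail. This is a sketch of the comparison, not our actual alerting code:

```python
# Flag when cost per active user is growing faster than revenue per user.
# Inputs are month-over-month snapshots; tolerance is a hypothetical knob.

def cost_growth_alert(
    cpau_now: float, cpau_prev: float,
    arpu_now: float, arpu_prev: float,
    tolerance: float = 0.0,
) -> bool:
    cpau_growth = cpau_now / cpau_prev - 1
    arpu_growth = arpu_now / arpu_prev - 1
    return cpau_growth > arpu_growth + tolerance
```

When this fires, the first things to inspect are the cascade escalation rate and any recently shipped features that added inference paths.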

The path to durable margin

Our long-term margin story rests on three things:

  1. Model prices keep falling. Claude Sonnet is roughly 50% cheaper than it was 18 months ago. This is not unique to us — every AI app benefits.
  2. Platform leverage compounds. Each new product makes the Cortex platform investment more leveraged. The fifth app shares the same infrastructure as the first.
  3. Distribution costs amortize. Marketing spend on one app surfaces users for the others through cross-promotion within the suite.

If you are building a single AI app today, the unit economics will get better as model prices fall. If you are building a portfolio of AI apps, you also benefit from the platform compounding. We picked the second strategy on purpose.
