The AI SaaS Stack on Vercel in 2026: What We Actually Use to Ship Production AI Apps
Founders and CTOs ask us the same question every quarter: what stack would you pick today, in 2026, if you were starting an AI SaaS from scratch?
This is the answer — written after shipping rubrica.app on this exact stack. It is opinionated, narrow, and boring on purpose. The interesting work in an AI product is the AI pipeline and the unit economics, not the framework. The job of the stack is to get out of the way.
Built on this exact stack
Rubrica: AI Rubric Feedback for Students
rubrica.app is the production AI SaaS we shipped on the exact stack described below — Next.js, Vercel AI SDK, AI Gateway, Fluid Compute, Stripe credits. If you want a SaaS like it, this is the kind of work we deliver.
Frontend & app — Next.js App Router on Vercel
Default to Next.js (App Router) on Vercel. Reasons in order of importance:
• Server Components — your AI calls live on the server, your tokens never touch the client, and your bundle stays small.
• Streaming — long-running AI responses stream back to the UI without bespoke websocket infra.
• Server Actions — mutations ("start a check", "buy credits") become a one-file end-to-end thing instead of a REST shuffle.
• Cache Components — the static shell of every page is prerendered; only the live data wraps in Suspense, as in the sketch below.
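A minimal sketch of that shape, assuming Next.js 15+ async params — the route, fetch target, and copy are illustrative, not rubrica.app's actual code:

```tsx
// app/check/[id]/page.tsx — the static shell prerenders; the live result streams in
import { Suspense } from 'react';

export default async function CheckPage({
  params,
}: {
  params: Promise<{ id: string }>;
}) {
  const { id } = await params;
  return (
    <main>
      <h1>Check results</h1>
      <Suspense fallback={<p>Running your check…</p>}>
        <CheckResult id={id} />
      </Suspense>
    </main>
  );
}

// Server Component: the fetch runs on the server, so API tokens never reach the client.
async function CheckResult({ id }: { id: string }) {
  const res = await fetch(`${process.env.API_URL}/checks/${id}`); // hypothetical internal API
  const { feedback } = await res.json();
  return <pre>{feedback}</pre>;
}
```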
We use TypeScript everywhere and Tailwind for styling. Boring is the point.
AI calls — the Vercel AI SDK + AI Gateway
We use the Vercel AI SDK for every model call and route them all through the Vercel AI Gateway by default. The Gateway gives us provider strings like "anthropic/claude-haiku-4-5" or "openai/gpt-5" without committing to a single provider package, plus observability, failover, and zero data retention out of the box.
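As a minimal sketch of what that looks like in practice (the route path and prompt handling are illustrative):

```ts
// app/api/feedback/route.ts — streams a model response back to the client
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { prompt } = await req.json();

  const result = streamText({
    // Plain "provider/model" strings route through the AI Gateway,
    // so swapping models is a string change, not a package change.
    model: 'anthropic/claude-haiku-4-5',
    prompt,
  });

  return result.toTextStreamResponse();
}
```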
Why not call providers directly? Two reasons. First, model markets move fast — being able to A/B route between providers without code changes is a real operational win. Second, the AI Gateway handles fallbacks, so if a provider has an outage you don't go down with it.
For anything structured (JSON output, tool calls), the AI SDK's generateObject + Zod schemas have replaced our hand-rolled parsing entirely.
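For instance, a rubric-scoring call might look like this — the schema and prompt are illustrative, not our production ones:

```ts
import { generateObject } from 'ai';
import { z } from 'zod';

const { object } = await generateObject({
  model: 'openai/gpt-5',
  schema: z.object({
    score: z.number().min(0).max(100),
    feedback: z.string(),
  }),
  prompt: `Grade this draft against the rubric:\n\n${draft}`,
});
// `object` is typed and validated — no hand-rolled JSON parsing.
```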
Compute — Fluid Compute, not edge
We default to Fluid Compute for AI routes. Edge Functions used to be tempting for low latency, but the Node compatibility trade-off never paid off for AI workloads — most useful libraries (PDF parsers, embeddings clients, tokenisers) want Node, not edge.
Fluid Compute reuses function instances across requests, which dramatically reduces cold starts on AI endpoints that load tokenisers or prompt templates. Combined with the 300-second default timeout, it handles long-running AI calls without bespoke infrastructure.
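Concretely, instance reuse means module-scope setup is paid once per warm instance rather than per request. A sketch, with `loadPromptTemplates` as a hypothetical stand-in for whatever heavy setup your route does:

```ts
// app/api/check/route.ts
// Pinning the timeout explicitly, matching the 300-second default noted above.
export const maxDuration = 300;

// Module scope runs once per function instance. Because Fluid Compute reuses
// instances across requests, this setup cost is amortised instead of being
// paid on every invocation.
const templates = loadPromptTemplates();

export async function POST(req: Request) {
  const { draftId } = await req.json();
  const template = templates.get('default')!;
  // ...run the long AI call here using `template`...
  return Response.json({ ok: true, draftId });
}

// Hypothetical stand-in for loading tokenisers, templates, or clients.
function loadPromptTemplates() {
  return new Map<string, string>([['default', 'Grade the draft against the rubric.']]);
}
```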
Storage — Blob for files, Marketplace for everything else
Uploaded files (rubrics, drafts, images for OCR) go to Vercel Blob. Public bucket for assets, private bucket for user-uploaded content. The API is small enough that we don't write an abstraction layer.
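The whole upload path is a few lines — a sketch using `@vercel/blob`'s `put`, with a pathname scheme and helper name that are ours, not the library's:

```ts
import { put } from '@vercel/blob';

// Store an uploaded rubric and return the URL we keep in Postgres.
// access: 'public' serves from an unguessable URL; user-uploaded content
// goes to the private bucket per the split described above.
export async function saveRubric(userId: string, file: File) {
  const blob = await put(`rubrics/${userId}/${file.name}`, file, {
    access: 'public',
  });
  return blob.url;
}
```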
For relational data we provision a Neon Postgres instance from the Vercel Marketplace — Marketplace integrations auto-provision the environment variables so there's no juggling DATABASE_URLs by hand. For ephemeral caches (rate limits, idempotency keys) we use Upstash Redis, also via the Marketplace.
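For example, a per-user rate limit on an AI route is a few lines with `@upstash/ratelimit` (the limit numbers are illustrative):

```ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

// Marketplace provisioning supplies the UPSTASH_REDIS_REST_* env vars.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '1 m'), // 10 checks per minute per user
});

export async function assertWithinLimit(userId: string) {
  const { success } = await ratelimit.limit(`check:${userId}`);
  if (!success) throw new Error('Rate limit exceeded');
}
```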
Vercel Postgres and Vercel KV are gone — don't reach for them from muscle memory. The Marketplace integrations are what you want now.
Payments — Stripe in credits mode
For AI products where usage is event-driven ("run a check", "generate an image"), pay-per-check credit balances are a better fit than subscriptions. Stripe handles the payment side; our app stores a credit ledger.
The stack: Stripe Checkout for top-ups, a webhook to credit the user's ledger, and a Server Action on every AI route that debits before the call and refunds on failure. Don't over-engineer it.
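The debit-before-call wrapper is the only piece with any subtlety. A sketch as a Server Action, where `debitCredits` and `refundCredits` are hypothetical ledger helpers, not a library API:

```ts
'use server';

import { generateText } from 'ai';
// Hypothetical helpers backed by the Postgres credit ledger.
import { debitCredits, refundCredits } from '@/lib/ledger';

// In production, derive userId from the session, never from client arguments.
export async function runCheck(userId: string, draft: string) {
  // Debit first, so a user can never run a check they can't pay for.
  await debitCredits(userId, 1);

  try {
    const { text } = await generateText({
      model: 'anthropic/claude-haiku-4-5',
      prompt: `Give rubric feedback on this draft:\n\n${draft}`,
    });
    return text;
  } catch (err) {
    // Refund on failure, so provider outages don't eat paid credits.
    await refundCredits(userId, 1);
    throw err;
  }
}
```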
Observability — the boring stuff that saves you
Three things we wire up on day one for every AI SaaS:
• Vercel Analytics + Speed Insights for the marketing-site side.
• A model-call log table in Postgres (request id, model, tokens, latency, success). Every AI route writes one row — see the helper sketched below. This is your eval data later.
• Sentry for errors, with the AI route name and model id tagged on every event.
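Writing that row is deliberately dumb. A sketch with the Neon serverless driver — the table shape and helper name are ours:

```ts
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL!);

// Call this after every model call, success or failure.
export async function logModelCall(row: {
  requestId: string;
  model: string;
  tokens: number;
  latencyMs: number;
  ok: boolean;
}) {
  await sql`
    INSERT INTO model_calls (request_id, model, tokens, latency_ms, ok)
    VALUES (${row.requestId}, ${row.model}, ${row.tokens}, ${row.latencyMs}, ${row.ok})
  `;
}
```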
Nothing exotic. The trick is having it from day one, not bolting it on at month six.
What we deliberately leave out
Things you don't need on day one:
• A bespoke queue. Fluid Compute handles long requests; you only need queues when you have minutes-long jobs or fan-out. If you do, reach for Vercel Queues or Vercel Workflow.
• An ORM. The Postgres driver and hand-written SQL beat a heavy ORM for AI apps where most queries are simple.
• A monorepo. One repo per product. You can split later if you must.
• A custom auth layer. Clerk via the Vercel Marketplace handles 95% of cases and provisions env vars automatically.
The goal is to ship the product, not the platform. This stack is what got rubrica.app from idea to live SaaS — and it's what we reach for first when a client asks us to ship theirs.
Frequently asked questions
Why not just use OpenAI's API directly?
Calling providers directly works for a prototype, but it locks you into one provider and gives you no observability. The Vercel AI SDK + AI Gateway lets you swap models without code changes and gives you logs, failover, and zero data retention by default.
Is this stack overkill for a small AI SaaS MVP?
No — every component scales down to zero or near-zero cost on Vercel. The benefit of starting on this stack is that you don't have to rewrite when traffic actually arrives.
Can Kinexapps build my AI SaaS on this stack?
Yes — this is our default stack for AI SaaS work. We've shipped rubrica.app on it end-to-end and can do the same for your product. Get a free quote via the contact page.
Get a Free Quote — AI SaaS on Vercel