# From Vibe-Coded MVP to Production-Grade SaaS in 6 Weeks: A Case Study
A founder shipped an MVP with Cursor in 2 weeks. It got traction. Then it started breaking. Here's how we rescued it and made it production-grade.
A founder shipped a SaaS MVP with Cursor in 2 weeks. The product was a project management tool for construction companies — think Asana, but with industry-specific features like permit tracking, subcontractor management, and daily site reports.
He posted it on LinkedIn. Three construction companies signed up that week. Within a month, he had 47 paying users and $4,200 MRR.
Then it started breaking.
## The "before" snapshot
When he came to us, the codebase was 100% AI-generated. Cursor + Next.js 14 + Supabase + Stripe. Here's what we found during the initial audit.
### Authentication: Leaking across tenants
The RLS (Row Level Security) policies were checking auth.uid() but not the organization. User A from Company X could see User B's data from Company Y by modifying the API request. This is the same cross-tenant data exposure bug we described in our codebase audit post, and it shows up in almost every AI-generated multi-tenant app.
In this case, 47 paying companies were sharing a database with no tenant isolation. Any user who knew how to open browser dev tools could access any other company's project data, financials, and subcontractor information.
### Error handling: None
The entire application had zero try-catch blocks. Zero error boundaries. Zero fallback UI. When the Supabase connection timed out (which happened during peak hours because of the connection pooling issue below), users saw a raw Next.js error page with a stack trace.
In production. With paying customers.
### Webhook handling: Race conditions
The Stripe webhook endpoint had no signature verification and no idempotency handling. Stripe's retry mechanism was firing duplicate events, causing some users to be charged twice and others to have their subscription status flip between "active" and "canceled" multiple times per hour.
The founder was manually fixing subscription statuses in the Supabase dashboard every morning. He'd been doing this for three weeks.
### N+1 queries: 8-second page loads
The project dashboard — the page every user hits first — was making 47 individual database queries to render a single page: one query for the project list, then one query per project for the latest activity, then one query per project for the team members.
The pattern is 1 + 2N queries for N projects, so even a user with only 15 projects triggers 31 queries to render one dashboard. Page load: 8.2 seconds. On mobile: timeout.
```typescript
// What Cursor generated — classic N+1
export async function getProjects(userId: string) {
  const { data: projects } = await supabase
    .from("projects")
    .select("*")
    .eq("user_id", userId);

  // N+1: one query per project for activity
  for (const project of projects) {
    const { data: activity } = await supabase
      .from("activity_logs")
      .select("*")
      .eq("project_id", project.id)
      .order("created_at", { ascending: false })
      .limit(5);
    project.recentActivity = activity;
  }

  // N+1 again: one query per project for members
  for (const project of projects) {
    const { data: members } = await supabase
      .from("project_members")
      .select("*, users(*)")
      .eq("project_id", project.id);
    project.members = members;
  }

  return projects;
}
```
### Connection pooling: Not configured
Supabase's default connection limit is 60 for the Pro plan. The app was opening a new connection for every request and never closing them. During peak hours (7-9 AM when construction teams start their day), connections exhausted within minutes. The entire app went down for all users.
This happened 3-4 times per week. The founder would restart the Supabase project from the dashboard, which would kill all connections and temporarily fix it — until the next morning.
### Monitoring: Zero visibility
No error tracking. No performance monitoring. No uptime monitoring. No alerting. The founder learned about outages from angry customer emails. Sometimes hours after the outage started.
## The numbers
| Metric | Before |
|---|---|
| Page load (dashboard) | 8.2 seconds |
| Errors per day | ~120 |
| Uptime (30-day) | ~94% |
| Connection pool exhaustion | 3-4x/week |
| Webhook failures | ~20% of events |
| Time to detect outages | 1-3 hours |
| Security vulnerabilities | 7 critical |
## Week 1-2: Security audit and critical fixes
We don't start with refactoring. We start with the things that can destroy the business overnight.
### Day 1-2: Threat assessment
We mapped every API endpoint, every database table, and every RLS policy. The full threat assessment found 7 critical vulnerabilities:
- Cross-tenant data access (no org-level RLS)
- Stripe webhook endpoint accepting unverified requests
- Supabase service role key exposed in client-side code
- No input validation on any API route
- File upload endpoint accepting any file type with no size limit
- Password reset flow with no rate limiting
- Admin API routes with no role-based access control
### Day 3-5: Auth and tenant isolation
We rewrote every RLS policy with organization-level checks:
```sql
-- Before: user-level only (broken for multi-tenant)
CREATE POLICY "select_projects" ON projects
  FOR SELECT USING (auth.uid() = created_by);

-- After: organization-level isolation
CREATE POLICY "select_projects" ON projects
  FOR SELECT USING (
    EXISTS (
      SELECT 1 FROM org_members
      WHERE org_members.user_id = auth.uid()
        AND org_members.org_id = projects.org_id
    )
  );
```
Every table got the same treatment. We also added a test suite that attempts cross-tenant access for every RLS policy — if any test passes, the CI pipeline fails.
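The intent of that policy is easy to mirror in plain TypeScript, and it's also the shape of the cross-tenant tests: a sketch for illustration only, with hypothetical types (`OrgMember`, `Project`) rather than names from the actual codebase.

```typescript
// Illustrative sketch of the org-level check the RLS policy enforces.
// Types and names here are hypothetical, not from the actual codebase.
interface OrgMember {
  userId: string;
  orgId: string;
}

interface Project {
  id: string;
  orgId: string;
}

// Mirrors: EXISTS (SELECT 1 FROM org_members
//                  WHERE user_id = auth.uid() AND org_id = projects.org_id)
function canSelectProject(
  userId: string,
  project: Project,
  orgMembers: OrgMember[]
): boolean {
  return orgMembers.some(
    (m) => m.userId === userId && m.orgId === project.orgId
  );
}
```

A cross-tenant test then asserts the negative case: a user who belongs only to Company X must get nothing back when querying Company Y's projects.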
### Day 6-8: Webhook security
We implemented proper Stripe webhook handling with signature verification, idempotency, and event ordering. The same pattern from our auth loops post: verify the signature, check for duplicate events, process idempotently.
We also backfilled subscription statuses from Stripe's source of truth. 6 users had incorrect subscription states in the database. All 6 were paying customers who had been silently downgraded to the free tier by duplicate webhook events.
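The idempotency half of that pattern can be sketched without the Stripe SDK. In the real handler the event object comes from Stripe's `stripe.webhooks.constructEvent()`, which also verifies the signature; the sketch below models only the duplicate suppression, with hypothetical names (`handleEvent`, `processedEventIds`):

```typescript
// Sketch of idempotent webhook processing. Stripe event IDs are stable
// across retries, so a processed-events record is enough to deduplicate.
interface WebhookEvent {
  id: string;
  type: string;
}

// Stand-in for a processed_events table; a real app persists this in the DB
// so deduplication survives restarts and works across server instances.
const processedEventIds = new Set<string>();

function handleEvent(
  event: WebhookEvent,
  apply: (event: WebhookEvent) => void
): "processed" | "duplicate" {
  if (processedEventIds.has(event.id)) {
    // Stripe retried delivery; the event was already applied once
    return "duplicate";
  }
  apply(event);
  processedEventIds.add(event.id);
  return "processed";
}
```

With this in place, the same event delivered twice mutates state only once, so retries can no longer double-charge a customer or flip a subscription between "active" and "canceled".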
### Day 9-10: Secret rotation and input validation
We moved the Supabase service role key to server-side only, added the server-only import guard, and rotated every credential. Then we added Zod validation to every API endpoint and form submission.
```typescript
import { z } from "zod";

const createProjectSchema = z.object({
  name: z.string().min(1).max(100),
  description: z.string().max(2000).optional(),
  startDate: z.string().datetime(),
  endDate: z.string().datetime().optional(),
  budget: z.number().positive().max(100_000_000).optional(),
  type: z.enum(["residential", "commercial", "infrastructure"]),
});
```
Every endpoint went from "trust whatever the client sends" to "validate everything, reject anything unexpected."
End of Week 2 status: All 7 critical vulnerabilities closed. Zero cross-tenant data access possible. Webhooks processing correctly. Secrets secured.
## Week 3-4: Backend refactor
With the security fires out, we moved to performance and reliability.
### Connection pooling
We configured Supabase's built-in connection pooler (PgBouncer) in transaction mode and updated the application to use the pooled connection string:
```typescript
// Before: direct connection (exhausts pool)
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// After: pooled connection via PgBouncer
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
  {
    db: {
      schema: "public",
    },
    auth: {
      persistSession: false,
    },
  }
);
```
On the Supabase side, we configured the pooler for the connection string used in server-side code and set the pool size to match the expected concurrent users. Connection exhaustion stopped immediately.
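As an illustration of what actually changes: Supabase exposes a separate pooled connection string (PgBouncer listens on port 6543, direct Postgres on 5432). The helper below is our own sketch of that substitution, not a Supabase API; real projects should copy the pooled string straight from the dashboard.

```typescript
// Illustrative only: derive a pooled (PgBouncer) connection string from
// the direct one by swapping the Postgres port for the pooler port.
const DIRECT_PORT = "5432";
const POOLER_PORT = "6543";

function toPooledConnectionString(directUrl: string): string {
  const url = new URL(directUrl);
  if (url.port === DIRECT_PORT) {
    url.port = POOLER_PORT;
  }
  return url.toString();
}
```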
### Killing the N+1 queries
We replaced the loop-based data fetching with a single query using Supabase's nested select:
```typescript
// After: single query with joins
export async function getProjects(orgId: string) {
  const { data, error } = await supabase
    .from("projects")
    .select(`
      *,
      activity_logs (
        id, action, created_at, user:users(name)
      ),
      project_members (
        user:users(id, name, avatar_url),
        role
      )
    `)
    .eq("org_id", orgId)
    .order("updated_at", { ascending: false })
    .limit(20);

  // DatabaseError is the app's custom error class, caught by the route's error boundary
  if (error) throw new DatabaseError("Failed to fetch projects", error);
  return data;
}
```
One query instead of 31. We also added database indexes on the columns used in WHERE clauses and ORDER BY:
```sql
CREATE INDEX idx_projects_org_updated ON projects (org_id, updated_at DESC);
CREATE INDEX idx_activity_logs_project ON activity_logs (project_id, created_at DESC);
CREATE INDEX idx_project_members_project ON project_members (project_id);
```
### Response caching
For data that doesn't change frequently (project metadata, team members), we added a caching layer using Next.js unstable_cache with revalidation:
```typescript
import { unstable_cache } from "next/cache";

export const getCachedProjects = unstable_cache(
  async (orgId: string) => getProjects(orgId),
  ["projects"],
  {
    revalidate: 60, // Revalidate every 60 seconds
    tags: ["projects"],
  }
);
```
When a project is updated, we invalidate the cache tag:
```typescript
import { revalidateTag } from "next/cache";

export async function updateProject(projectId: string, data: ProjectUpdate) {
  const { error } = await supabase.from("projects").update(data).eq("id", projectId);
  if (error) throw new DatabaseError("Failed to update project", error);
  revalidateTag("projects");
}
```
### Error boundaries and handling
We added error boundaries at every route segment, a global error handler, and try-catch blocks around every database and external API call:
```tsx
// app/dashboard/error.tsx
"use client";

export default function DashboardError({
  error,
  reset,
}: {
  error: Error & { digest?: string };
  reset: () => void;
}) {
  return (
    <div className="flex flex-col items-center justify-center p-8">
      <h2 className="text-xl font-semibold">Something went wrong</h2>
      <p className="mt-2 text-muted">
        We've been notified and are looking into it.
      </p>
      <button
        onClick={reset}
        className="mt-4 rounded-lg bg-primary px-4 py-2 text-white"
      >
        Try again
      </button>
    </div>
  );
}
```
Every external call got a wrapper with retry logic, timeout, and structured error logging:
```typescript
async function withRetry<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  options: { retries: number; timeout: number; label: string }
): Promise<T> {
  for (let attempt = 1; attempt <= options.retries; attempt++) {
    // Abort the attempt if it runs past the timeout. fn must forward the
    // signal to fetch (or whatever it calls) for the abort to take effect.
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), options.timeout);
    try {
      return await fn(controller.signal);
    } catch (error) {
      if (attempt === options.retries) {
        logger.error(`${options.label} failed after ${options.retries} attempts`, {
          error,
        });
        throw error;
      }
      // Exponential backoff: 2s, 4s, 8s, ...
      await new Promise((r) => setTimeout(r, 1000 * Math.pow(2, attempt)));
    } finally {
      clearTimeout(timeoutId);
    }
  }
  throw new Error("Unreachable");
}
```
End of Week 4 status: Dashboard loads in under 500ms. No connection pool exhaustion. Proper error handling everywhere. Caching layer active.
## Week 5-6: Monitoring, CI/CD, and hardening
### Monitoring and alerting
We set up a complete observability stack:
- Error tracking with Sentry — every uncaught error gets captured with context, user info, and breadcrumbs.
- Uptime monitoring with BetterStack — checks every 30 seconds, alerts via Slack and SMS within 60 seconds of downtime.
- Performance monitoring — Core Web Vitals tracked per page, with alerts when LCP exceeds 2.5 seconds.
- Custom metrics — webhook processing time, database query duration, API response times. All dashboarded.
The founder went from "I find out about outages from customer emails" to "I get a Slack alert within 60 seconds with the root cause."
### CI/CD pipeline
Before: deployment was git push to a Vercel branch. No checks. No tests. No gates.
After:
- Type checking — `tsc --noEmit` catches type errors before deployment
- Linting — Biome enforces code quality standards
- Security tests — Cross-tenant access tests run on every PR
- Preview deployments — Every PR gets a preview URL for manual testing
- Production deployment — Only from the main branch, only after all checks pass
```yaml
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm check-types
      - run: pnpm lint
      - run: pnpm test:security
      - run: pnpm build
```
### Load testing
We ran load tests simulating 200 concurrent users during peak morning hours. Before our changes, the app fell over at 30 concurrent users. After:
- 200 concurrent users: all requests under 800ms
- 500 concurrent users: p95 at 1.2 seconds, no errors
- 1000 concurrent users: p95 at 2.8 seconds, 0.1% error rate
The app could now handle 10x its current user base without any infrastructure changes.
### Documentation and handoff
We documented everything:
- Architecture decision records (ADRs) for every major technical choice
- Runbook for common operational tasks (restart, rollback, scale)
- Database schema documentation with relationship diagrams
- API endpoint documentation with request/response examples
The founder could now onboard a developer who would understand the system without reverse-engineering it.
## The "after" numbers
| Metric | Before | After | Change |
|---|---|---|---|
| Page load (dashboard) | 8.2s | 380ms | 95.4% faster |
| Errors per day | ~120 | 2-3 | 97.5% reduction |
| Uptime (30-day) | ~94% | 99.95% | Production-grade |
| Connection pool exhaustion | 3-4x/week | 0 | Eliminated |
| Webhook success rate | ~80% | 99.9% | Reliable |
| Time to detect outages | 1-3 hours | < 60 seconds | Real-time |
| Security vulnerabilities | 7 critical | 0 | Closed |
| Concurrent user capacity | ~30 | 500+ | 16x increase |
## What we learned (again)
This project reinforced patterns we see in every vibe-coded rescue:
### AI tools are excellent at scaffolding
Cursor generated a working MVP in 2 weeks. The UI was clean, the feature set was right, the product-market fit was validated quickly. That's genuinely valuable. The founder would not have gotten 47 paying customers without the speed that Cursor enabled.
### AI tools are terrible at production engineering
Auth, security, performance, error handling, monitoring, connection management, webhook idempotency — these are all patterns that require understanding why they exist, not just how to implement them. AI tools generate the "how" without the "why," which means they skip it entirely when it's not explicitly requested.
### The gap is predictable
Every vibe-coded app we audit has the same categories of issues: auth/security, performance (N+1 queries, missing indexes, no caching), error handling (none), and operational readiness (no monitoring, no CI/CD, no documentation). The specifics vary. The categories don't.
### The rescue window is narrow
This founder came to us at $4,200 MRR with 47 users. If he'd waited another 3 months — more users, more data, more compounding technical debt — the rescue would have been twice as expensive and taken twice as long. The best time to fix a vibe-coded MVP is right after it gets traction.
## The takeaway
Vibe coding is not the problem. Stopping at the vibe code is.
If your AI-generated MVP has traction, you're sitting on a validated product with an unvalidated foundation. The product-market fit is real. The code isn't ready for what comes next. The longer you wait, the more expensive the rescue becomes.
The founder in this case study went from "my app breaks every morning" to "I haven't thought about infrastructure in two months." His MRR grew from $4,200 to $11,800 in the three months after the rescue — because he could finally focus on the product instead of firefighting.
That's the ROI of production-grade engineering: it's not a cost, it's what lets you grow.
## Ready to get started?
In the same situation? We take your vibe-coded MVP and make it production-grade in 4-6 weeks.