We Audited 3 AI-Generated Codebases. Here's What Broke.
We reviewed 3 backends built entirely with Cursor, Bolt, and Lovable. All 3 had critical security vulnerabilities. Here's the teardown.
These aren't edge cases. These are real codebases from real SaaS founders who shipped to real users. They came to us after things started breaking — or worse, after they realized things could break.
Here's what we found.
Codebase 1: Cursor + Next.js + Supabase
The founder: B2B SaaS, 400 users, $80K ARR. Built the entire backend with Cursor over 3 weeks.
The problem: Auth bypass via broken RLS
Cursor generated Supabase Row Level Security (RLS) policies for every table. On paper, they looked correct. In practice, they had a critical gap: the policies checked auth.uid() but didn't verify the user's organization. Any authenticated user could read any other organization's data by changing the org ID in the request.
-- What Cursor generated (looks fine)
CREATE POLICY "Users can read own data"
  ON public.invoices
  FOR SELECT
  USING (auth.uid() = user_id);

-- What was missing (org-level isolation)
CREATE POLICY "Users can read org data"
  ON public.invoices
  FOR SELECT
  USING (
    auth.uid() IN (
      SELECT user_id FROM org_members
      WHERE org_id = invoices.org_id
    )
  );
Impact: Full cross-tenant data exposure. Any user could access any organization's invoices, customers, and financial data.
Fix time: 4 hours. The RLS policies needed to be rewritten with proper organization-level checks. But finding this required understanding multi-tenant architecture — something AI code generators consistently miss.
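One way to catch this before shipping is to exercise the policy directly in SQL. A sketch, assuming Supabase's standard setup where auth.uid() reads the sub claim from request.jwt.claims; the UUIDs are placeholders:

```sql
begin;
-- Impersonate an authenticated user who belongs to org A
set local role authenticated;
set local request.jwt.claims to '{"sub": "00000000-0000-0000-0000-0000000000aa"}';

-- Try to read org B's invoices. With correct org-level RLS,
-- this should return zero rows.
select * from public.invoices
where org_id = '00000000-0000-0000-0000-0000000000bb';
rollback;
```

If rows come back for an org the user doesn't belong to, the policy has the same gap this codebase did.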
What this would have cost in a breach: $50K+ in legal costs, regulatory fines, and customer churn. The audit cost $1,299.
Codebase 2: Bolt.new + Express + MongoDB
The founder: HR Tech SaaS, pre-launch. Built with Bolt.new in a weekend hackathon and was about to go live.
The problem: Zero input validation + hardcoded secrets
Bolt.new generated a functional Express API. Every endpoint worked in Postman. But not a single endpoint validated input.
// What Bolt.new generated
app.post('/api/employees', async (req, res) => {
  const employee = await Employee.create(req.body);
  res.json(employee);
});

// What should exist
app.post('/api/employees', async (req, res) => {
  const validated = employeeSchema.parse(req.body);
  const employee = await Employee.create(validated);
  res.json(employee);
});
Sending { "role": "admin", "__v": 0, "salary": 999999 } as the body worked. Mass assignment vulnerability. NoSQL injection was also possible through unvalidated query parameters.
But the worst part: the MongoDB connection string (with full admin credentials) was hardcoded in the frontend .env file — and the .env file was committed to a public GitHub repo.
Fix time: 12 hours. Added Zod validation to every endpoint, moved secrets to server-side environment variables, added .env to .gitignore, rotated all credentials.
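The employeeSchema.parse() call is doing the heavy lifting here. A dependency-free sketch of the same allowlist idea (the field names are hypothetical; in practice a library like Zod expresses this declaratively):

```javascript
// Only fields named here survive. Anything else in the request body --
// "role", "salary", "__v" -- is never copied, so mass assignment fails.
const ALLOWED_FIELDS = { name: 'string', email: 'string', department: 'string' };

function parseEmployee(body) {
  const out = {};
  for (const [key, type] of Object.entries(ALLOWED_FIELDS)) {
    if (typeof body[key] !== type) {
      throw new Error(`Invalid or missing field: ${key}`);
    }
    out[key] = body[key];
  }
  return out;
}
```

The point is the direction of trust: instead of removing known-bad fields, you copy known-good ones and reject the rest.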
Why AI tools get this wrong
Most AI code generators optimize for "does it work?" — not "is it secure?" They generate the happy path. Input validation, error handling, and secret management are defensive code that only matters when something goes wrong. AI tools almost never generate it unprompted.
Codebase 3: Lovable + Supabase + Stripe
The founder: Subscription SaaS, 200 paying users. Built with Lovable, in production for 3 months.
The problem: Stripe webhook verification missing
The Stripe webhook endpoint accepted any POST request without verifying the signature. This means anyone who knew the endpoint URL could fake payment confirmations.
// What Lovable generated
app.post('/api/webhooks/stripe', async (req, res) => {
  const event = req.body;
  // Process payment...
  await updateSubscription(event.data.object);
  res.json({ received: true });
});

// What should exist
// Note: constructEvent needs the raw request body, so this route must use
// express.raw({ type: 'application/json' }) rather than express.json().
app.post('/api/webhooks/stripe', async (req, res) => {
  const sig = req.headers['stripe-signature'];
  const event = stripe.webhooks.constructEvent(
    req.body,
    sig,
    process.env.STRIPE_WEBHOOK_SECRET
  );
  await updateSubscription(event.data.object);
  res.json({ received: true });
});
Even worse: the subscription upgrade/downgrade logic was implemented client-side. A user could modify their subscription tier by changing a JavaScript variable in the browser console.
Impact: The founder had been "in production" for 3 months without knowing that anyone could give themselves a free premium subscription.
Fix time: 8 hours. Added webhook signature verification, moved all subscription logic to server-side, added Stripe event idempotency handling.
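The idempotency piece matters because Stripe retries webhook deliveries, so the same event can arrive more than once. A minimal sketch of the guard (in-memory here for illustration; in production you'd persist processed event IDs, e.g. in a table with a unique constraint):

```javascript
// Track Stripe event IDs we've already handled so retried deliveries
// don't apply the same subscription change twice.
const processedEvents = new Set();

function handleStripeEvent(event) {
  if (processedEvents.has(event.id)) {
    // Already handled: acknowledge without reprocessing.
    return { received: true, duplicate: true };
  }
  processedEvents.add(event.id);
  // ...apply the subscription change for event.data.object here...
  return { received: true, duplicate: false };
}
```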
The pattern: What AI code generators consistently get wrong
After 15+ audits, we see the same 5 categories of failure:
1. Auth and authorization
AI tools generate auth flows that work in the demo. They almost never handle edge cases: token rotation, session invalidation, multi-tenant isolation, role-based access control at the database level. RLS policies look correct but have logic gaps.
2. Input validation
AI-generated code trusts all input. No Zod schemas. No server-side validation. The assumption is that the frontend will only send valid data — which is true until someone opens the browser console.
3. Secret management
API keys in .env files that are committed to Git. Service role keys in client-side code. Secrets passed as query parameters. AI tools don't understand the security boundary between server and client.
4. Webhook verification
Every Stripe, Clerk, and Supabase webhook endpoint we've audited from AI-generated code was missing signature verification. The AI generates the endpoint but skips the security layer.
5. Error handling
AI code handles the happy path. When something goes wrong — a database timeout, an API rate limit, a malformed response — the error either crashes the server or gets silently swallowed. No retries, no fallbacks, no alerting.
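What "not the happy path" looks like in practice: a small retry helper that wraps a flaky call with exponential backoff instead of letting one transient failure take down the request. A sketch, not a drop-in fix; the attempt counts and delays are illustrative:

```javascript
// Retry an async call with exponential backoff (100ms, 200ms, 400ms...).
// After the final attempt, surface the error instead of swallowing it.
async function withRetry(fn, { attempts = 3, baseMs = 100 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Retries are only one of the three missing pieces; fallbacks and alerting still need to exist at the call sites that use this.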
The 45% stat is real
Research from Stanford University found that developers using AI assistants produced significantly less secure code than those who didn't — and were more confident that their code was secure. The false confidence is the real danger.
This doesn't mean AI tools are bad. We use Claude Code, Cursor, and Copilot every day. The difference: we have 7+ years of backend engineering telling us when the output is wrong.
What you should do right now
If your backend was built primarily with AI tools, ask yourself:
- Can you query another user's data from the browser? Test it. Right now.
- Are your Supabase/database credentials in client-side code? Check your .env files and your Git history.
- Do your webhook endpoints verify signatures? If you're not calling stripe.webhooks.constructEvent(), they don't.
- Is every API endpoint validating input? If you're not using Zod (or equivalent), it's probably not.
- What happens when your API returns an error? Does your app handle it gracefully, or does it show a stack trace?
If the answer to any of these is "I'm not sure" — that's the answer.
Ready to get started?
Worried about your vibe-coded backend? We check everything AI generated and give you a prioritized fix list.