Pillar guide · 16 min read

The Future of Tech DD: Investing in a Vibe-Coded World

Why Tech DD is being rewritten for an AI-built software economy. How investor diligence will change over the next 3–5 years as vibe-coded and traditional apps converge — what founders get wrong about AI builders, and where their data actually ends up.

Venture Capital · Corporate Development · Strategic Buyer

Written by Hutton Henry

Founder, Beyond M&A · Creator, Lens

Last reviewed 20 May 2026

Executive summary

Within three years, most net-new SaaS in the lower mid-market will be majority AI-generated. That doesn't kill Tech DD; it inverts it. Code quality stops being the headline risk; provenance, data lineage, and operator dependency become the new red flags. Investors who keep running 2019-era Tech DD on 2027-era targets will systematically overpay for fragile assets and pass on durable ones. This is our forward view of how diligence, valuation, and post-close integration change, and of the three things founders consistently get wrong about where their data is going.

01 By 2027, AI-assisted code will be the default, not the exception. Diligence reframes from 'is the code good?' to 'is the system intelligible and defensible?'
02 Data lineage replaces code review as the single most important Tech DD workstream. Where customer data sits, who can see it, and which third-party LLMs have processed it become first-order valuation inputs.
03 Founders consistently misunderstand three things: what the AI builder retains, what leaks into the client bundle, and what their hosting provider is actually contractually allowed to do with their database.
04 Operator dependency, not code quality, is the dominant risk. A clean codebase nobody understands is worth less than a messy one with a maintainer.
05 Traditional and vibe-coded apps will converge, not because vibe-coded apps get more rigorous, but because traditional shops adopt the same agents. The diligence framework has to work for both.

Tech DD as practised today is a product of the 2010s. It assumes a target with a small engineering team, a Git history that reflects human decisions, a codebase that can be read top-to-bottom in a week, and a security posture that is mostly a function of how disciplined that team was. The questions on the checklist — language choice, test coverage, CI maturity, on-call rotation — all presuppose that humans wrote the software and humans maintain it.

That world is ending faster than most investment committees realise. The interesting question is not whether vibe coding is "real engineering" (an argument that has already been lost on both sides) but what Tech DD looks like when half the deal flow is software that no human fully authored.

Why this matters for investors, not just engineers

There are three structural shifts happening at once, and they compound:

Cost of shipping has collapsed. A solo founder with a Lovable or Cursor subscription can now produce, in a weekend, what required a seed-funded team of four in 2021. This means more targets, faster, at smaller cheque sizes — and a long tail of "businesses" that are really validated prototypes wearing SaaS clothing.
Quality variance has exploded. The best AI-assisted teams are shipping faster and with fewer defects than their 2022 selves. The worst are shipping production systems with credentials in the client bundle and no Row-Level Security. The gap between P10 and P90 quality in any given vintage is now wider than at any point in the last twenty years.
The traditional shops are converging downward, or upward, depending on your view. By the end of 2026 it will be vanishingly rare to find a 50-person engineering team that isn't using Copilot, Cursor, or Claude Code for the majority of net-new code. The line between "vibe-coded startup" and "real engineering org" blurs every quarter. Diligence frameworks that depend on that distinction will stop working.

The investor takeaway: you cannot reliably price these assets with a 2019 Tech DD template. You will either reject good deals because the codebase "looks AI-generated" or, more dangerously, pay SaaS multiples for what is really a UI on top of someone else's infrastructure.

What Tech DD looks like in 2027

Our working forecast — informed by the 80+ targets we have assessed in the last 18 months where AI tooling was material — is that the diligence stack reorganises around four questions, in this order:

1. Data lineage — the new code review

Where does customer data physically sit? Which third parties (LLM providers, vector DB hosts, observability vendors, AI builder platforms) have processed it, and under what terms? Can the target produce a data flow diagram that matches reality?

This becomes the single highest-leverage workstream because it is the area where vibe-coded targets are most likely to have created liabilities they don't know exist. A founder who prompted "add Stripe checkout" may not realise the AI agent also wired up an analytics SDK that pipes every page view, including PII in URL params, to a US-based vendor whose DPA they never signed.
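To make the URL-parameter leak concrete, here is a minimal Python sketch of the kind of triage check a reviewer might run over sampled request URLs. The parameter-name list and the loose email pattern are illustrative assumptions, not a complete PII detector; anything it flags still needs a human to confirm where that URL is being sent.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Deliberately loose: for triage, a false positive is cheaper than a miss.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
# Hypothetical list of parameter names that often carry personal data.
SUSPECT_PARAM_NAMES = {"email", "name", "phone", "dob", "address"}


def pii_params(url: str) -> list[str]:
    """Return the query-parameter names in `url` that look like they carry PII."""
    flagged = []
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        # Flag on a suspicious name, or on a value that looks like an email.
        if key.lower() in SUSPECT_PARAM_NAMES or EMAIL_RE.search(value):
            flagged.append(key)
    return flagged
```

Any URL that returns a non-empty list is one whose full address is probably also sitting in the analytics vendor's and the deployment platform's logs.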

2. Provenance and intelligibility

Not "is the code clean?" but "can a new engineer load the system into their head in a week?" The artefacts that matter: the prompt log (or its absence), the architectural decision record, the migration history, the deployment runbook. A 50,000-line codebase that a competent mid-level engineer can fully grasp in five days is more valuable than a 10,000-line codebase that requires the founder's continual interpretation.

3. Operator dependency and the human factor

This replaces "key-person risk" as the dominant people question. The framing shifts from "what happens if the CTO leaves?" to "is there anyone in the organisation who can debug a production incident at 2am without re-prompting the AI?" Many vibe-coded targets answer no. That is not automatically disqualifying, but it has to be priced.

4. Defensibility of the system, not the code

In a world where any competitor can re-generate functionally equivalent code in a weekend, what makes this product defensible? Data network effects, distribution, brand, regulatory moats, integration depth — none of which appear on a code-quality scorecard. Tech DD has to start asking commercial questions it has historically left to commercial DD.

The classic workstreams — security, scalability, IP, open-source compliance — do not go away. They become table stakes, the floor not the ceiling.

What founders are getting wrong about vibe coding

We see the same three misconceptions repeatedly, and each one creates a specific diligence finding that lands in our reports.

Misconception 1: "The AI doesn't keep my code or data"

Most AI builder platforms have a free tier or hobbyist plan whose terms of service explicitly reserve the right to train on, retain, or analyse user inputs and outputs. The paid tiers usually carve this out — but the carve-out is conditional on which features are enabled (preview sharing, public projects, telemetry), and founders routinely sign up on the free plan, ship a real product on it, and only upgrade after the data has already been ingested.

The diligence question is not "do you use Lovable / Cursor / v0?" It is "which plan, since when, and what was your project's visibility setting before the upgrade?"
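The "which plan, since when, what visibility" question reduces to an interval calculation once the plan history is written down. The sketch below is an illustration only: the plan names, the `PROTECTED_PLANS` set and the `PlanPeriod` structure are hypothetical, and the real carve-outs must be read from the builder's terms as they stood on each date.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical: plans whose terms exclude training on or retention of inputs.
PROTECTED_PLANS = frozenset({"pro", "team", "enterprise"})


@dataclass(frozen=True)
class PlanPeriod:
    start: date
    end: date             # exclusive; use the diligence date for the current period
    plan: str             # plan name as shown on the invoice
    project_public: bool  # was the project or preview publicly visible?


def exposed_days(history: list[PlanPeriod]) -> int:
    """Count days on an unprotected plan or with a publicly visible project."""
    return sum(
        (p.end - p.start).days
        for p in history
        if p.plan not in PROTECTED_PLANS or p.project_public
    )
```

A non-zero result does not prove data was ingested; it marks the window the seller has to explain.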

Misconception 2: "My secrets are safe because they're in environment variables"

This is the single most common Sev-1 finding on vibe-coded targets. The founder believes secrets are server-side because the AI agent said it put them in the .env file. In reality, any variable prefixed VITE_ (Vite), NEXT_PUBLIC_ (Next.js), or REACT_APP_ (Create React App) is inlined into the JavaScript bundle wherever client code references it, and that bundle is shipped to every visitor's browser. We routinely find Stripe live keys, OpenAI API keys, Supabase service-role keys, SendGrid keys, and admin webhook secrets in the public JavaScript bundle of targets pitching at 6–8x ARR.

The AI didn't lie — it put the variable where the founder asked. The founder didn't know which prefix means "public". Nobody reviewed it. The first time anyone checks is during diligence, and by then those keys have been live and indexable on the public internet for months.
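The prefix mistake can be caught before anything is deployed. As a rough sketch, the Python below reads the text of a .env file and lists variables that carry a browser-exposed prefix and also have a name that suggests a credential. The `SECRET_HINTS` pattern is a hypothetical heuristic, so expect false positives (some API keys are meant to be public) and treat the output as a review list rather than a verdict.

```python
import re

# Prefixes that Vite, Next.js and Create React App expose to browser code.
PUBLIC_PREFIXES = ("VITE_", "NEXT_PUBLIC_", "REACT_APP_")
# Hypothetical heuristic: name fragments that suggest a server-side credential.
SECRET_HINTS = re.compile(r"SECRET|SERVICE_ROLE|PRIVATE|PASSWORD|API_KEY|WEBHOOK", re.I)


def risky_env_names(env_text: str) -> list[str]:
    """List variables that are both browser-exposed and secret-looking."""
    risky = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name.startswith(PUBLIC_PREFIXES) and SECRET_HINTS.search(name):
            risky.append(name)
    return risky
```

The check reads names only, never values, so it is safe to run against a seller's redacted .env template.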

Misconception 3: "My data is in my database, which is mine"

Most vibe-coded apps sit on a managed BaaS — Supabase, Firebase, Neon, PlanetScale, or a hosted Postgres. The customer relationship, the SLA, the data residency, the backup retention, and the ability to actually export the data in a portable form all depend on the BaaS terms, not on the founder's intuition that "it's my database."

Three sub-issues we flag repeatedly:

No RLS, or RLS only on the tables the founder remembered. A Supabase table created through SQL or a migration does not have Row-Level Security enabled unless someone turns it on; without it, every row in that table is readable by anyone holding the anon key, which ships in the client by design. Vibe-coded apps inherit this. We have seen targets where competitors could enumerate the entire customer list with a single curl request.
No data residency guarantee. A UK-based target selling into UK SMEs may be storing all customer data in us-east-1 because that's the default region the AI selected. Under UK GDPR, that is a restricted international transfer that must be disclosed and carries specific obligations the founder has usually not met.
No real backup strategy. The BaaS does point-in-time recovery for 7 days on the plan they're on. The founder believes this means "my data is backed up." It does not mean they can restore to a position from three months ago when a bad migration corrupted a table.
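The RLS gap in the first item is quick to check with database access. The sketch below pairs a query against PostgreSQL's standard `pg_tables` view with a small Python helper; the helper name and the assumption that application tables live in the `public` schema are illustrative.

```python
# Standard PostgreSQL catalog view; `rowsecurity` is true when RLS is enabled.
# Assumes application tables live in the `public` schema.
RLS_QUERY = """
SELECT tablename, rowsecurity
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablename;
"""


def tables_without_rls(rows: list[tuple[str, bool]]) -> list[str]:
    """Given (tablename, rowsecurity) rows from RLS_QUERY, return unprotected tables."""
    return [name for name, rls_enabled in rows if not rls_enabled]
```

An empty result is necessary but not sufficient: a table with RLS enabled and a policy that allows everything is just as exposed, so the policies themselves still need reading.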

Where the data actually goes

For an investor reading a vibe-coded target's data flow, the honest answer to "where is my data?" is usually some combination of the following, none of which appear on the pitch deck:

The application database — typically a managed Postgres at a US or EU hyperscaler, owned by the BaaS vendor, governed by their DPA.
The AI builder platform's project storage — every prompt the founder ever wrote, plus snapshots of generated code, retained per the builder's plan terms.
The LLM provider behind the builder — OpenAI, Anthropic, Google, or a mix, each with their own retention windows (typically 30 days for API traffic on enterprise plans, longer or indefinite on free/consumer tiers).
Embedded analytics and error tracking — PostHog, Sentry, LogRocket, Vercel Analytics, or whichever the AI inserted by default, each with their own data residency and PII handling.
The deployment platform's edge logs — Vercel, Cloudflare, Netlify all log request metadata, often including URL parameters and headers that contain user data.
The payment processor — Stripe or similar, governed by their own terms, usually fine but rarely mentioned in the target's privacy policy.
Any "AI features" inside the product itself — if the app calls OpenAI to summarise a customer document, that document just left the perimeter. The founder may not have a DPA in place to make that legal.

Six to eight data processors, on average, for an app the founder describes as "self-hosted on our own Postgres." None of this is uniquely bad — traditional SaaS has the same surface area — but it has to be enumerated, mapped, and contractually papered. Most vibe-coded targets cannot produce that map on day one of diligence.
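The map itself does not need to be elaborate. One way to hold it, sketched here in Python with hypothetical field names, is one record per processor with the two facts a buyer's DPO will ask for first: is there a signed DPA, and is the vendor disclosed in the privacy policy?

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str                # vendor name as it appears on the invoice or in code
    role: str                # e.g. "database", "LLM", "analytics", "edge logs"
    region: str              # where the vendor says data is stored or processed
    dpa_signed: bool         # is there a signed data processing agreement?
    in_privacy_policy: bool  # is the vendor disclosed to end users?


def unpapered(processors: list[Processor]) -> list[str]:
    """Names of processors lacking a DPA or missing from the privacy policy."""
    return [p.name for p in processors if not (p.dpa_signed and p.in_privacy_policy)]
```

Filling in the list is the real work; the function only makes the gaps impossible to overlook.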

How the investor playbook changes

A few practical shifts we are already making in our own engagements, which we expect to become standard across the industry in the next 24 months:

Earlier security touch. Pre-LOI, not post-LOI. A 90-minute scan of the client bundle and the public BaaS endpoints catches the deal-breaking issues before either side has spent legal fees.
Data-flow mapping as a deliverable in its own right. Not buried in the security appendix — a standalone artefact the buyer's DPO can use directly.
Rebuild-cost as a valuation anchor. For sub-£5m targets, the question "what would it cost to rebuild this from the spec in 8 weeks?" is now a meaningful valuation floor. If the answer is "£150k", then £2m of goodwill needs to be justified by distribution, data, or contracts — not code.
A maintainer covenant. Post-close, the seller commits to a 90-day window in which a buyer-nominated engineer can shadow them and document the system. This costs the seller little and de-risks the buyer substantially.
Explicit AI-tooling reps and warranties. Standard SPAs do not yet have language covering "no confidential customer data was processed by an AI service without a DPA." They will within 18 months. Early movers are already adding it.

The bigger picture

Vibe coding is not a fad and it is not the end of software engineering. It is the largest shift in how production software gets built since the move from on-prem to cloud, and it is happening on a timescale of quarters rather than years. The Tech DD that worked in 2019 will produce systematically wrong answers about 2027 targets — sometimes too conservative, sometimes catastrophically too permissive.

The opportunity for investors who get ahead of this is significant. Two-thirds of the targets we see flagged as "low quality, AI-generated, pass" are actually well-positioned businesses with cheap, fixable hygiene issues. A smaller but non-trivial fraction of the ones flagged as "clean, traditional, proceed" are sitting on undisclosed data-processing liabilities that would not survive a serious DPA audit. The diligence framework that distinguishes between them is the edge.

It is not a code-quality question any more. It is a systems, data, and operator question. Tech DD is becoming what it probably should have been all along.

Where to start

If you are an investor with a live deal where the target was built primarily with AI tooling, the companion piece — Tech Due Diligence on Vibe-Coded Apps — is the operational checklist. This article is the thesis; that one is the playbook.

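If you are a founder reading this because you built the product yourself and now have a term sheet, the single most useful thing you can do before diligence starts is open your deployed JavaScript bundle in a browser and search it for the strings sk_live, service_role, SUPABASE_SERVICE, and OPENAI_API_KEY. Because the build usually inlines values rather than variable names, also search for the key formats themselves: sk- for OpenAI-style keys and eyJ for JWTs, which is how Supabase's legacy service-role keys are encoded. Address whatever you find. That one check, before a buyer's engineer runs it, is worth more than any pitch-deck revision.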
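For founders who would rather script that check, the following Python is a minimal sketch that scans bundle text for key-shaped values. The patterns are assumptions about common key formats, not an exhaustive list. The sketch also decodes any JWT it finds, because a Supabase legacy service-role key is a JWT whose `role` claim is only visible after base64 decoding, so a plain text search for service_role can miss it.

```python
import base64
import json
import re
from typing import Optional

# Assumed key shapes; extend for the vendors the target actually uses.
PATTERNS = {
    "stripe_live_secret": re.compile(r"sk_live_[0-9A-Za-z]{10,}"),
    "sk_dash_style_key": re.compile(r"sk-[0-9A-Za-z_\-]{20,}"),
    "jwt": re.compile(r"eyJ[0-9A-Za-z_\-]+\.[0-9A-Za-z_\-]+\.[0-9A-Za-z_\-]+"),
}


def jwt_role(token: str) -> Optional[str]:
    """Decode a JWT payload (without verifying it) and return its `role` claim."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None  # not valid base64 or not valid JSON
    return claims.get("role") if isinstance(claims, dict) else None


def scan_bundle(js: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for secret-looking values in `js`."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.findall(js):
            if label == "jwt":
                # An anon-role JWT is public by design; only service_role is a finding.
                if jwt_role(match) == "service_role":
                    findings.append(("supabase_service_role_jwt", match))
            else:
                findings.append((label, match))
    return findings
```

Run it over every JavaScript file the deployed site serves, not just the entry bundle, and rotate any key it reports rather than only removing it from the code.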

Frequently asked

Is vibe coding really going to dominate, or is this hype?

Both. The hype cycle around specific tools is real and will pass. The underlying shift — that AI agents will write the majority of net-new application code by 2027 — is supported by adoption data from every major engineering org we work with. Diligence frameworks need to assume convergence, not a permanent two-tier market.

Does this mean traditional Tech DD is obsolete?

No. It means the weighting changes. Security, scalability, IP, and open-source compliance remain table stakes — the floor every target must clear. The differentiated diligence work moves to data lineage, system intelligibility, operator dependency, and defensibility. Those used to be 10% of the report; they are becoming 60%.

What's the single biggest data risk in a vibe-coded target?

Public exposure of server-side secrets via client-side environment variables. We find live Stripe, OpenAI, or database admin keys in the public JavaScript bundle of roughly one in three vibe-coded targets we assess. It is trivially exploitable, trivially detectable, and trivially fixable — but it is almost never caught by the seller before diligence.

How should investors think about valuation discounts for vibe-coded targets?

The discount is not for being vibe-coded. It is for the specific deficiencies you find: missing maintainer (priced as the cost of a 90-day knowledge transfer plus a senior engineer hire), missing data documentation (priced as the legal cost of producing it post-close), and missing security hygiene (priced as the actual remediation cost plus any breach-notification risk). Apply the discount line by line, not as a blanket multiple haircut.
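As a sketch of what "line by line" means in practice, the Python below subtracts itemised remediation costs from a headline price. The `Deficiency` structure is hypothetical and the figures in the usage note are invented for illustration; they are not benchmarks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deficiency:
    label: str
    remediation_cost: float  # estimated cost to fix, in the deal currency


def adjusted_price(headline_price: float, deficiencies: list[Deficiency]) -> float:
    """Subtract itemised remediation costs instead of cutting the multiple."""
    return headline_price - sum(d.remediation_cost for d in deficiencies)
```

With a hypothetical £2,000,000 headline price and three items (maintainer handover at £60,000, data documentation at £15,000, security remediation at £25,000), the adjusted figure is £1,900,000, and each pound of the reduction can be traced to a finding the seller is able to dispute or fix.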

What changes for sellers preparing for diligence?

Three concrete pre-diligence actions: rotate every credential the AI ever touched, audit the deployed bundle for leaked secrets, and produce a one-page data flow diagram listing every third party that processes customer data. Doing those three things before the buyer's engineer arrives can preserve 20–40% of enterprise value that would otherwise be negotiated away.

Related guides

Tech Due Diligence

Tech Due Diligence on Vibe-Coded Apps — Buyer's Playbook

How to run technical diligence on apps built primarily with AI coding tools (Lovable, Cursor, v0, Bolt, Replit Agent, Claude Code). What's actually being acquired, where the real risk sits, and the questions a vibe-coded target will struggle to answer.

Tech Due Diligence

Tech Due Diligence on an AI Startup — Buyer's Playbook

How to run technical diligence on an AI-first startup. Training data, model moat, inference economics, vendor lock-in, and the questions the seller will dodge.

Tech Due Diligence

Tech Due Diligence Red Flags — The Top 12

Twelve technical diligence red flags that consistently predict repricings, walked deals, and post-close write-downs in SaaS and AI acquisitions.

Tech Due Diligence

Cybersecurity Due Diligence: A Focused Approach

Undertaking effective cybersecurity due diligence within a constrained timeframe requires a precise methodology. This article outlines critical areas of focus for a 2-4 week technical due diligence period, encompassing identity management, network perimeters, code integrity, third-party risk, incident history, and ransomware exposure. We also highlight key red flags that demand immediate attention.

About the author

Hutton Henry

Founder, Beyond M&A · Creator, Lens

Twenty years inside tech due diligence, integration and AI-native deal tooling. Built and exited tech businesses, led Tech DD on 150+ deals across PE, corp dev and strategic buyers, and now ships Lens — an AI workspace for diligence teams.
