Not so long ago, the biggest threat to production databases was the developer who claimed it worked on their machine. If you've attended my sessions, you know this is a topic I'm particularly sensitive to.

These days, AI agents are writing your SQL. The models are getting incredibly good at producing plausible code. It looks right, it feels right, and often it passes a cursory glance. But "plausible" isn't a performance metric, and it doesn't care about your execution plan or locking strategy.

AI-generated SQL is almost always syntactically correct, which is the easy part. The hard part is knowing what a statement does to a running system: which locks it takes, how long it holds them, whether it rewrites the table on disk.

The spectacular failures get the headlines. In July 2025, an AI coding agent wiped a production database during a code freeze -- ran destructive commands, panicked, then lied about what it had done.

But the biggest damage is quieter. It's the migration that passes every test, ships through CI, and then locks a production table during peak traffic. The query written against guessed data distributions. The index added from a copy/paste out of psql. The damage accumulates over time, when nobody is looking.

How to give an AI agent "eyes"

When you're running Claude Code or any other agentic coding tool, the bottleneck isn't the model's intelligence. It's the fidelity of the environment. In a standard setup, the agent guesses column names from your description or, in better cases, parses a schema.sql file or pokes at a local database with seed data.

None of these give the agent the one thing it actually needs: awareness of your production schema. The table sizes, the indexes, the constraints, the statistics that determine whether a query index-scans in 2ms or sequential-scans for 40 seconds.

The obvious fix is to give it a database connection. Let it query pg_catalog, read table sizes, check existing indexes. This is how Anthropic's reference PostgreSQL MCP server worked, and that's the one Datadog Security Labs found a SQL injection in. The vulnerable npm package still gets 21,000 downloads per week. The server wrapped queries in read-only transactions, but accepted semicolons. COMMIT; DROP SCHEMA public CASCADE; works because the COMMIT ends the read-only transaction and everything after runs with full privileges. Read-only transactions are not a security boundary when the protocol accepts semicolons.
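The attack shape is worth seeing. A sketch of what the vulnerable pattern effectively executed (table and schema names are illustrative, not taken from the actual advisory):

```sql
-- What the server intended: wrap user SQL in a read-only transaction
BEGIN TRANSACTION READ ONLY;
SELECT * FROM users;    -- the "query" tool's input goes here
COMMIT;

-- What a semicolon-smuggled payload turns it into
BEGIN TRANSACTION READ ONLY;
SELECT 1; COMMIT; DROP SCHEMA public CASCADE;  -- COMMIT ends the read-only txn
COMMIT;                                        -- the DROP already ran with full privileges
```

The read-only flag guards the transaction, not the connection, so anything after the injected COMMIT executes unguarded.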

But here's the thing: the AI doesn't need a connection. It needs the information. Schema metadata is structural data. You can extract it once, save it as a JSON file, commit it to git, and hand that file to every tool in your stack.

What's been missing is a standard way to wire that information into the agent's workflow. Not a one-time dump that sits in a docs folder, but a protocol that lets the agent call tools and get structured answers back.

That protocol is MCP, the Model Context Protocol: an open standard for connecting AI assistants to external tools and data sources. Instead of relying on training data, the agent calls structured tools exposed by an MCP server and gets deterministic answers back.

What are PostgreSQL MCP servers actually for?

This has been bugging me. If you search for "PostgreSQL MCP server" today, you'll find two kinds.

The first is a connection wrapper. You give it a DATABASE_URL, it exposes tools like query and execute. This is psql with extra steps. The Anthropic reference server that got the SQL injection? Connection wrapper. The dozen community forks that replaced it? Connection wrappers with slightly better input validation.

The second is a managed-provider gateway. Cloud database platforms expose their platform APIs through MCP. Create a branch, provision a database, manage users. Useful if you're on that platform, but not database intelligence. Control-plane operations wearing a protocol hat.

Neither kind answers the question an AI agent actually needs answered: "is this SQL going to be a problem?"

A connection wrapper can run EXPLAIN, if you trust it with credentials. But who tells the agent that SET NOT NULL on a 4-million-row table takes an ACCESS EXCLUSIVE lock? Who tells it that its query defeats partition pruning? That's not a connection problem. It's a knowledge problem.

dryrun

I didn't set out to build an AI tool. My primary project is RegreSQL, a statistics-based SQL regression testing framework. RegreSQL needs production statistics to produce realistic query plans in test environments, and dryrun's schema capture was originally just the plumbing to get those statistics into the right place.

But driving adoption of a testing tool is hard. The best place to plug in is where the coding actually happens. And these days, that's increasingly AI agents. Instead of fighting it, I turned the plumbing into a proper tool.

dryrun is a PostgreSQL schema intelligence tool. It ships as a CLI and as an MCP server. It doesn't connect to your database at runtime. Instead, it works from a JSON snapshot of your schema, captured once, committed to git, and available to everyone.

Offline first

One person with production access captures the schema:

dryrun dump-schema --source "$PROD_URL" --pretty --name "production" -o schema.json

The JSON contains everything structural: tables, columns, constraints, indexes, and the rest of what you'd find in pg_catalog. It also captures per-table planner statistics from pg_statistic (histograms, most common values, correlation, n_distinct), row estimates like reltuples from pg_class, and runtime counters like seq_scan and idx_scan from the statistics views. One file, typically 200-500KB. Structural metadata, not row data.
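If you want to see the raw material the snapshot is built from, the same information is queryable in any psql session (illustrative catalog queries, not dryrun internals):

```sql
-- planner statistics: distinct counts, physical ordering, value distribution
SELECT attname, n_distinct, correlation, most_common_vals
FROM pg_stats
WHERE tablename = 'orders';

-- row estimates and scan counters
SELECT c.relname, c.reltuples, s.seq_scan, s.idx_scan
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname = 'orders';
```

The snapshot freezes exactly this kind of metadata so the agent can consult it without a connection.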

If your schema uses COMMENT ON (and it should), dryrun captures those too. A column named status is ambiguous. COMMENT ON COLUMN orders.status IS 'pending | completed | refunded, set by payment processor webhook' tells the agent what the values mean, where they come from, and that it shouldn't invent new ones.

If you don't take my word for it, see what Bruce Momjian has to say about it: MCP clients can only work with schemas, object names, and constraints, and comments are the simplest way to supply the semantic context that makes AI-generated queries meaningful.

Import it and every subsequent command works offline:

dryrun import schema.json

No connection needed, no credentials on developer machines or in CI.

As an MCP server

claude mcp add dryrun -- dryrun mcp-serve

Claude now has 16 tools for exploring your schema, validating queries, checking migrations, and suggesting improvements, all from the offline snapshot.

What the agent gets

dryrun exposes 16 tools through MCP. The four that matter most for the argument in this article:

check_migration is the one I find myself using most often, even outside of AI workflows. It analyzes DDL statements for lock implications, table rewrites, and version-specific behavior, and returns the safe multi-step alternative when one exists. Every migration example in this article was caught by this tool.

suggest_index reads column statistics from the snapshot and recommends indexes informed by your actual data distribution. Not "add an index on this column" but "add a partial index on the 3% of rows that match your query pattern." It also uses correlation data to recommend BRIN indexes where appropriate: 3MB instead of 400MB for a B-tree on a column with near-perfect insertion order.
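A sketch of that trade-off, on a hypothetical append-only events table whose created_at values arrive in near-perfect insertion order (correlation close to 1.0):

```sql
-- B-tree: one entry per row; hundreds of MB on a large table
CREATE INDEX idx_events_created_btree ON events (created_at);

-- BRIN: one summary per block range instead of one entry per row;
-- a few MB, and effective precisely because physical order tracks the column
CREATE INDEX idx_events_created_brin ON events USING brin (created_at);
```

The correlation statistic is what makes this call possible: BRIN on a poorly correlated column would be nearly useless, and only the snapshot knows which case you're in.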

validate_query parses SQL against the schema snapshot without executing it. It catches references to tables and columns that don't exist, detects anti-patterns, and flags issues before the query goes anywhere near the database.

lint_schema runs 18 convention rules and 14 structural audit checks: naming, types, constraints, timestamps, partition health, duplicate indexes, foreign key columns missing indexes, tables without primary keys. Every rule can be disabled in dryrun.toml, because not every convention fits every project.

The remaining tools cover schema exploration (list_tables, describe_table, search_schema, find_related), operations (compare_nodes, detect, vacuum_health), execution plans (explain_query), and schema maintenance (refresh_schema, check_drift). The full list is in the documentation.

Try it yourself

Here's a schema and seed data you can run in any local PostgreSQL:

CREATE TABLE customers (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL,
    email text,
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE orders (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id bigint NOT NULL,
    status text NOT NULL DEFAULT 'pending',
    total_amount numeric(10,2),
    created_at timestamptz NOT NULL DEFAULT now()
);

Seed it with data that has the distributions that matter, like skewed status values and nullable emails:

-- 100K customers, ~5% with NULL email
INSERT INTO customers (name, email, created_at)
SELECT
    'customer_' || i,
    CASE WHEN random() < 0.05 THEN NULL
         ELSE 'user' || i || '@example.com'
    END,
    '2023-01-01'::timestamptz + (i || ' seconds')::interval
FROM generate_series(1, 100000) AS i;

-- 1M orders, 97% completed / 3% pending
INSERT INTO orders (customer_id, status, total_amount, created_at)
SELECT
    1 + (random() * 99999)::int,
    CASE WHEN random() < 0.03 THEN 'pending'
         ELSE 'completed'
    END,
    (random() * 500)::numeric(10,2),
    '2024-01-01'::timestamptz + (i || ' seconds')::interval
FROM generate_series(1, 1000000) AS i;

ANALYZE;

Now capture the schema for dryrun and install it as an MCP server:

dryrun dump-schema --source "$DATABASE_URL" -o schema.json
dryrun init
dryrun import schema.json
claude mcp add dryrun -- dryrun mcp-serve

Open Claude Code and try these prompts. First without the dryrun MCP server, then with it.

"Add an index to speed up queries that filter orders by status"

Without dryrun, Claude sees a text column and does the obvious thing:

CREATE INDEX idx_orders_status ON orders (status);

A full B-tree index on 1 million rows. It works, but 97% of those rows are 'completed' and nobody queries for completed orders.

With dryrun, Claude calls suggest_index, sees most_common_freqs showing 'completed' at 97% and 'pending' at 3%, and suggests:

CREATE INDEX idx_orders_status_pending ON orders (status)
    WHERE status = 'pending';

The full index maintains 1 million entries. The partial index maintains 30,000. Every INSERT that creates a 'completed' order no longer pays for index maintenance it'll never use.
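You can verify the distribution dryrun is reading from in any psql session:

```sql
-- the statistics suggest_index reads from the snapshot
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
-- after the seed above, roughly: {completed,pending} / {0.97,0.03}
```

Those two numbers are the entire case for the partial index.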

"Make the email column on customers required"

Without dryrun, Claude writes a single statement and moves on:

ALTER TABLE customers ALTER COLUMN email SET NOT NULL;

This fails immediately because 5% of rows have NULL emails. Claude didn't know those NULLs existed. Even if the column had no NULLs, the statement would still take an ACCESS EXCLUSIVE lock and perform a full table scan to verify.
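The lock is easy to observe from a second session while the ALTER runs (assumes the customers table from the seed above, with the NULL emails backfilled first so the statement doesn't error out):

```sql
-- session 1: hold the ALTER open inside a transaction
BEGIN;
ALTER TABLE customers ALTER COLUMN email SET NOT NULL;

-- session 2: the table is now fully locked, even against SELECTs
SELECT mode, granted
FROM pg_locks
WHERE relation = 'customers'::regclass;
-- shows AccessExclusiveLock held by session 1
```

On a 100K-row table the window is short. On a 400-million-row table, it's an outage.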

With dryrun, Claude calls check_migration and describe_table. It sees 100,000 rows, knows the column is nullable, and knows from the statistics that NULLs exist. It generates a three-step migration:

-- Step 1: backfill NULLs
UPDATE customers
    SET email = 'unknown-' || id || '@example.com'
    WHERE email IS NULL;

-- Step 2: add constraint without full table scan
ALTER TABLE customers
    ADD CONSTRAINT customers_email_not_null
    CHECK (email IS NOT NULL) NOT VALID;

-- Step 3: validate in the background with a weaker lock
ALTER TABLE customers
    VALIDATE CONSTRAINT customers_email_not_null;

The first step handles the data. The second adds the constraint without scanning existing rows. The third validates with a weaker lock that allows concurrent reads and writes. Three statements instead of one.
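If you want the actual column-level NOT NULL rather than a CHECK constraint, PostgreSQL 12 and later can convert one into the other cheaply: SET NOT NULL skips its verification scan when a validated CHECK (col IS NOT NULL) constraint already exists. A possible fourth step:

```sql
-- Step 4 (PostgreSQL 12+): promote the validated check to a real NOT NULL;
-- the existing constraint proves no NULLs remain, so no table scan is needed
ALTER TABLE customers ALTER COLUMN email SET NOT NULL;
ALTER TABLE customers DROP CONSTRAINT customers_email_not_null;
```

Each statement still takes a brief ACCESS EXCLUSIVE lock, but only for a catalog update, not a scan.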

Keeping the snapshot fresh

dryrun is only as good as your last snapshot. If the schema changed yesterday and nobody re-dumped, the agent is working from stale data. The check_drift tool can detect this, but only if someone runs it.

The best approach is to treat schema snapshots like you treat production backups: automate them. If you're already capturing production schema dumps on a schedule, add dryrun to the same pipeline. A nightly dump-schema from the primary gives you current structure and statistics. If you run read replicas with different workloads, add --stats-only dumps from each node for higher-fidelity statistics across the cluster. Commit the resulting schema.json to your repository so every developer and every AI agent starts the day with a current view.
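A minimal sketch of such a job, using only the commands shown above (the schedule, the /srv/schema checkout path, and the PROD_URL environment setup are all assumptions for your own environment):

```shell
# crontab sketch: nightly snapshot at 03:00; if nothing changed,
# git commit exits non-zero and the push is skipped
0 3 * * * cd /srv/schema && dryrun dump-schema --source "$PROD_URL" -o schema.json && git commit -am "nightly schema snapshot" && git push
```

Wiring it into an existing backup pipeline instead of cron works just as well; the point is that the snapshot refreshes without anyone remembering to do it.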

The snapshot also serves as a lightweight audit trail. If someone adds a column through psql without a migration, the next scheduled dump captures it, and the diff in git shows exactly what changed and when.

What dryrun is not

dryrun does not execute arbitrary SQL against your database. It does not manage migrations. It does not provision infrastructure. It is not an ORM, a query builder, or a database admin panel. It's read-only and narrow by design, trustworthy enough that you'd let an AI agent use it unsupervised.

What's next

dryrun ships today at v0.5.0. The next releases are focused on:

  • Shadow EXPLAIN (v0.6.0): inject production statistics into a local PostgreSQL instance so EXPLAIN produces realistic plans from your laptop, without any production data. Currently explain_query runs against whatever local database you point it at, but if that database has 10 rows in every table, the planner gives you toy plans that tell you nothing about production behavior.
  • CI/CD Trust Gate: a single command that takes a migration file and runs safety analysis, schema diff, lint, and index suggestions in one pass. Designed to block unsafe migrations before they merge.
  • Improved migration support: deeper analysis of multi-statement migrations, version-specific advice, and integration with common migration frameworks.

Getting started

dryrun is MIT licensed and available on GitHub. The tutorial covers installation, offline and online workflows, multi-node setups, and MCP integration. It's part of the boringSQL suite.