Why RegreSQL?
SQL queries are the number one cause of database problems. Yet most teams treat SQL as a second-class citizen when it comes to testing. Unit tests mock the database. Integration tests check application logic. Nobody tests the queries themselves.
And the problem is getting worse. Developers learn just enough SQL to get by, ORMs generate queries you have never seen, and now LLMs write SQL at scale. More SQL is written by things that never see production, at a faster rate of change. The guardrail is not better prompts. It is regression testing.
"It works on my database" is the new "it works on my machine." We solved environment drift for code with Docker and feature flags. But the same query hitting different data produces different results. Your laptop has 100 rows with uniform distribution. Production has 10 million rows with heavily skewed data. The PostgreSQL planner makes different choices in each environment, and code review cannot catch that.
What RegreSQL tests
RegreSQL tests two things:
- Logical correctness. Does the query return the right data? RegreSQL compares query outputs against known baselines and generates diffs that show exactly what changed.
- Performance correctness. Does the query return that data efficiently? RegreSQL does not measure timing, which varies wildly across machines. It tracks buffers (pages accessed during execution), which depend on the plan and the data rather than the hardware. Same query, same data, same buffers, whether you run it on a Hetzner ARM box or an M4 Pro laptop. If your query suddenly reads 10% more buffers, that is a regression.
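The two checks above can be sketched in a few lines of Python. This is an illustration of the idea, not RegreSQL's implementation: the function names, the JSON-per-row diff format, and the 10% default threshold are assumptions made for the example. The buffer helper expects the parsed output of `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)`, where the top plan node carries cumulative block counts.

```python
import difflib
import json


def diff_against_baseline(baseline_rows, current_rows):
    """Return unified-diff lines between two result sets (empty list if equal)."""
    # Serialize each row with sorted keys so the comparison is stable.
    old = [json.dumps(row, sort_keys=True) for row in baseline_rows]
    new = [json.dumps(row, sort_keys=True) for row in current_rows]
    return list(
        difflib.unified_diff(
            old, new, fromfile="baseline", tofile="current", lineterm=""
        )
    )


def total_shared_buffers(explain_json):
    """Add up shared hit and read blocks reported on the top plan node.

    Counts on a parent node already include its children, so the top
    node is enough for a whole-query total.
    """
    plan = explain_json[0]["Plan"]
    return plan.get("Shared Hit Blocks", 0) + plan.get("Shared Read Blocks", 0)


def is_buffer_regression(baseline, current, tolerance=0.10):
    """Flag a regression when buffers grow by more than `tolerance`.

    The 10% default is a hypothetical threshold chosen for this sketch.
    """
    return (current - baseline) > tolerance * baseline
```

A changed value shows up as a removed and an added line in the diff, and a plan that touches noticeably more pages than its baseline is flagged even if it happens to run fast on the machine at hand.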
Production query plans without production data
PostgreSQL's planner picks execution strategies based on table statistics, available indexes, memory settings, and data patterns. Your test database with 100 rows will never produce the same plan as production with millions.
RegreSQL solves this with portable statistics. Using PostgreSQL 18's pg_restore_relation_stats and pg_restore_attribute_stats, you can give your test database production-like statistics without copying any actual data. The planner sees the same row counts, distributions, and correlations it would in production, and makes the same choices.
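To make the idea concrete, here is a minimal Python sketch that renders a pg_restore_relation_stats call for one table. It is not how RegreSQL applies statistics. The argument names (`schemaname`, `relname`, `relpages`, `reltuples`) follow the documented name/value form of the function as I understand it, so check them against the documentation for your PostgreSQL version. The helper name, the `orders` table, and the numbers are hypothetical.

```python
def relation_stats_sql(schema, table, relpages, reltuples):
    """Render a pg_restore_relation_stats call as SQL text.

    Assumption: the function takes alternating name/value arguments,
    with relpages as integer and reltuples as real.
    """

    def quote(value):
        # Minimal SQL string-literal quoting for this sketch.
        return "'" + value.replace("'", "''") + "'"

    return (
        "SELECT pg_restore_relation_stats("
        f"'schemaname', {quote(schema)}, "
        f"'relname', {quote(table)}, "
        f"'relpages', {int(relpages)}::integer, "
        f"'reltuples', {float(reltuples)}::real);"
    )


# Hypothetical numbers: make a tiny test table look like a 10M-row one.
print(relation_stats_sql("public", "orders", 50000, 10_000_000))
```

Running the generated statement against a test database changes only what the planner believes about the table's size. The table's actual contents are untouched. Column-level distributions would need pg_restore_attribute_stats as well, which this sketch does not cover.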
Read more in Portable PostgreSQL Statistics and pg_regresql: truly portable PostgreSQL statistics.