Add pg_dump style: ruleutils-layout renderer for views and functions #13

Merged
gmr merged 6 commits into main from feature/pgdump-renderer
Jun 15, 2026

gmr (Owner) commented Jun 15, 2026

Summary

Adds a new Style::PgDump ("pg_dump" / "pgdump" / "postgres") that reproduces PostgreSQL's ruleutils.c deparser layout — the output of pg_get_viewdef / pg_get_functiondef. Unlike the seven river/left-aligned styles, its correctness bar is byte-idempotency: feeding genuine deparser output through format(sql, Style::PgDump) returns it unchanged. This makes it usable for canonicalizing and diffing catalog-dumped DDL (the pglifecycle use case).

A dedicated renderer (src/formatter/pgdump.rs) re-imposes ruleutils' exact indentation while reproducing each expression's text faithfully (the deparser has already normalized casts, parens and spacing). It genuinely normalizes arbitrary equivalent SQL into that layout — not just echoes input — so two differently-formatted-but-equivalent queries converge.

Coverage

Views: target lists, FROM with all JOIN forms, WHERE/GROUP BY/HAVING/ORDER BY, LIMIT/OFFSET, set operations (UNION/INTERSECT/EXCEPT) with correct trailing-clause ordering, CTEs (WITH, depth-nested), multi-line CASE blocks, comma-separated FROM, subqueries embedded in WHERE/HAVING/target expressions / FROM derived tables / JOIN right side (including nested grouping parens), DISTINCT ON, FILTER, window frames, LATERAL.

Functions: CREATE FUNCTION signature, RETURNS scalar/SETOF/TABLE(...), DEFAULT/OUT/VARIADIC args, grouped behavior attributes (volatility/LEAKPROOF/STRICT/SECURITY/PARALLEL on one line), SET, verbatim AS bodies (SQL and PL/pgSQL), and SQL-standard RETURN expr bodies.

Layout rules (depth d, STEP = 8): SELECT/WITH keywords start at column 2 + 8d; other clause keywords right-align to end at column 7 + 8d; target/comma-FROM continuations indent 4 + 8d; JOIN steps 5 + 8d; CASE blocks at 8 + 8d; CTE bodies render at d+1. Statements the renderer doesn't recognize fall back to verbatim source (still idempotent).
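The column arithmetic above can be sketched in a few lines of Rust. This is only an illustration, not code from src/formatter/pgdump.rs: the function names are hypothetical, columns are assumed to be 1-based, the "indent" values are assumed to be counts of leading spaces, and multi-word keywords such as GROUP BY are assumed to align on their first word.

```rust
/// Columns added per nesting level (STEP in the layout rules).
const STEP: usize = 8;

/// Place SELECT or WITH so that it starts at 1-based column 2 + 8d,
/// which means 1 + 8d leading spaces.
fn at_select_column(keyword: &str, depth: usize) -> String {
    format!("{}{}", " ".repeat(1 + STEP * depth), keyword)
}

/// Right-align a clause keyword so that its first word ends at
/// 1-based column 7 + 8d (assumption: "GROUP BY" aligns on "GROUP").
fn right_align_clause(keyword: &str, depth: usize) -> String {
    let first_word_len = keyword.split(' ').next().map_or(0, str::len);
    let pad = (7 + STEP * depth).saturating_sub(first_word_len);
    format!("{}{}", " ".repeat(pad), keyword)
}

/// Leading spaces for target-list and comma-FROM continuation lines.
fn continuation_indent(depth: usize) -> usize {
    4 + STEP * depth
}

/// Leading spaces for JOIN steps.
fn join_indent(depth: usize) -> usize {
    5 + STEP * depth
}
```

Under these assumptions `right_align_clause("FROM", 0)` yields three spaces followed by `FROM`, and each extra nesting level shifts every value right by 8.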

Testing

  • tests/pgdump_idempotency_test.rs asserts every fixture round-trips byte-identically under Style::PgDump.
  • 31 fixtures are genuine pg_get_viewdef / pg_get_functiondef captures, regenerable from a throwaway local cluster via tests/fixtures/pg_dump/generate.sh (just gen-pgdump-fixtures); committed so CI needs no PostgreSQL.
  • The construct coverage was driven by two cluster validation sweeps (20 view shapes + 10 function shapes) feeding real deparser output through the renderer; the gaps they surfaced were fixed and the high-value shapes locked as fixtures.
  • Full just check (fmt + clippy -D warnings + all suites) passes; the existing 7 styles are unaffected.
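The byte-idempotency bar can be expressed as a small helper. The sketch below is not the contents of tests/pgdump_idempotency_test.rs; `non_idempotent` is a hypothetical name, and the formatter is passed in as a closure so the example stays self-contained instead of guessing at the crate's exact signature.

```rust
/// Return the names of fixtures whose formatted output is not
/// byte-identical to the input. `format` stands in for the real
/// formatter call (an assumption made to keep the sketch standalone).
fn non_idempotent<'a, F>(fixtures: &[(&'a str, &'a str)], format: F) -> Vec<&'a str>
where
    F: Fn(&str) -> String,
{
    let mut failures = Vec::new();
    for &(name, sql) in fixtures {
        // Idempotency: formatting deparser output must return it unchanged.
        if format(sql) != sql {
            failures.push(name);
        }
    }
    failures
}
```

A test would then assert that the returned list is empty for every committed fixture, so a failure names the fixtures that drifted.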

Scope / follow-ups

First-class views + functions, per the agreed scope. Known niche gaps: subqueries nested inside GROUP BY items or CASE arms. Wiring pglifecycle to request Style::PgDump, and releasing libpgfmt/pgfmt with the new style, are intentionally out of scope here.

🤖 Generated with Claude Code

gmr and others added 6 commits June 15, 2026 16:20
Introduce a new "pg_dump" style that reproduces PostgreSQL's ruleutils.c
deparser layout — the output of pg_get_viewdef / pg_get_functiondef. Its
correctness bar is byte-idempotency: formatting genuine deparser output must
return it unchanged, which makes it usable for canonicalizing/diffing
catalog-dumped DDL (the pglifecycle use case).

Unlike the river / left-aligned engine, PgDump uses a dedicated renderer
(formatter/pgdump.rs) that re-imposes ruleutils' exact indentation
(clause keywords right-aligned to column 7, targets indent 4, joins indent 5,
set-op keywords at column 1) while reproducing each expression verbatim — the
deparser has already normalized casts, parens and spacing, so the renderer
folds whitespace and re-lays-out rather than re-formatting expressions. It
genuinely normalizes arbitrary equivalent SQL into that layout, not just
echoes input.

First increment covers flat single-level views (target list, FROM with joins,
WHERE, GROUP BY, HAVING, ORDER BY, set operations) and functions (CREATE
FUNCTION signature + attributes + verbatim body). CTEs, scalar subqueries,
multi-line CASE and DISTINCT-target indentation are deferred; unrecognized
statements fall back to verbatim source (still idempotent on deparser output).

Fixtures are genuine pg_get_viewdef/pg_get_functiondef captures, regenerable
from a throwaway cluster via tests/fixtures/pg_dump/generate.sh
(just gen-pgdump-fixtures). tests/pgdump_idempotency_test.rs asserts each
round-trips byte-identically; _deferred/ fixtures mark next-increment targets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Make the ruleutils renderer depth-aware (each nesting level adds 8 columns) and
add three nested constructs, all reverse-engineered from genuine deparser
output and verified by byte-idempotency:

- WITH / CTEs: each common table expression renders as `name AS ( <body> )`
  with the body at depth+1 and the closing paren at 8*(depth+1); subsequent
  CTEs continue on the closing-paren line (`), y AS (`).
- Multi-line CASE targets: rendered as blocks at column 8+8*depth (WHEN/ELSE at
  12+8*depth, END back at 8+8*depth); a CASE as the first target forces SELECT onto its
  own line. Probing showed DISTINCT does not change continuation indent — the
  block indent comes from the CASE itself.
- Comma-separated FROM items continue at 4+8*depth with commas, distinct from
  JOIN steps at 5+8*depth.

Fixtures regenerated from the (now authoritative) generate.sh: recent_cte and
distinct_case promoted out of _deferred, plus new two_cte, case_plain and
case_first. Scalar subqueries embedded in expressions remain deferred (their
layout is output-column-relative); the sub fixture stays in _deferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Handle IN (SELECT …), EXISTS (…) and scalar subqueries inside WHERE / HAVING /
target expressions — the last deferred view construct. ruleutils lays these out
relative to the output column: the subquery's SELECT sits inline right after the
open paren, while its own clauses align at the deeper (depth+1) river column.

render_expr_text() walks an expression, reproduces everything outside a
subquery verbatim (whitespace collapsed, boundary spaces preserved), and for
each embedded select_with_parens renders the body at depth+1 with its first
line de-indented, splicing it back in. With no embedded subquery it is just the
prior collapse_ws path, so flat expressions are unchanged.
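As an illustration of the flat-expression path, here is a minimal whitespace collapse in Rust. It is a hypothetical reimplementation, not the collapse_ws in pgdump.rs: it assumes every run of whitespace becomes one space (so a leading or trailing run survives as a single boundary space), and it deliberately ignores quoted literals, which a real renderer would have to leave untouched.

```rust
/// Collapse each run of whitespace to a single space.
/// Assumption: no special handling of string literals or quoted
/// identifiers; this only shows the folding step itself.
fn collapse_ws_sketch(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut in_whitespace = false;
    for ch in text.chars() {
        if ch.is_whitespace() {
            // Emit one space for the whole run, including runs at the
            // start or end, so boundary spaces are preserved.
            if !in_whitespace {
                out.push(' ');
            }
            in_whitespace = true;
        } else {
            out.push(ch);
            in_whitespace = false;
        }
    }
    out
}
```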

Fixtures (regenerated from generate.sh): sub promoted out of _deferred, plus new
sub_exists and sub_scalar. The _deferred directory is now empty and removed.
Remaining gaps: subqueries in the FROM list (derived tables), in GROUP BY /
ORDER BY items, or inside CASE arms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Close the remaining common view gaps in the ruleutils renderer:

- LIMIT / OFFSET: previously collected but never emitted (silently dropped).
  ruleutils emits OFFSET before LIMIT, both at the SELECT keyword's column.
- Clause ordering for set operations: ORDER BY / OFFSET / LIMIT apply to the
  whole UNION/INTERSECT/EXCEPT, so they now render after the right-hand select
  rather than before it. For a plain query the order is unchanged.
- FROM derived tables `( SELECT … ) alias` and subqueries on a JOIN's right
  side: generalized the subquery splicer to operate over an arbitrary byte
  range (splice_range), so render_table_ref and JOIN steps reuse the same
  inline-at-depth+1 rendering as WHERE/target expressions.
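The clause-ordering rule for set operations can be sketched as follows. The struct and function names are hypothetical, the operands are passed as already-rendered lines, and the indentation of the trailing clauses at depth 0 is an assumption derived from the layout described in this PR (ORDER BY right-aligned, OFFSET and LIMIT at the SELECT keyword's column).

```rust
/// Trailing clauses that apply to the whole set operation.
struct TrailingClauses<'a> {
    order_by: Option<&'a str>,
    offset: Option<&'a str>,
    limit: Option<&'a str>,
}

/// Render `left <op> right` at depth 0, emitting the trailing clauses
/// after the right-hand select, with OFFSET before LIMIT.
fn render_set_operation(
    left: &[&str],
    op: &str,
    right: &[&str],
    trailing: &TrailingClauses<'_>,
) -> String {
    let mut lines: Vec<String> = Vec::new();
    lines.extend(left.iter().map(|l| l.to_string()));
    // Set-op keywords sit at column 1, i.e. with no leading spaces.
    lines.push(op.to_string());
    lines.extend(right.iter().map(|l| l.to_string()));
    if let Some(expr) = trailing.order_by {
        lines.push(format!("  ORDER BY {expr}"));
    }
    if let Some(n) = trailing.offset {
        lines.push(format!(" OFFSET {n}"));
    }
    if let Some(n) = trailing.limit {
        lines.push(format!(" LIMIT {n}"));
    }
    lines.join("\n")
}
```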

New fixtures (regenerated from generate.sh): lim, derived, derived_join,
union_order. Remaining gaps: subqueries inside GROUP BY items or CASE arms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A broad sweep of 20 diverse view shapes (FILTER, GROUPING SETS, window frames,
array subscripts/slices, COALESCE/NULLIF, casts, simple-form CASE, LATERAL,
DISTINCT ON, BETWEEN, ROW(), string_agg with ORDER BY, set-returning functions,
nested subqueries) found one gap: a scalar subquery nested two levels deep
renders as `(( SELECT … ))`, where the deparser nests one select_with_parens
directly inside another. render_select_with_parens looked for select_no_parens,
missed it, and collapsed to one line. It now recurses through the redundant
grouping paren so only the innermost subquery gets the `( SELECT` form and the
depth increment; the other 19 shapes already round-tripped.

Locked five high-value shapes from the sweep as fixtures: nested_sub, lateral,
window_frame, distinct_on, filter_agg (23 fixtures total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A function-variety sweep (DEFAULT args, OUT params, VARIADIC, RETURNS
TABLE/SETOF, behavior attributes, SET, SQL-standard RETURN body) found three
gaps in the CREATE FUNCTION renderer:

- RETURNS TABLE(...) / SETOF were dropped: the return type is no longer a
  func_return node. Render the whole RETURNS span (signature end → option list
  start) uniformly, which covers scalar, SETOF and TABLE forms.
- Behavior attributes (volatility, LEAKPROOF, STRICT, SECURITY, PARALLEL,
  WINDOW) were emitted one per line; pg_get_functiondef groups them on a single
  line. Group consecutive such opt-items, keeping LANGUAGE / SET / COST / ROWS /
  SUPPORT / AS on their own lines.
- SQL-standard `RETURN expr` bodies (LANGUAGE sql without AS) were dropped:
  emit opt_routine_body at column 0.
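The attribute grouping described above can be sketched like this. It is an illustration under stated assumptions rather than the renderer's code: options arrive as already-rendered strings, `is_behavior_attr` and `group_options` are hypothetical names, and a behavior attribute is recognized by its first keyword only.

```rust
/// True when the option is one of the behavior attributes that
/// pg_get_functiondef places together on a single line.
fn is_behavior_attr(opt: &str) -> bool {
    let first = opt.split_whitespace().next().unwrap_or("");
    matches!(
        first,
        "IMMUTABLE" | "STABLE" | "VOLATILE" | "LEAKPROOF" | "STRICT"
            | "SECURITY" | "PARALLEL" | "WINDOW"
    )
}

/// Join consecutive behavior attributes with spaces; every other option
/// (LANGUAGE, SET, COST, ROWS, SUPPORT, AS) keeps its own line.
fn group_options(opts: &[&str]) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut last_was_behavior = false;
    for opt in opts {
        let behavior = is_behavior_attr(opt);
        if behavior && last_was_behavior {
            // Extend the line started by the previous behavior attribute.
            let last = lines.last_mut().expect("previous behavior line exists");
            last.push(' ');
            last.push_str(opt);
        } else {
            lines.push(opt.to_string());
        }
        last_was_behavior = behavior;
    }
    lines
}
```

Tracking only whether the previous option was a behavior attribute keeps the grouping to consecutive items, so an intervening LANGUAGE or COST line starts a new group.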

Eight function shapes locked as fixtures (func_default, func_out, func_variadic,
func_table, func_setof, func_behavior, func_set, func_return); 31 fixtures total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 15, 2026

Warning

Review limit reached

@gmr, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 9 minutes and 5 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 24aa7537-79c2-4f07-9f42-273fa82e32e9

📥 Commits

Reviewing files that changed from the base of the PR and between 2417c7b and 8b196ce.

📒 Files selected for processing (39)
  • Justfile
  • src/formatter/mod.rs
  • src/formatter/pgdump.rs
  • src/formatter/select.rs
  • src/style.rs
  • tests/fixtures/pg_dump/func_add.sql
  • tests/fixtures/pg_dump/func_behavior.sql
  • tests/fixtures/pg_dump/func_bump.sql
  • tests/fixtures/pg_dump/func_default.sql
  • tests/fixtures/pg_dump/func_out.sql
  • tests/fixtures/pg_dump/func_return.sql
  • tests/fixtures/pg_dump/func_set.sql
  • tests/fixtures/pg_dump/func_setof.sql
  • tests/fixtures/pg_dump/func_table.sql
  • tests/fixtures/pg_dump/func_variadic.sql
  • tests/fixtures/pg_dump/generate.sh
  • tests/fixtures/pg_dump/view_case_first.sql
  • tests/fixtures/pg_dump/view_case_plain.sql
  • tests/fixtures/pg_dump/view_derived.sql
  • tests/fixtures/pg_dump/view_derived_join.sql
  • tests/fixtures/pg_dump/view_distinct_case.sql
  • tests/fixtures/pg_dump/view_distinct_on.sql
  • tests/fixtures/pg_dump/view_filter_agg.sql
  • tests/fixtures/pg_dump/view_lateral.sql
  • tests/fixtures/pg_dump/view_lim.sql
  • tests/fixtures/pg_dump/view_nested_sub.sql
  • tests/fixtures/pg_dump/view_order_totals.sql
  • tests/fixtures/pg_dump/view_recent_cte.sql
  • tests/fixtures/pg_dump/view_sub.sql
  • tests/fixtures/pg_dump/view_sub_exists.sql
  • tests/fixtures/pg_dump/view_sub_scalar.sql
  • tests/fixtures/pg_dump/view_two_cte.sql
  • tests/fixtures/pg_dump/view_uni.sql
  • tests/fixtures/pg_dump/view_union_order.sql
  • tests/fixtures/pg_dump/view_us_users.sql
  • tests/fixtures/pg_dump/view_win.sql
  • tests/fixtures/pg_dump/view_window_frame.sql
  • tests/fixtures_test.rs
  • tests/pgdump_idempotency_test.rs

Comment @coderabbitai help to get the list of available commands and usage tips.

gmr (Owner, Author) commented Jun 15, 2026

@coderabbitai review

coderabbitai Bot commented Jun 15, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

gmr merged commit e469f3f into main Jun 15, 2026
3 checks passed
gmr deleted the feature/pgdump-renderer branch June 15, 2026 22:20