Add pg_dump style: ruleutils-layout renderer for views and functions #13
Introduce a new "pg_dump" style that reproduces the layout of PostgreSQL's ruleutils.c deparser, i.e. the output of pg_get_viewdef / pg_get_functiondef. Its correctness bar is byte-idempotency: formatting genuine deparser output must return it unchanged, which makes it usable for canonicalizing/diffing catalog-dumped DDL (the pglifecycle use case).

Unlike the river / left-aligned engine, PgDump uses a dedicated renderer (formatter/pgdump.rs) that re-imposes ruleutils' exact indentation (clause keywords right-aligned to column 7, targets indent 4, joins indent 5, set-op keywords at column 1) while reproducing each expression verbatim. The deparser has already normalized casts, parens and spacing, so the renderer folds whitespace and re-lays-out rather than re-formatting expressions. It genuinely normalizes arbitrary equivalent SQL into that layout rather than merely echoing its input.

First increment covers flat single-level views (target list, FROM with joins, WHERE, GROUP BY, HAVING, ORDER BY, set operations) and functions (CREATE FUNCTION signature + attributes + verbatim body). CTEs, scalar subqueries, multi-line CASE and DISTINCT-target indentation are deferred; unrecognized statements fall back to verbatim source (still idempotent on deparser output).

Fixtures are genuine pg_get_viewdef/pg_get_functiondef captures, regenerable from a throwaway cluster via tests/fixtures/pg_dump/generate.sh (just gen-pgdump-fixtures). tests/pgdump_idempotency_test.rs asserts each round-trips byte-identically; _deferred/ fixtures mark next-increment targets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
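The byte-idempotency bar described above can be sketched as a small helper. This is only an illustration: `check_idempotent` and the stand-in formatter `strip_trailing_ws` are hypothetical names, not part of the crate, and the real test presumably calls the PgDump renderer instead of the stand-in.

```rust
/// Returns Ok(()) when `format` maps `fixture` to itself byte for byte,
/// otherwise the byte offset of the first difference.
pub fn check_idempotent<F>(format: F, fixture: &str) -> Result<(), usize>
where
    F: Fn(&str) -> String,
{
    let out = format(fixture);
    if out == fixture {
        return Ok(());
    }
    // First differing byte, or the shorter length when one string is a
    // prefix of the other.
    let offset = out
        .bytes()
        .zip(fixture.bytes())
        .position(|(a, b)| a != b)
        .unwrap_or_else(|| out.len().min(fixture.len()));
    Err(offset)
}

/// Hypothetical stand-in formatter (NOT the PgDump renderer): strips
/// trailing whitespace from every line.
pub fn strip_trailing_ws(sql: &str) -> String {
    sql.lines().map(str::trim_end).collect::<Vec<_>>().join("\n")
}
```

Reporting the first differing offset rather than a bare boolean makes a failing fixture easier to diagnose, since a layout regression usually shows up as a single misplaced indent.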
Make the ruleutils renderer depth-aware (each nesting level adds 8 columns) and add three nested constructs, all reverse-engineered from genuine deparser output and verified by byte-idempotency:

- WITH / CTEs: each common table expression renders as `name AS ( <body> )` with the body at depth+1 and the closing paren at 8*(depth+1); subsequent CTEs continue on the closing-paren line (`), y AS (`).
- Multi-line CASE targets: rendered as blocks at column 8+8*depth (WHEN/ELSE at 12+8*depth, END back at 8); a CASE as the first target forces SELECT onto its own line. Probing showed DISTINCT does not change continuation indent; the block indent comes from the CASE itself.
- Comma-separated FROM items continue at 4+8*depth with commas, distinct from JOIN steps at 5+8*depth.

Fixtures regenerated from the (now authoritative) generate.sh: recent_cte and distinct_case promoted out of _deferred, plus new two_cte, case_plain and case_first. Scalar subqueries embedded in expressions remain deferred (their layout is output-column-relative); the sub fixture stays in _deferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
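The depth arithmetic above can be written down as a handful of helpers. This is a sketch, not the code in formatter/pgdump.rs: the function names are hypothetical, the numbers are read as counts of leading spaces, and aligning only the first word of a multi-word keyword (the "ORDER" in "ORDER BY") is an assumption.

```rust
/// Columns added per nesting level, as stated in the commit message.
const STEP: usize = 8;

/// Right-align the first word of a clause keyword so it ends at column
/// 7 + 8*depth. Aligning on the first word only is an assumption.
pub fn clause_line(keyword: &str, rest: &str, depth: usize) -> String {
    let first_word_len = keyword.split(' ').next().map_or(0, str::len);
    let end_col = 7 + STEP * depth;
    let pad = end_col.saturating_sub(first_word_len);
    format!("{}{} {}", " ".repeat(pad), keyword, rest)
}

/// Leading spaces for continuation targets and comma-separated FROM items.
pub fn target_indent(depth: usize) -> usize {
    4 + STEP * depth
}

/// Leading spaces for JOIN steps.
pub fn join_indent(depth: usize) -> usize {
    5 + STEP * depth
}

/// Leading spaces for a multi-line CASE block and for its WHEN/ELSE arms.
pub fn case_indent(depth: usize) -> (usize, usize) {
    (8 + STEP * depth, 12 + STEP * depth)
}

/// Leading spaces for the parenthesis that closes a CTE body.
pub fn cte_close_indent(depth: usize) -> usize {
    STEP * (depth + 1)
}
```

Under these assumptions `clause_line("FROM", "t", 0)` yields three spaces followed by `FROM t`, and the same call at depth 1 shifts the keyword right by eight columns.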
Handle IN (SELECT …), EXISTS (…) and scalar subqueries inside WHERE / HAVING / target expressions, the last deferred view construct. ruleutils lays these out relative to the output column: the subquery's SELECT sits inline right after the open paren, while its own clauses align at the deeper (depth+1) river column.

render_expr_text() walks an expression, reproduces everything outside a subquery verbatim (whitespace collapsed, boundary spaces preserved), and for each embedded select_with_parens renders the body at depth+1 with its first line de-indented, splicing it back in. With no embedded subquery it is just the prior collapse_ws path, so flat expressions are unchanged.

Fixtures (regenerated from generate.sh): sub promoted out of _deferred, plus new sub_exists and sub_scalar. The _deferred directory is now empty and removed. Remaining gaps: subqueries in the FROM list (derived tables), in GROUP BY / ORDER BY items, or inside CASE arms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
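A rough sketch of the collapse-and-splice idea described above, under simplifying assumptions: the functions below are illustrative stand-ins rather than the real render_expr_text() or collapse_ws, the subquery body is assumed to be rendered already at depth+1, and string literals are not protected from whitespace collapsing, which a real renderer would need to handle.

```rust
/// Collapse every run of whitespace to a single space. A run at either
/// boundary also becomes one space, so boundary spacing survives.
/// Simplification: string literals are not treated specially.
pub fn collapse_ws(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut in_ws = false;
    for ch in text.chars() {
        if ch.is_whitespace() {
            if !in_ws {
                out.push(' ');
            }
            in_ws = true;
        } else {
            out.push(ch);
            in_ws = false;
        }
    }
    out
}

/// Splice an already-rendered subquery body between the text before its
/// open paren (`prefix`) and the text after its close paren (`suffix`).
/// The body's first line loses its indentation so SELECT sits right
/// after "( "; later lines keep their depth+1 indentation.
pub fn splice_subquery(prefix: &str, rendered_body: &str, suffix: &str) -> String {
    format!(
        "{}( {}){}",
        collapse_ws(prefix),
        rendered_body.trim(),
        collapse_ws(suffix)
    )
}
```

With no subquery present only `collapse_ws` runs, which mirrors the statement that flat expressions are unchanged by this commit.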
Close the remaining common view gaps in the ruleutils renderer:

- LIMIT / OFFSET: previously collected but never emitted (silently dropped). ruleutils emits OFFSET before LIMIT, both at the SELECT keyword's column.
- Clause ordering for set operations: ORDER BY / OFFSET / LIMIT apply to the whole UNION/INTERSECT/EXCEPT, so they now render after the right-hand select rather than before it. For a plain query the order is unchanged.
- FROM derived tables `( SELECT … ) alias` and subqueries on a JOIN's right side: generalized the subquery splicer to operate over an arbitrary byte range (splice_range), so render_table_ref and JOIN steps reuse the same inline-at-depth+1 rendering as WHERE/target expressions.

New fixtures (regenerated from generate.sh): lim, derived, derived_join, union_order. Remaining gaps: subqueries inside GROUP BY items or CASE arms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
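The ordering rules in the first two bullets can be illustrated at depth 0. `Tail` and `render_set_op` are hypothetical names for this sketch; the two-space indent before ORDER BY assumes its first word right-aligns to column 7 like the other clause keywords.

```rust
/// Trailing clauses that apply to the whole set operation.
pub struct Tail<'a> {
    pub order_by: Option<&'a str>,
    pub offset: Option<&'a str>,
    pub limit: Option<&'a str>,
}

/// Render `left <op> right` at depth 0, then the trailing clauses after
/// the right-hand select. `left` and `right` are already laid out.
pub fn render_set_op(left: &str, op: &str, right: &str, tail: &Tail<'_>) -> String {
    // The set-op keyword sits at column 1, i.e. with no indentation.
    let mut lines = vec![left.to_string(), op.to_string(), right.to_string()];
    if let Some(items) = tail.order_by {
        // Assumption: "ORDER" right-aligns to column 7, giving two spaces.
        lines.push(format!("  ORDER BY {}", items));
    }
    // OFFSET is emitted before LIMIT; both start at the SELECT keyword's
    // column, which is one leading space at depth 0.
    if let Some(n) = tail.offset {
        lines.push(format!(" OFFSET {}", n));
    }
    if let Some(n) = tail.limit {
        lines.push(format!(" LIMIT {}", n));
    }
    lines.join("\n")
}
```

Keeping the trailing clauses in a separate struct makes it hard to emit them before the right-hand select by accident, which is the bug this commit describes.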
A broad sweep of 20 diverse view shapes (FILTER, GROUPING SETS, window frames, array subscripts/slices, COALESCE/NULLIF, casts, simple-form CASE, LATERAL, DISTINCT ON, BETWEEN, ROW(), string_agg with ORDER BY, set-returning functions, nested subqueries) found one gap: a scalar subquery nested two levels deep renders as `(( SELECT … ))`, where the deparser nests one select_with_parens directly inside another.

render_select_with_parens looked for select_no_parens, missed it, and collapsed to one line. It now recurses through the redundant grouping paren so only the innermost subquery gets the `( SELECT` form and the depth increment; the other 19 shapes already round-tripped.

Locked five high-value shapes from the sweep as fixtures: nested_sub, lateral, window_frame, distinct_on, filter_agg (23 fixtures total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
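The recursion through redundant grouping parens might look roughly like this. The `Parens` enum is a hypothetical stand-in for the parse-tree nodes, and the innermost body is assumed to be rendered already; the real render_select_with_parens works on the parser's own node types.

```rust
/// Hypothetical model of a select_with_parens node.
pub enum Parens {
    /// A redundant grouping paren wrapping another parenthesized select.
    Group(Box<Parens>),
    /// The innermost select, holding its already-rendered body.
    Select(String),
}

/// Returns the rendered text and the number of depth levels consumed.
/// Only the innermost select gets the "( SELECT" form and counts as a
/// nesting level; grouping parens just add a plain "(" and ")".
pub fn render_parens(node: &Parens) -> (String, usize) {
    match node {
        Parens::Group(inner) => {
            let (text, levels) = render_parens(inner);
            (format!("({})", text), levels)
        }
        Parens::Select(body) => (format!("( {})", body.trim()), 1),
    }
}
```

However many grouping parens wrap the select, the depth increment stays at one, which matches the fix described above.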
A function-variety sweep (DEFAULT args, OUT params, VARIADIC, RETURNS TABLE/SETOF, behavior attributes, SET, SQL-standard RETURN body) found three gaps in the CREATE FUNCTION renderer:

- RETURNS TABLE(...) / SETOF were dropped: the return type is no longer a func_return node. Render the whole RETURNS span (signature end → option list start) uniformly, which covers scalar, SETOF and TABLE forms.
- Behavior attributes (volatility, LEAKPROOF, STRICT, SECURITY, PARALLEL, WINDOW) were emitted one per line; pg_get_functiondef groups them on a single line. Group consecutive such opt-items, keeping LANGUAGE / SET / COST / ROWS / SUPPORT / AS on their own lines.
- SQL-standard `RETURN expr` bodies (LANGUAGE sql without AS) were dropped: emit opt_routine_body at column 0.

Eight function shapes locked as fixtures (func_default, func_out, func_variadic, func_table, func_setof, func_behavior, func_set, func_return); 31 fixtures total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
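The attribute grouping in the second bullet could be sketched as follows. `is_behavior` and `group_options` are hypothetical helpers operating on plain strings rather than parse nodes, and the exact keyword spellings in `HEADS` are an assumption based on the attribute families named above.

```rust
/// True when an option item belongs to one of the behavior-attribute
/// families. The keyword list is an assumption for this sketch.
fn is_behavior(item: &str) -> bool {
    const HEADS: [&str; 9] = [
        "IMMUTABLE", "STABLE", "VOLATILE", "LEAKPROOF", "NOT LEAKPROOF",
        "STRICT", "SECURITY", "PARALLEL", "WINDOW",
    ];
    let upper = item.trim().to_ascii_uppercase();
    HEADS
        .iter()
        .any(|head| upper == *head || upper.starts_with(&format!("{} ", head)))
}

/// Join consecutive behavior attributes onto one line; every other
/// option (LANGUAGE, SET, COST, ...) keeps a line of its own.
pub fn group_options(items: &[&str]) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut prev_behavior = false;
    for item in items {
        let behavior = is_behavior(item);
        if behavior && prev_behavior {
            // Extend the line started by the previous behavior attribute.
            if let Some(last) = lines.last_mut() {
                last.push(' ');
                last.push_str(item.trim());
            }
        } else {
            lines.push(item.trim().to_string());
        }
        prev_behavior = behavior;
    }
    lines
}
```

Only consecutive behavior items are merged, so an attribute that follows COST or SET starts a new line instead of being pulled back onto an earlier one.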
Summary

Adds a new `Style::PgDump` (`"pg_dump"` / `"pgdump"` / `"postgres"`) that reproduces the layout of PostgreSQL's `ruleutils.c` deparser, i.e. the output of `pg_get_viewdef` / `pg_get_functiondef`. Unlike the seven river/left-aligned styles, its correctness bar is byte-idempotency: feeding genuine deparser output through `format(sql, Style::PgDump)` returns it unchanged. This makes it usable for canonicalizing and diffing catalog-dumped DDL (the pglifecycle use case).

A dedicated renderer (`src/formatter/pgdump.rs`) re-imposes ruleutils' exact indentation while reproducing each expression's text faithfully (the deparser has already normalized casts, parens and spacing). It genuinely normalizes arbitrary equivalent SQL into that layout rather than merely echoing its input, so two differently formatted but equivalent queries converge.

Coverage

- Views: target lists, FROM with all JOIN forms, WHERE/GROUP BY/HAVING/ORDER BY, `LIMIT`/`OFFSET`, set operations (UNION/INTERSECT/EXCEPT) with correct trailing-clause ordering, CTEs (WITH, depth-nested), multi-line `CASE` blocks, comma-separated FROM, subqueries embedded in WHERE/HAVING/target expressions / FROM derived tables / JOIN right side (including nested grouping parens), `DISTINCT ON`, `FILTER`, window frames, `LATERAL`.
- Functions: `CREATE FUNCTION` signature, `RETURNS` scalar/`SETOF`/`TABLE(...)`, `DEFAULT`/`OUT`/`VARIADIC` args, grouped behavior attributes (volatility/`LEAKPROOF`/`STRICT`/`SECURITY`/`PARALLEL` on one line), `SET`, verbatim `AS` bodies (SQL and PL/pgSQL), and SQL-standard `RETURN expr` bodies.
- Layout rules (depth `d`, `STEP = 8`): `SELECT`/`WITH` keywords start at column `2 + 8d`; other clause keywords right-align to end at column `7 + 8d`; target/comma-FROM continuations indent `4 + 8d`; JOIN steps `5 + 8d`; CASE blocks at `8 + 8d`; CTE bodies render at `d+1`. Statements the renderer doesn't recognize fall back to verbatim source (still idempotent).

Testing

- `tests/pgdump_idempotency_test.rs` asserts every fixture round-trips byte-identically under `Style::PgDump`.
- Fixtures are genuine `pg_get_viewdef` / `pg_get_functiondef` captures, regenerable from a throwaway local cluster via `tests/fixtures/pg_dump/generate.sh` (`just gen-pgdump-fixtures`); committed so CI needs no PostgreSQL.
- `just check` (fmt + clippy `-D warnings` + all suites) passes; the existing 7 styles are unaffected.

Scope / follow-ups

First-class views + functions, per the agreed scope. Known niche gaps: subqueries nested inside GROUP BY items or CASE arms. Wiring pglifecycle to request `Style::PgDump`, and releasing libpgfmt/pgfmt with the new style, are intentionally out of scope here.

🤖 Generated with Claude Code