added a macro that cleans stale test temp tables. Will be usefull for…#1022
Conversation
… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds adapter-dispatched stale-table discovery, refactors cleanup macros into current-invocation and parameterized cleanup, updates the system caller to the new entrypoint, and adds integration macros and a Python test that create and remove temporary test tables and report before/after counts. ChangesStale Test Table Cleanup Across Adapters
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
👋 @GuyEshdat |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 34-41: The macro cleanup_stale_test_tables currently accepts a
negative hours value which inverts the stale-window; add input validation at the
start of the macro (inside cleanup_stale_test_tables) to ensure the hours
parameter is non-negative: if hours is negative, raise/abort with a clear error
message (or coerce to 0) before calling elementary.get_stale_test_tables;
reference the macro name cleanup_stale_test_tables and the hours parameter so
reviewers can find and update the validation logic.
- Around line 39-54: The stale-table list returned by
elementary.get_stale_test_tables may contain raw strings (e.g.,
"database/schema/identifier") which break Athena DROP SQL; before calling
elementary.clean_elementary_test_tables(tables) normalize each item in tables
into a relation object with explicit fields (database, schema, identifier) or a
properly escaped/formatted relation string expected by the cleaner. Update the
code around the tables variable (result of elementary.get_stale_test_tables) to
detect string entries, parse them into {database, schema, identifier} (or
reformat/escape the path) and build a sanitized list, then pass that sanitized
list to elementary.clean_elementary_test_tables; ensure entries already in
proper structured form are left unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f3294b0-6259-4b4c-86f7-427d76f8a3c9
📒 Files selected for processing (5)
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sqlintegration_tests/tests/test_cleanup_stale_test_tables.pymacros/edr/system/system_utils/clean_elementary_temp_tables.sqlmacros/edr/tests/test_utils/clean_elementary_test_tables.sqlmacros/edr/tests/test_utils/get_stale_test_tables.sql
…test-fixes iteration, will revert this commit once this test passes for all warehouses
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/test-warehouse.yml:
- Around line 279-282: Remove the temporary test filter that scopes pytest to a
single test: delete the -k "test_cleanup_stale_test_tables" argument (and the
preceding TODO comment) from the py.test invocation (the line containing py.test
-n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE") so the workflow runs the full
test suite and restores CI coverage across warehouses and versions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 73979cab-7d7c-4328-868d-48a49b15f0c8
📒 Files selected for processing (1)
.github/workflows/test-warehouse.yml
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/get_stale_test_tables.sql`:
- Line 199: The SELECT currently builds a two-part name using the literal '{{
elementary_schema }}' and table_name; change it to produce a fully-qualified
three-part name using table_catalog (if present in v_catalog.tables) or the
current database name (e.g., current_database()), plus the table_schema column
(not the '{{ elementary_schema }}' literal), and table_name so the cleanup gets
database.schema.table; update the SELECT in get_stale_test_tables.sql to prefer
v_catalog.tables.table_catalog when available, fall back to querying the current
database, and concatenate table_catalog || '.' || table_schema || '.' ||
table_name.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 62ed937f-a49e-40f5-8036-de535a202fab
📒 Files selected for processing (3)
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sqlintegration_tests/tests/test_cleanup_stale_test_tables.pymacros/edr/tests/test_utils/get_stale_test_tables.sql
🚧 Files skipped from review as they are similar to previous changes (2)
- integration_tests/tests/test_cleanup_stale_test_tables.py
- integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql
…ables Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…le_schema stores full path Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…eration) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…er, restore -k CI flag Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ime filtering Warehouses without creation-time support (postgres, redshift, athena, trino, duckdb) are removed and skipped in tests — without a time filter the macro could delete temp tables actively in use by a concurrent dbt run. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Problem
When a pipeline runs many Elementary tests, Elementary creates temporary tables during test execution and cleans them up in
on_run_end. With a large number of tests (e.g. many sources monitored withelementary_source_schema_changes), this cleanup step can add significant time to every pipeline run.A solution is to disable the automatic cleanup and run it in a dedicated scheduled job instead:
Step 1:
clean_elementary_temp_tablesis an existing flag — set it tofalseindbt_project.yml:Step 2: In a separate scheduled job, run:
dbt run-operation elementary.cleanup_stale_test_tables --args '{"hours": 24}'What this PR adds
cleanup_stale_test_tables(hours, limit)— new operation that finds and drops temp test tables older than N hours, with an optional row limit (default 2000) to cap each runget_stale_test_tables— adapter-dispatched macro that queries the warehouse catalog for matching tables; implemented for Snowflake, BigQuery, Databricks, Fabric, SQL Server, Vertica, and ClickHouse (adapters without creation-time support are excluded to avoid accidentally deleting tables from active runs)clean_elementary_test_tablesto accept an explicit relation list, enabling reuse from bothon_run_endand the new cleanup operation