Skip to content

added a macro that cleans stale test temp tables. Will be usefull for…#1022

Merged
GuyEshdat merged 14 commits into
masterfrom
delete-stale-test-tables
Jun 9, 2026
Merged

added a macro that cleans stale test temp tables. Will be usefull for…#1022
GuyEshdat merged 14 commits into
masterfrom
delete-stale-test-tables

Conversation

@GuyEshdat

@GuyEshdat GuyEshdat commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Problem

When a pipeline runs many Elementary tests, Elementary creates temporary tables during test execution and cleans them up in on_run_end. With a large number of tests (e.g. many sources monitored with elementary_source_schema_changes), this cleanup step can add significant time to every pipeline run.

A solution is to disable the automatic cleanup and run it in a dedicated scheduled job instead:

Step 1: clean_elementary_temp_tables is an existing flag — set it to false in dbt_project.yml:

vars:
  clean_elementary_temp_tables: false

Step 2: In a separate scheduled job, run:

dbt run-operation elementary.cleanup_stale_test_tables --args '{"hours": 24}'

What this PR adds

  • cleanup_stale_test_tables(hours, limit) — new operation that finds and drops temp test tables older than N hours, with an optional row limit (default 2000) to cap each run
  • get_stale_test_tables — adapter-dispatched macro that queries the warehouse catalog for matching tables; implemented for Snowflake, BigQuery, Databricks, Fabric, SQL Server, Vertica, and ClickHouse (adapters without creation-time support are excluded to avoid accidentally deleting tables from active runs)
  • Refactored clean_elementary_test_tables to accept an explicit relation list, enabling reuse from both on_run_end and the new cleanup operation

… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds adapter-dispatched stale-table discovery, refactors cleanup macros into current-invocation and parameterized cleanup, updates the system caller to the new entrypoint, and adds integration macros and a Python test that create and remove temporary test tables and report before/after counts.

Changes

Stale Test Table Cleanup Across Adapters

Layer / File(s) Summary
Adapter-specific stale table discovery
macros/edr/tests/test_utils/get_stale_test_tables.sql
New get_stale_test_tables adapter-dispatch macro plus per-adapter implementations; some adapters apply time-based filtering (Snowflake, BigQuery, Databricks, ClickHouse, SQL Server, Vertica), others log warnings and return matches without age filtering.
Cleanup macro refactoring and stale-table cleanup
macros/edr/tests/test_utils/clean_elementary_test_tables.sql
Adds clean_current_invocation_test_tables(), changes clean_elementary_test_tables(test_table_relations) to perform cleanup of given relations, and adds cleanup_stale_test_tables(hours=24) which discovers stale tables and calls the parameterized cleanup.
Update system cleanup caller
macros/edr/system/system_utils/clean_elementary_temp_tables.sql
clean_elementary_temp_tables() now calls elementary.clean_current_invocation_test_tables() instead of the parameterless cleanup.
Integration tests for stale cleanup
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql, integration_tests/tests/test_cleanup_stale_test_tables.py
Macro test_cleanup_stale_test_tables creates two temp tables, calls cleanup_stale_test_tables(hours=0), and returns counts; Python integration test runs the macro, parses JSON, asserts >=2 tables before and 0 after cleanup.
CI test filter tweak
.github/workflows/test-warehouse.yml
Pytest invocation adjusted to include -k "test_cleanup_stale_test_tables" to run only the new cleanup test in that workflow step.

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Suggested reviewers:
    • haritamar

🐰 I hopped through schemas late and bright,

Nosed out the test tables hidden from sight,
With tiny paws I tipped each stale heap,
Sent them away so the warehouses sleep,
Now clean fields gleam beneath morning light.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title is incomplete and vague, ending with 'Will be usefull for…' without specifying the actual purpose or use case. Complete the title to clearly convey the full intent, e.g., 'Add macro to clean stale test temp tables for deferred cleanup' or 'Add cleanup_stale_test_tables macro for dedicated cleanup jobs'.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch delete-stale-test-tables

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

👋 @GuyEshdat
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 34-41: The macro cleanup_stale_test_tables currently accepts a
negative hours value which inverts the stale-window; add input validation at the
start of the macro (inside cleanup_stale_test_tables) to ensure the hours
parameter is non-negative: if hours is negative, raise/abort with a clear error
message (or coerce to 0) before calling elementary.get_stale_test_tables;
reference the macro name cleanup_stale_test_tables and the hours parameter so
reviewers can find and update the validation logic.
- Around line 39-54: The stale-table list returned by
elementary.get_stale_test_tables may contain raw strings (e.g.,
"database/schema/identifier") which break Athena DROP SQL; before calling
elementary.clean_elementary_test_tables(tables) normalize each item in tables
into a relation object with explicit fields (database, schema, identifier) or a
properly escaped/formatted relation string expected by the cleaner. Update the
code around the tables variable (result of elementary.get_stale_test_tables) to
detect string entries, parse them into {database, schema, identifier} (or
reformat/escape the path) and build a sanitized list, then pass that sanitized
list to elementary.clean_elementary_test_tables; ensure entries already in
proper structured form are left unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f3294b0-6259-4b4c-86f7-427d76f8a3c9

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc104a and 6d92e04.

📒 Files selected for processing (5)
  • integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql
  • integration_tests/tests/test_cleanup_stale_test_tables.py
  • macros/edr/system/system_utils/clean_elementary_temp_tables.sql
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql
  • macros/edr/tests/test_utils/get_stale_test_tables.sql

Comment thread macros/edr/tests/test_utils/clean_elementary_test_tables.sql Outdated
Comment thread macros/edr/tests/test_utils/clean_elementary_test_tables.sql
GuyEshdat added 2 commits June 4, 2026 20:35
…test-fixes iteration, will revert this commit once this test passes for all warehouses

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-warehouse.yml:
- Around line 279-282: Remove the temporary test filter that scopes pytest to a
single test: delete the -k "test_cleanup_stale_test_tables" argument (and the
preceding TODO comment) from the py.test invocation (the line containing py.test
-n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE") so the workflow runs the full
test suite and restores CI coverage across warehouses and versions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 73979cab-7d7c-4328-868d-48a49b15f0c8

📥 Commits

Reviewing files that changed from the base of the PR and between fb10ced and cae3659.

📒 Files selected for processing (1)
  • .github/workflows/test-warehouse.yml

Comment thread .github/workflows/test-warehouse.yml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/tests/test_utils/get_stale_test_tables.sql`:
- Line 199: The SELECT currently builds a two-part name using the literal '{{
elementary_schema }}' and table_name; change it to produce a fully-qualified
three-part name using table_catalog (if present in v_catalog.tables) or the
current database name (e.g., current_database()), plus the table_schema column
(not the '{{ elementary_schema }}' literal), and table_name so the cleanup gets
database.schema.table; update the SELECT in get_stale_test_tables.sql to prefer
v_catalog.tables.table_catalog when available, fall back to querying the current
database, and concatenate table_catalog || '.' || table_schema || '.' ||
table_name.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 62ed937f-a49e-40f5-8036-de535a202fab

📥 Commits

Reviewing files that changed from the base of the PR and between cae3659 and 6675187.

📒 Files selected for processing (3)
  • integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql
  • integration_tests/tests/test_cleanup_stale_test_tables.py
  • macros/edr/tests/test_utils/get_stale_test_tables.sql
🚧 Files skipped from review as they are similar to previous changes (2)
  • integration_tests/tests/test_cleanup_stale_test_tables.py
  • integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql

Comment thread macros/edr/tests/test_utils/get_stale_test_tables.sql Outdated
GuyEshdat and others added 10 commits June 7, 2026 14:18
…ables

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…le_schema stores full path

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…eration)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…er, restore -k CI flag

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ime filtering

Warehouses without creation-time support (postgres, redshift, athena, trino, duckdb)
are removed and skipped in tests — without a time filter the macro could delete
temp tables actively in use by a concurrent dbt run.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@GuyEshdat GuyEshdat merged commit 0786485 into master Jun 9, 2026
47 of 50 checks passed
@GuyEshdat GuyEshdat deleted the delete-stale-test-tables branch June 9, 2026 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants