fix(bigquery): treat unsupported dataset region as non-retryable#66388

Open. Gilbert09 wants to merge 2 commits into master from posthog-code/bigquery-unsupported-region-non-retryable

Conversation

@Gilbert09

Problem

A BigQuery data-warehouse source is failing every retry with an uncaught BadRequest raised from job creation:

POST https://bigquery.googleapis.com/.../jobs?prettyPrint=false: Location <region> does not support this operation.

The location BigQuery runs the job in comes from either the source's custom-region field or the dataset's location auto-detected in connect(). If that value isn't a region BigQuery can run query jobs in, every job creation (POST .../jobs) fails identically. get_columns only catches Forbidden/NotFound/TypeError/RefreshError, so this BadRequest propagates uncaught. No key in get_non_retryable_errors matched it either, so the sync retried indefinitely and kept reporting the same issue to error tracking.
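To make the propagation path concrete, here is a minimal sketch of it. The exception classes are local stand-ins for the real google-api-core / google-auth types, `create_query_job` is a hypothetical helper, and the empty-list fallback in the `except` branch is an assumption for illustration only; the real get_columns may handle those cases differently.

```python
# Local stand-ins for the real exception types (assumption: the real ones
# come from google-api-core and google-auth; they are not imported here).
class Forbidden(Exception): ...
class NotFound(Exception): ...
class RefreshError(Exception): ...
class BadRequest(Exception): ...


def create_query_job(location: str) -> list[str]:
    """Hypothetical job creation that always rejects the location."""
    raise BadRequest(f"Location {location} does not support this operation.")


def get_columns(location: str) -> list[str]:
    """Simplified shape of the handler: only four exception types are caught."""
    try:
        return create_query_job(location)
    except (Forbidden, NotFound, TypeError, RefreshError):
        # Fallback chosen for the sketch only.
        return []


def propagates_bad_request(location: str) -> bool:
    """Return True if BadRequest escapes get_columns for this location."""
    try:
        get_columns(location)
    except BadRequest:
        return True
    return False
```

Because BadRequest is not in the caught tuple, it reaches the caller, which is where the retry decision is made.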

Surfaced by PostHog error tracking (issue 019f04b0-4137-7ed1-bd98-be3c2ba4ade7), exception type BadRequest, originating in products/warehouse_sources/backend/temporal/data_imports/sources/bigquery/bigquery.py.

Changes

Add "does not support this operation" to BigQuerySource.get_non_retryable_errors() with an actionable message asking the user to set a valid BigQuery region. This is a deterministic config error — the same location always fails the same way, so there is nothing to recover by retrying. The existing "was not found in location" key only covers a dataset missing from a queried region, not an unsupported region, so this case slipped through.

The match is on the stable wording only; the volatile location code is deliberately excluded so it can't cause false positives.
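As a rough illustration of the substring approach, the sketch below assumes the non-retryable errors are held as a mapping from a stable substring to a user-facing message. The `NON_RETRYABLE_ERRORS` name, the `match_non_retryable` helper, and both message strings are placeholders for this example, not the actual contents of BigQuerySource.get_non_retryable_errors().

```python
from typing import Optional

# Hypothetical mapping: stable substring -> user-facing message.
# The message texts below are placeholders, not the real wording.
NON_RETRYABLE_ERRORS: dict[str, Optional[str]] = {
    "was not found in location": None,
    "does not support this operation": (
        "The configured BigQuery region cannot run query jobs. "
        "Please set a valid BigQuery region for this source."
    ),
}


def match_non_retryable(error_message: str) -> Optional[str]:
    """Return the first stable substring found in the error, or None."""
    for needle in NON_RETRYABLE_ERRORS:
        if needle in error_message:
            return needle
    return None
```

Since the key contains no location code, the same entry matches whichever region appears in the error, and an unrelated transient error matches nothing.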

How did you test this code?

I'm an agent (Claude Code). I added a parametrized regression test (test_bigquery_unsupported_region_is_non_retryable) asserting the unsupported-region BadRequest is recognised as non-retryable across several location values, and that the match never depends on the volatile location code. No existing test covered the unsupported-region wording — the closest, was not found in location, is a different condition. The existing transient-error guard tests still confirm the new substring doesn't make any transient error non-retryable.

Automated tests I ran (all green):

  • test_source.py -k "non_retryable or unsupported_region or transient" → 35 passed
  • full test_source.py → 91 passed
  • ruff check + ruff format --check → clean
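The regression test described above could look roughly like the following. This is a sketch, not the test in the PR: it uses the standard library's unittest with subTest instead of pytest parametrization to stay self-contained, `is_non_retryable` is a hypothetical stand-in for the real matching logic, and the location values are placeholders rather than real region codes.

```python
import unittest

STABLE_SUBSTRING = "does not support this operation"


def is_non_retryable(error_message: str) -> bool:
    """Hypothetical stand-in for the source's non-retryable check."""
    return STABLE_SUBSTRING in error_message


class TestUnsupportedRegion(unittest.TestCase):
    def test_unsupported_region_is_non_retryable(self) -> None:
        # Placeholder locations; the match must hold for every one of them.
        for location in ("placeholder-region-1", "placeholder-region-2", "PLACEHOLDER"):
            with self.subTest(location=location):
                message = f"Location {location} does not support this operation."
                self.assertTrue(is_non_retryable(message))
                # The matched key must not embed the volatile location code.
                self.assertNotIn(location, STABLE_SUBSTRING)

    def test_transient_error_stays_retryable(self) -> None:
        self.assertFalse(is_non_retryable("503 Service Unavailable: backend error"))


def run_suite() -> unittest.TestResult:
    """Run both tests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestUnsupportedRegion)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The second test mirrors the transient-error guard: a new substring should never turn a retryable failure into a non-retryable one.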

🤖 Agent context

Autonomy: Fully autonomous

Triaged a live BigQuery error-tracking issue. Read the stack and confirmed the failure genuinely originates in this source's bigquery.py job-creation path rather than only appearing in serialized log context. Decided this is an upstream/config error (the region is supplied by the customer's config or their dataset metadata; retrying can't fix an unsupported location) rather than a fixable bug, so extended NonRetryableErrors following the established pattern (mirrors the Stripe get_non_retryable_errors approach and the existing BigQuery region/billing/quota entries). Matched on a stable, non-sensitive substring and kept the customer's region value out of the code, tests, and this description. Checked open PRs (including the maintainer's) from several angles — no existing PR addresses this error.

A BigQuery query job created in a location that can't run jobs fails with a
400 BadRequest ("Location <x> does not support this operation."). The location
comes from the source's custom-region field or the dataset's auto-detected
location, so it's a deterministic config error — retrying always fails the same
way. Match the stable "does not support this operation" wording so the sync
stops instead of retrying forever, and surface an actionable message telling the
user to set a valid region.

Generated-By: PostHog Code
Task-Id: e2ebe548-1d8d-407d-9706-4e50be717dbb

Gilbert09 added the stamphog label ("Request AI approval (no full review)") on Jun 26, 2026

@github-actions

Hey @Gilbert09! 👋

It looks like your git author email on this PR isn't your @posthog.com address (owerstom@gmail.com). Since you're on the PostHog team, it's worth pointing your local git author email at your @posthog.com address. Why it matters:

  • Consistent work identity in git history — internal tooling that attributes commits to team members keys off your @posthog.com address.
  • Keeps team contributions easy to tell apart from external community ones when scanning history.

You can fix it for this repo with:

git config user.email "you@posthog.com"

Or set it globally with git config --global user.email "you@posthog.com". No need to redo this PR — just a nudge for next time. 🙂

assign-reviewers-posthog Bot requested a review from a team on June 26, 2026 at 16:14

@github-actions github-actions Bot left a comment

Clean, low-risk fix that correctly classifies an unsupported BigQuery region as a non-retryable configuration error, backed by a solid parametrized test. No production risk, no API contract changes, and author is on the owning team.

@greptile-apps

greptile-apps Bot commented Jun 26, 2026

Reviews (1): Last reviewed commit: "fix(bigquery): treat unsupported dataset..."

@tests-posthog

tests-posthog Bot commented Jun 26, 2026

Query snapshots: Backend query snapshots updated

Changes: 1 snapshot (1 modified, 0 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

@stamphog

stamphog Bot commented Jun 26, 2026

Retaining stamphog approval — delta since last review classified as trivial_paths.
