fix(bigquery): treat unsupported dataset region as non-retryable #66388
Open
Gilbert09 wants to merge 2 commits into
A BigQuery query job created in a location that can't run jobs fails with a
400 BadRequest ("Location <x> does not support this operation."). The location
comes from the source's custom-region field or the dataset's auto-detected
location, so it's a deterministic config error — retrying always fails the same
way. Match the stable "does not support this operation" wording so the sync
stops instead of retrying forever, and surface an actionable message telling the
user to set a valid region.
Generated-By: PostHog Code
Task-Id: e2ebe548-1d8d-407d-9706-4e50be717dbb
Problem
A BigQuery data-warehouse source is failing every retry with an uncaught `BadRequest` raised from job creation ("Location <x> does not support this operation.").

The location BigQuery runs the job in comes from either the source's custom-region field or the dataset's location auto-detected in `connect()`. If that value isn't a region BigQuery can run query jobs in, every job creation (`POST .../jobs`) fails identically. `get_columns` only catches `Forbidden`/`NotFound`/`TypeError`/`RefreshError`, so this `BadRequest` propagates uncaught. No key in `get_non_retryable_errors` matched it either, so the sync retried forever and kept spamming error tracking with the same issue.

Surfaced by PostHog error tracking (issue 019f04b0-4137-7ed1-bd98-be3c2ba4ade7), exception type `BadRequest`, originating in `products/warehouse_sources/backend/temporal/data_imports/sources/bigquery/bigquery.py`.

Changes
Add `"does not support this operation"` to `BigQuerySource.get_non_retryable_errors()` with an actionable message asking the user to set a valid BigQuery region. This is a deterministic config error: the same location always fails the same way, so there is nothing to recover by retrying. The existing `"was not found in location"` key only covers a dataset missing from a queried region, not an unsupported region, so this case slipped through.

The match is on the stable wording only; the volatile location code is deliberately excluded so it can't cause false positives.
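
The substring matching described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `NON_RETRYABLE_ERRORS`, `find_non_retryable_message`, and both user-facing messages are hypothetical stand-ins, and the real return shape of `get_non_retryable_errors()` is assumed to be a mapping from substring to message.

```python
from typing import Optional

# Hypothetical stand-in for the source's non-retryable error table.
# Keys are stable substrings of the upstream error text; values are
# illustrative user-facing messages (not the wording used in the PR).
NON_RETRYABLE_ERRORS = {
    # Existing key named in the PR description.
    "was not found in location": "The dataset was not found in the configured region.",
    # New key: stable wording only, with no location code in it.
    "does not support this operation": (
        "The configured BigQuery region cannot run query jobs. "
        "Set a valid BigQuery region on the source."
    ),
}


def find_non_retryable_message(error_text: str) -> Optional[str]:
    """Return a user-facing message if the error should not be retried."""
    for needle, friendly_message in NON_RETRYABLE_ERRORS.items():
        if needle in error_text:
            return friendly_message
    # No key matched: treat the error as potentially transient.
    return None
```

Because the key omits the location code, the same entry matches whatever region appears in the error, while an unrelated error such as a 503 falls through and stays retryable.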
How did you test this code?
I'm an agent (Claude Code). I added a parametrized regression test (`test_bigquery_unsupported_region_is_non_retryable`) asserting the unsupported-region `BadRequest` is recognised as non-retryable across several location values, and that the match never depends on the volatile location code. No existing test covered the unsupported-region wording; the closest, `was not found in location`, is a different condition. The existing transient-error guard tests still confirm the new substring doesn't make any transient error non-retryable.

Automated tests I ran (all green):
- `test_source.py -k "non_retryable or unsupported_region or transient"` → 35 passed
- `test_source.py` → 91 passed
- `ruff check` + `ruff format --check` → clean

🤖 Agent context
Autonomy: Fully autonomous
Triaged a live BigQuery error-tracking issue. Read the stack and confirmed the failure genuinely originates in this source's `bigquery.py` job-creation path rather than only appearing in serialized log context. Decided this is an upstream/config error (the region is supplied by the customer's config or their dataset metadata; retrying can't fix an unsupported location) rather than a fixable bug, so extended `NonRetryableErrors` following the established pattern (mirrors the Stripe `get_non_retryable_errors` approach and the existing BigQuery region/billing/quota entries). Matched on a stable, non-sensitive substring and kept the customer's region value out of the code, tests, and this description. Checked open PRs (including the maintainer's) from several angles; no existing PR addresses this error.
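
For readers who want a feel for the regression test described under "How did you test this code?", here is a rough, dependency-free sketch. It is not the test added in this PR: `is_non_retryable`, the placeholder location names, and the sample transient errors are all assumptions, and a plain loop stands in for the parametrization the real test uses.

```python
# Stable wording the PR matches on; it contains no location code.
NEEDLE = "does not support this operation"


def is_non_retryable(error_text: str) -> bool:
    """Hypothetical helper: True if the error text contains the stable wording."""
    return NEEDLE in error_text


def check_unsupported_region_is_non_retryable() -> None:
    # Placeholder location values; none of them is a real customer region.
    for location in ["region-a", "region-b", "REGION_C"]:
        message = f"400 Location {location} does not support this operation."
        # The error is recognised whatever location it names.
        assert is_non_retryable(message)
        # The match never depends on the volatile location code.
        assert location not in NEEDLE

    # Transient-looking errors must stay retryable.
    for transient in ["503 Service Unavailable", "Connection reset by peer"]:
        assert not is_non_retryable(transient)


check_unsupported_region_is_non_retryable()
```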