Add BigQuery platform with engine-only integration tests #774

Open
JunWang222 wants to merge 14 commits into apache:main from JunWang222:bigquery-engine-platform-only-test

@JunWang222

Summary

This PR adds the Wayang BigQuery platform implementation and engine-only integration tests.

What is included

  • Adds the wayang-bigquery platform module.
  • Adds BigQuery mappings/operators for:
    • TableSource
    • Filter
    • Projection
    • Join
    • GlobalReduce
    • ReduceBy
    • Sort
    • TableSink
  • Adds end-to-end BigQueryOperatorsIT coverage for each operator.
  • Adds JavaPlanBuilder integration tests that simulate user-facing API usage.
  • Keeps the integration tests engine-only by registering only BigQuery.plugin().
  • Makes the BigQuery tests self-contained by creating inline fixture tables in the test setup, without requiring an external dataset.

Validation

  • Rebased on latest apache/wayang:main.
  • Verified against real BigQuery (13 tests run, 0 failures, 0 errors, 0 skipped).

JunWang222 and others added 14 commits June 28, 2026 21:09
Run the whole Wayang plan, including the terminal sink, inside BigQuery:
register only BigQuery.plugin(), end every BigQueryOperatorsIT test in a
TableSink that compiles to CREATE TABLE `proj.ds.t` AS SELECT, and assert
results via plain JDBC after execute() returns. With no Java plugin
registered, the optimizer must push every operator down, so the sink
table's contents prove in-engine execution (BigQuery has no
system.runtime.queries table to inspect instead).
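
To make the sink shape described above concrete, here is a minimal sketch of how a filter and projection could be folded into a single CREATE TABLE AS SELECT string. It is not the JdbcExecutor code from this PR: the class `CtasSketch`, its methods, and the table and column names are all hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not the PR's JdbcExecutor implementation.
final class CtasSketch {

    private CtasSketch() {}

    /** Wraps a qualified table name in backticks, as in `proj.ds.t`. */
    static String quote(String qualifiedName) {
        return "`" + qualifiedName + "`";
    }

    /**
     * Composes one CREATE TABLE ... AS SELECT statement from a target table,
     * a projected column list, a source table, and an optional filter predicate.
     */
    static String composeCtas(String target,
                              List<String> columns,
                              String source,
                              Optional<String> predicate) {
        String select = "SELECT " + String.join(", ", columns)
                + " FROM " + quote(source);
        // The WHERE clause is appended only when a filter is present.
        String where = predicate.map(p -> " WHERE " + p).orElse("");
        return "CREATE TABLE " + quote(target) + " AS " + select + where;
    }

    public static void main(String[] args) {
        // Assumed fixture names; the real tests create their own inline tables.
        System.out.println(composeCtas(
                "proj.ds.t",
                List.of("id", "amount"),
                "proj.ds.orders",
                Optional.of("amount > 100")));
    }
}
```

Because the whole plan collapses into one statement, reading the target table back over JDBC after execution is enough to check that the work happened in the engine.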

- jdbc-template JdbcExecutor.executeSinkStage: use selectStartTask for
  multi-source joins and collect global-reduce/reduce-by/sort into the
  composed CREATE TABLE AS SELECT (ported from wayang-trino-only-test).
- BigQueryOperatorsIT: 13 engine-only tests (8 operator-level + 5
  JavaPlanBuilder); join Tuple2->Record handled by a test-only flatten
  mapping to BigQueryProjectionOperator; lookup key renamed region_name to
  avoid a duplicate column in the CTAS. DDL-only (free-tier safe).
- improvement.md: document the engine-only shape.
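
The duplicate-column concern mentioned for the join test can be checked mechanically before a CTAS is composed. The sketch below is an assumption about how such a check might look, not code from this PR: `DuplicateColumnCheck` and `collidingColumns` are hypothetical names, and the comparison is case-insensitive on the assumption that the engine treats column names that way.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of the wayang-bigquery module.
final class DuplicateColumnCheck {

    private DuplicateColumnCheck() {}

    /**
     * Returns the column names that appear on both sides of a join,
     * lower-cased, in the order they occur on the right side.
     */
    static Set<String> collidingColumns(List<String> left, List<String> right) {
        Set<String> seen = new LinkedHashSet<>();
        for (String column : left) {
            seen.add(column.toLowerCase(Locale.ROOT));
        }
        Set<String> collisions = new LinkedHashSet<>();
        for (String column : right) {
            String normalized = column.toLowerCase(Locale.ROOT);
            if (seen.contains(normalized)) {
                collisions.add(normalized);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Assumed column lists for two join inputs.
        System.out.println(collidingColumns(
                List.of("id", "name", "region_id"),
                List.of("region_id", "name")));
    }
}
```

A non-empty result indicates that one side needs a renamed key or an explicit alias before the joined output is written to a sink table.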

Verified against real BigQuery: Tests run: 13, Failures: 0, Errors: 0,
Skipped: 0; 13 CREATE TABLE AS SELECT executed in BigQuery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>