
fix: preserve ANY type through schema round-trip#4995

Open
Ma77Ball wants to merge 2 commits into apache:main from Ma77Ball:fix/ArrowUtils


Ma77Ball commented May 9, 2026

What changes were proposed in this PR?

ArrowUtils.fromTexeraSchema now tags ANY attributes with texera_type=ANY metadata on the Arrow field, and toTexeraSchema reads that tag back. This mirrors the existing LARGE_BINARY mechanism. Without the tag, an ANY attribute silently became STRING after a round trip, because both types share the same Arrow representation (Utf8).
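For reviewers who want the idea in isolation, here is a minimal sketch of the tagging scheme in plain Python. It is not the ArrowUtils code: `ArrowField`, `to_arrow_field`, and `from_arrow_field` are hypothetical stand-ins for an Arrow field and the two conversion directions, and only STRING and ANY are modeled. The only details taken from this PR are the metadata key/value pair (texera_type=ANY) and the fact that both types map to Utf8.

```python
from dataclasses import dataclass, field
from enum import Enum


class AttributeType(Enum):
    STRING = "STRING"
    ANY = "ANY"


@dataclass
class ArrowField:
    """Hypothetical stand-in for an Arrow field: physical type plus metadata."""
    name: str
    arrow_type: str
    metadata: dict = field(default_factory=dict)


TYPE_KEY = "texera_type"  # metadata key named in this PR


def to_arrow_field(name: str, attr_type: AttributeType) -> ArrowField:
    """Schema -> Arrow direction: ANY and STRING both become utf8."""
    if attr_type is AttributeType.ANY:
        # Only ANY carries the tag, so it can be recovered later.
        return ArrowField(name, "utf8", {TYPE_KEY: "ANY"})
    if attr_type is AttributeType.STRING:
        return ArrowField(name, "utf8")
    raise ValueError(f"type not modeled in this sketch: {attr_type}")


def from_arrow_field(arrow_field: ArrowField) -> AttributeType:
    """Arrow -> schema direction: consult the tag before falling back to STRING."""
    if arrow_field.arrow_type != "utf8":
        raise ValueError(f"type not modeled in this sketch: {arrow_field.arrow_type}")
    if arrow_field.metadata.get(TYPE_KEY) == "ANY":
        return AttributeType.ANY
    return AttributeType.STRING


def round_trip(attr_type: AttributeType) -> AttributeType:
    return from_arrow_field(to_arrow_field("col", attr_type))
```

In this sketch, `round_trip(AttributeType.ANY)` returns `AttributeType.ANY`, while an untagged utf8 field still reads back as STRING, which is the behavior ANY had before the tag was added.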

Any related issues, documentation, or discussions?

Closes: #4762

How was this PR tested?

Updated ArrowUtilsSpec (in common/workflow-core): replaced the test that pinned the bug ("lose the ANY distinction") with one that asserts ANY is preserved through a round-trip, and added a test that the texera_type=ANY metadata is attached only to ANY fields. Ran the ArrowUtilsSpec suites in both WorkflowCore (27/27) and WorkflowOperator (14/14); all tests pass.

Was this PR authored or co-authored using generative AI tooling?

Co-Authored with Claude Opus 4.7 in compliance with ASF

Ma77Ball changed the title from "preserve ANY through schema round-trip via texera_type metadata" to "fix: preserve ANY type through schema round-trip" May 9, 2026

Ma77Ball commented May 9, 2026

/request-review @aicam

github-actions bot requested a review from aicam May 9, 2026 02:08


Development

Successfully merging this pull request may close these issues.

ArrowUtils round-trip loses AttributeType.ANY (becomes STRING)
