[HWORKS-2802 / -2807] Document partitioned_by parameter on feature group creation#585
Draft
jimdowling wants to merge 1 commit into
…tion

https://hopsworks.atlassian.net/browse/HWORKS-2802

Add a section to docs/user_guides/fs/feature_group/create.md describing the storage-engine-native partitioned_by parameter for Delta feature groups. Covers:

- Usage example with create_feature_group / get_or_create_feature_group.
- The CREATE TABLE … USING DELTA … GENERATED ALWAYS AS … contract: the storage layer derives the partition columns; the user's dataframe never carries them.
- Validation rules: mutual exclusion with partition_key, requires event_time.
- Partition pruning table: Delta auto-derives partition predicates from the GENERATED expressions for hierarchical specs (year / year+month / year+month+day / year+month+day+hour), so `fg.read(start_time=..., end_time=...)` and `fg.filter(fg.event_time >= ...)` prune at the partition level. Non-hierarchical specs (e.g. ["month"], ["year","week"]) are valid but skip the auto-derivation; only direct predicates on the grain columns prune. Recommend hierarchical specs.
- Online feature store behavior: derived columns live offline-only by default; online_partition_columns=true opts into online materialization. Until the onlinefs consumer filter ships, the backend rejects partitioned_by + online_enabled=true with the default online_partition_columns=false. Document both workarounds.
- Hudi: partitioned_by + HUDI is rejected at creation; Hudi support is tracked under a separate follow-up ticket.

Signed-off-by: Jim Dowling <jim@logicalclocks.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
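The validation rules and the hierarchical/non-hierarchical distinction listed above can be sketched in plain Python. This is only an illustration of the rules as the commit message states them, not the Hopsworks implementation: the function names, the `ALLOWED_GRAINS` set, and the error messages are hypothetical, and the real enum of allowed grains may differ.

```python
from typing import Optional, Sequence

# Hierarchical chain named in the commit message. A spec that is a prefix of
# this chain is one Delta can auto-derive partition predicates for.
HIERARCHY = ("year", "month", "day", "hour")

# Hypothetical enum: the four grains above plus "week", which the commit
# message uses in a non-hierarchical example. The real set may differ.
ALLOWED_GRAINS = frozenset(HIERARCHY) | {"week"}


def is_hierarchical(partitioned_by: Sequence[str]) -> bool:
    """Return True if the spec is a non-empty prefix of year/month/day/hour."""
    spec = tuple(partitioned_by)
    return 0 < len(spec) <= len(HIERARCHY) and spec == HIERARCHY[: len(spec)]


def validate_partitioned_by(
    partitioned_by: Sequence[str],
    partition_key: Optional[Sequence[str]] = None,
    event_time: Optional[str] = None,
) -> bool:
    """Apply the documented rules; return whether the spec is hierarchical."""
    if not partitioned_by:
        raise ValueError("partitioned_by must not be empty")
    unknown = [g for g in partitioned_by if g not in ALLOWED_GRAINS]
    if unknown:
        raise ValueError(f"unknown partition grains: {unknown}")
    if partition_key:
        raise ValueError("partitioned_by and partition_key are mutually exclusive")
    if not event_time:
        raise ValueError("partitioned_by requires event_time")
    return is_hierarchical(partitioned_by)
```

Under these assumptions, `validate_partitioned_by(["year", "month"], event_time="ts")` returns `True`, while `["month"]` passes validation but returns `False`, matching the "valid but skips auto-derivation" case.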
Summary
User-guide section documenting the new `partitioned_by` parameter on feature group creation. Lives under the existing partitioning area in `docs/user_guides/fs/feature_group/create.md`.

Covers:

- Usage example with `create_feature_group` / `get_or_create_feature_group`.
- `GENERATED ALWAYS AS` handles it server-side.
- Validation rules (mutual exclusion with `partition_key`, requires `event_time`, enum membership).
- `fg.read(start_time, end_time)` and `fg.filter(fg.event_time >= ...)` prune at the partition level for hierarchical `partitioned_by`. Non-hierarchical specs (`["month"]`, `["year","week"]`) are valid but skip auto-derivation.
- `online_partition_columns=true` opts into online materialization.
- `PartitionedByTransformer` + `CustomKeyGenerator`.

Pairs with:
JIRA: HWORKS-2802. Engineering walkthrough: Confluence page.
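To make the pruning claim concrete, here is a minimal Python sketch of how a time range maps onto hierarchical partitions. It mimics what the generated-column expressions compute from `event_time` and is not how Delta implements pruning; `derive_partition` and `partitions_in_range` are hypothetical helper names, and only the four hierarchical grains are handled.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def derive_partition(ts: datetime, spec: Sequence[str]) -> Tuple[int, ...]:
    """Partition values derived from an event_time for a hierarchical spec.

    Each grain ("year", "month", "day", "hour") is read straight off the
    datetime, standing in for the GENERATED ALWAYS AS expressions.
    """
    return tuple(getattr(ts, grain) for grain in spec)


def partitions_in_range(
    start: datetime, end: datetime, spec: Sequence[str]
) -> List[Tuple[int, ...]]:
    """List the distinct partitions that [start, end] touches, in time order.

    Steps hour by hour, since hour is the finest grain a spec can contain.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    seen = {}  # dict preserves insertion order and removes duplicates
    cursor = start.replace(minute=0, second=0, microsecond=0)
    while cursor <= end:
        seen.setdefault(derive_partition(cursor, spec), None)
        cursor += timedelta(hours=1)
    return list(seen)


if __name__ == "__main__":
    window = (datetime(2024, 1, 31, 22), datetime(2024, 2, 1, 1))
    # A read over this window only needs the partitions printed here.
    print(partitions_in_range(*window, ["year", "month", "day"]))
```

A range predicate on `event_time` can therefore be narrowed to a short list of partition values, which is the effect the summary describes for hierarchical specs.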
Test plan
- `npx markdownlint-cli2 docs/user_guides/fs/feature_group/create.md` clean.
- `uv run mkdocs build -s` clean (run after the SDK PR lands, since the API reference plugin pulls from `hopsworks-api` main).
- `mkdocs serve`.

🤖 Generated with Claude Code