Kafka Connect: recover from metadata commit stall after Kafka cluster recreation#16360
Open
Schiewatir wants to merge 2 commits into
Open
Kafka Connect: recover from metadata commit stall after Kafka cluster recreation#16360Schiewatir wants to merge 2 commits into
Schiewatir wants to merge 2 commits into
Conversation
…ation
After a Kafka cluster is recreated the control topic resets to offset 0,
but the Iceberg snapshot still stores committed offsets from the old cluster
(e.g. {0:100}). Every DataWritten event on the new cluster carries a low
offset (< 100) and the deduplication filter in commitToTable() silently
discards all of them, causing the coordinator to log "nothing to commit"
indefinitely while parquet files accumulate without a snapshot.
Detect this scenario by comparing the coordinator's observed control topic
offsets against the stored committed offsets. When the current offsets are
lower, the control topic was likely reset; skip deduplication and store the
new (lower) baseline so subsequent commits behave correctly.
Fixes apache#15293
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
After a Kafka cluster is recreated (old cluster deleted, new one started), the Apache Iceberg Kafka Connect sink stops committing metadata. Parquet files accumulate in object storage but no snapshots are created, and the logs show
"Coordinator … found nothing to commit to table"on every commit cycle. (Reported in #15293.)Root cause: The control topic resets to offset 0 on the new cluster, but the Iceberg snapshot still stores committed offsets from the old cluster (e.g.
{"0":100}). The deduplication filter inCoordinator.commitToTable()rejects everyDataWrittenevent whose offset is below the stored baseline, so all data files are silently dropped.Fix
When a partition's current control-topic offset is lower than its stored committed offset, the committed offset for that partition is reset to the current offset minus one (so the next write is accepted). This is detected per-partition during
commitToTable(), so there is no behaviour change for the normal (non-reset) path.After the reset, the filter accepts the new events, a snapshot is committed, and its stored offset is reset to the new baseline — replication resumes.
Fixes #15293