backup: add Redis hash encoder#725
Conversation
First of the Phase 0a wide-column encoders (after the redis_string.go strings/HLL/TTL work in PR #713). Hashes follow a meta+per-field key shape on the live store (`!hs|meta|<len><userKey>` → `[Len(8)]`, `!hs|fld|<len><userKey><field>` → field bytes; see store/hash_helpers.go), so the encoder buffers each user key's fields in memory and flushes one `hashes/<encoded>.json` file per hash at Finalize. Output schema matches the design (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327): { "format_version": 1, "fields": {"name": "alice"}, "expire_at_ms": null } Cross-cutting decisions: - TTL is folded INLINE into expire_at_ms (not a separate sidecar) so a restorer can replay the hash in one shot. HandleTTL routes redisKindHash records into the hash state rather than the strings_ttl.jsonl path. - Field VALUES use the same UTF-8-or-base64 envelope introduced for the SQS message body in PR #714: valid UTF-8 → plain JSON string; non-UTF-8 → `{"base64":"..."}`. Field NAMES that are not valid UTF-8 are percent-encoded (EncodeSegment) for the JSON object key — JSON object keys must be strings. - Empty hashes (HLEN==0) still emit a file because their existence is observable to clients (HEXISTS, HLEN). Mismatched declared-vs-observed length surfaces as `redis_hash_length_mismatch` warning. - SHA-fallback user keys feed back through the existing recordIfFallback path so KEYMAP.jsonl still reverses non-reversible filenames. - Output is sorted by user key + field name for deterministic byte-identical dumps across runs (matches the round-9 sort pattern from #716 schemaToPublic). Tests cover: round-trip, empty hash, inline TTL routing, length mismatch warning, binary value envelope, malformed meta, truncated field key, SHA-fallback keymap.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements Redis hash backup support, including handlers for hash metadata and fields, and logic to flush accumulated state to JSON files. The implementation handles binary-safe field values and provides deterministic output. Review feedback identifies a potential panic on 32-bit architectures due to unsafe integer conversion and suggests refactoring the hashState helper to centralize key-kind tracking and reduce redundant string allocations.
| ukLen := int(binary.BigEndian.Uint32(b[:hashUserKeyLenSize])) //nolint:gosec // bounded by len(b) | ||
| if len(b) < hashUserKeyLenSize+ukLen { | ||
| return nil, false | ||
| } | ||
| return b[hashUserKeyLenSize : hashUserKeyLenSize+ukLen], true |
There was a problem hiding this comment.
On 32-bit architectures, converting a uint32 to int can result in a negative value if the high bit is set (e.g., for lengths > 2GB). This would bypass the subsequent length check and cause a panic when slicing b. Using uint64 for the comparison ensures safety across all architectures.
| ukLen := int(binary.BigEndian.Uint32(b[:hashUserKeyLenSize])) //nolint:gosec // bounded by len(b) | |
| if len(b) < hashUserKeyLenSize+ukLen { | |
| return nil, false | |
| } | |
| return b[hashUserKeyLenSize : hashUserKeyLenSize+ukLen], true | |
| ukLen := binary.BigEndian.Uint32(b[:hashUserKeyLenSize]) | |
| if uint64(len(b)) < uint64(hashUserKeyLenSize)+uint64(ukLen) { | |
| return nil, false | |
| } | |
| return b[hashUserKeyLenSize : hashUserKeyLenSize+int(ukLen)], true |
| st := r.hashState(userKey) | ||
| st.declaredLen = int64(binary.BigEndian.Uint64(value)) //nolint:gosec // signed int64 by design | ||
| st.metaSeen = true | ||
| r.kindByKey[string(userKey)] = redisKindHash | ||
| return nil |
There was a problem hiding this comment.
The update to r.kindByKey can be centralized within the hashState helper to avoid redundancy and ensure consistency across all hash-related handlers.
| st := r.hashState(userKey) | |
| st.declaredLen = int64(binary.BigEndian.Uint64(value)) //nolint:gosec // signed int64 by design | |
| st.metaSeen = true | |
| r.kindByKey[string(userKey)] = redisKindHash | |
| return nil | |
| st := r.hashState(userKey) | |
| st.declaredLen = int64(binary.BigEndian.Uint64(value)) //nolint:gosec // signed int64 by design | |
| st.metaSeen = true | |
| return nil |
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
| st := r.hashState(userKey) | ||
| st.fields[string(fieldName)] = bytes.Clone(value) | ||
| r.kindByKey[string(userKey)] = redisKindHash | ||
| return nil |
There was a problem hiding this comment.
The update to r.kindByKey is redundant here if centralized in hashState.
| st := r.hashState(userKey) | |
| st.fields[string(fieldName)] = bytes.Clone(value) | |
| r.kindByKey[string(userKey)] = redisKindHash | |
| return nil | |
| st := r.hashState(userKey) | |
| st.fields[string(fieldName)] = bytes.Clone(value) | |
| return nil |
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
| func (r *RedisDB) hashState(userKey []byte) *redisHashState { | ||
| if st, ok := r.hashes[string(userKey)]; ok { | ||
| return st | ||
| } | ||
| st := &redisHashState{fields: make(map[string][]byte)} | ||
| r.hashes[string(userKey)] = st | ||
| return st | ||
| } |
There was a problem hiding this comment.
Centralizing the kindByKey update here improves maintainability and avoids repeated map writes in the handlers. Additionally, caching the string conversion of userKey avoids redundant allocations when the state is not found.
| func (r *RedisDB) hashState(userKey []byte) *redisHashState { | |
| if st, ok := r.hashes[string(userKey)]; ok { | |
| return st | |
| } | |
| st := &redisHashState{fields: make(map[string][]byte)} | |
| r.hashes[string(userKey)] = st | |
| return st | |
| } | |
| func (r *RedisDB) hashState(userKey []byte) *redisHashState { | |
| ukStr := string(userKey) | |
| if st, ok := r.hashes[ukStr]; ok { | |
| return st | |
| } | |
| st := &redisHashState{fields: make(map[string][]byte)} | |
| r.hashes[ukStr] = st | |
| r.kindByKey[ukStr] = redisKindHash | |
| return st | |
| } |
References
- Centralize common complex logic into shared helper functions to avoid code duplication and improve maintainability.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72e2b75a61
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| jsonName := name | ||
| if !utf8.ValidString(name) { | ||
| jsonName = EncodeSegment([]byte(name)) | ||
| } |
There was a problem hiding this comment.
Disambiguate encoded and raw hash field names
This conversion only percent-encodes non-UTF8 field names while leaving UTF-8 names unchanged, so distinct Redis fields can collapse to the same JSON key (for example, a literal field name %FF and a binary field name 0xFF). In that case fields[jsonName] overwrites one value, silently dropping data from hashes that contain both forms; Redis hash field names are binary-safe, so this is a valid input shape.
Useful? React with 👍 / 👎.
| for _, name := range names { | ||
| jsonName := name | ||
| if !utf8.ValidString(name) { | ||
| jsonName = EncodeSegment([]byte(name)) |
There was a problem hiding this comment.
Avoid SHA-fallback when encoding hash field names
EncodeSegment is a filename encoder with a 240-byte limit and SHA fallback; applying it to non-UTF8 field names can emit an irreversible fallback token for long names. Because hash field names are stored only as JSON object keys and no KEYMAP is written for per-field names, restoring such hashes cannot reconstruct the original field-name bytes, causing silent key corruption for long binary field names.
Useful? React with 👍 / 👎.
…tion (PR #725, round 1) Codex round 12 raised two P1 findings on commit 72e2b75; Gemini raised one high and three mediums alongside. P1: Hash field name encoding via JSON object key was lossy. - Distinct fields could collapse to the same JSON key: a UTF-8 literal `%FF` and a single binary byte `0xFF` both ended up as `%FF` after the EncodeSegment-on-non-UTF-8 fallback, silently dropping data on hashes with both forms. - Long binary field names (>240 bytes) hit EncodeSegment's SHA fallback which is irreversible at this layer (no per-field KEYMAP), so restore could not reconstruct the original bytes. Fix: change `fields` to a JSON array of typed `{name, value}` records. Both name and value go through the same UTF-8-or-base64 envelope (marshalRedisBinaryValue, mirrors PR #714's SQS body), so arbitrary binary bytes round-trip without collisions and without SHA fallback. High (Gemini): parseUserKeyLenPrefix used `int(uint32)` which wraps to negative on 32-bit when the high bit is set, bypassing the bounds check and causing a slice panic. Compare in `uint64` space before converting. Medium (Gemini, ×3): centralise `kindByKey` registration in the hashState helper so meta/field/TTL paths agree on the kind in one place. Audited callers — HandleHashMeta, HandleHashField, and HandleTTL's `case redisKindHash:` arm — all reach hashState via the same userKey identity, and the TTL caller already knows the kind so the now-implicit kindByKey write is idempotent. No other callers per `grep -n "r\.hashState(" internal/backup/`. Tests: - TestRedisDB_HashRoundTripBasic updated to assert the array shape via new hashFieldArray / hashFieldByName helpers. - TestRedisDB_HashEmptyHashStillEmitsFile, _BinaryFieldValue updated likewise. - NEW TestRedisDB_HashBinaryFieldNameRoundTripsViaArray drives the exact Codex collision pair (UTF-8 `%FF` + binary 0xFF) and asserts both names survive distinctly with their expected types. Verified: linux race tests, GOOS=windows, GOOS=js wasm, lint — all clean.
|
@claude review Round 1 fixes for Codex P1 round 12 + Gemini high/medium on commit 72e2b75 (commit P1 — hash field name encoding was lossy: distinct Redis fields could collapse to the same JSON object key (UTF-8 literal High (Gemini) — 32-bit overflow in length parser: Medium (Gemini ×3) — kindByKey centralisation:
|
|
Claude finished @bootjp's task in 2m 54s —— View job Review — Round 1 fixes (commit
|
…' into feat/backup-phase0a-redis-hash
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4646b9f7f6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if len(fieldName) == 0 { | ||
| return cockroachdberr.Wrapf(ErrRedisInvalidHashKey, | ||
| "empty field name in key %q", key) |
There was a problem hiding this comment.
HandleHashField currently treats a zero-length field name as malformed, but Redis hash field names are binary-safe and may legitimately be empty. In this case a valid key like !hs|fld|<len><userKey> (with no trailing field bytes) will cause backup decoding to error out instead of emitting the hash, so backups can fail on real data created via HSET with an empty field name.
Useful? React with 👍 / 👎.
…' into feat/backup-phase0a-redis-hash
Codex P1 round 13 (commit 4646b9f): HandleHashField was rejecting zero-length field names as ErrRedisInvalidHashKey, but Redis hash field names are binary-safe and `HSET k "" v` is a valid command. The live store emits a key shaped exactly `!hs|fld|<len><userKey>` with no trailing field bytes for that case, so backup decoding would error out on real data created via HSET with an empty field name. Remove the explicit zero-length check; the array-of-records JSON shape introduced in round 1 already handles empty names without collision (an entry like {"name":"", "value":"..."} sorts and round-trips distinctly). Caller audit per loop's audit-semantics rule (fail-closed → fail-open behavior change for empty fieldName): - HandleHashField has only test-code callers; no production pipeline consumes it yet (per `grep -n "HandleHashField" internal/backup/`). - The truncated-key test (TestRedisDB_HashRejectsTruncatedFieldKey) drives a malformed binary key whose parseHashFieldKey returns ok=false; that branch still surfaces ErrRedisInvalidHashKey, independent of the empty-name check we removed. Test: TestRedisDB_HashAcceptsEmptyFieldName drives one empty- name field and one named field, asserts both survive distinct records in the output JSON.
|
@claude review Round 2 fix for Codex P1 round 13 on commit 4646b9f (commit
Removed the explicit zero-length check; the array-of-records JSON shape introduced in round 1 already handles empty names without collision (an entry like Caller audit per loop's audit-semantics rule (fail-closed → fail-open for empty fieldName):
Test: |
|
Claude finished @bootjp's task in 1m 29s —— View job Review — Round 2 fix (commit
|
…' into feat/backup-phase0a-redis-hash
…' into feat/backup-phase0a-redis-hash
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6abf62f547
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // parseHashMetaKey strips !hs|meta| and the 4-byte BE userKeyLen prefix. | ||
| // Returns (userKey, true) on success. | ||
| func parseHashMetaKey(key []byte) ([]byte, bool) { | ||
| rest := bytes.TrimPrefix(key, []byte(RedisHashMetaPrefix)) |
There was a problem hiding this comment.
Exclude hash meta-delta keys from meta parsing
parseHashMetaKey trims RedisHashMetaPrefix, which also matches !hs|meta|d|... delta keys, so those records are treated as base-meta candidates and then fail parsing as malformed keys. Delta keys are a normal part of live Redis hash state (the codebase even documents that delta prefixes share the meta prefix in adapter/redis_retry.go), so when snapshot dispatch sends hash-prefixed records to this encoder, valid snapshots containing unresolved deltas can abort with ErrRedisInvalidHashKey instead of being skipped or handled.
Useful? React with 👍 / 👎.
Codex P1 round 14 (commit 6abf62f): The delta prefix `!hs|meta|d|` shares the base meta prefix `!hs|meta|` (the live store/hash_helpers.go documents the same nesting). A snapshot dispatcher that routes records by "starts with RedisHashMetaPrefix" lands delta records in HandleHashMeta, where parseHashMetaKey trimmed the meta prefix and tried to parse `d|<len>...` as `<len><userKey>` — failing with ErrRedisInvalidHashKey on real snapshots that contain unresolved deltas. Phase 0a's hash output emits an array of OBSERVED fields (`!hs|fld|...` records), so the delta arithmetic the live system uses to recompute Len is not needed at backup time. Skip delta keys silently in HandleHashMeta. Defense-in-depth: parseHashMetaKey also returns `!ok` for delta-prefixed keys so a future refactor that bypasses the HandleHashMeta short-circuit still surfaces a parse failure rather than silent state corruption. Caller audit (per loop's audit-semantics rule for fail-closed → fail-open behavior change): - HandleHashMeta has only test-code callers; no production pipeline consumes it yet. - parseHashMetaKey is called only from HandleHashMeta. The !ok return is mapped to ErrRedisInvalidHashKey in HandleHashMeta's existing path; the delta short-circuit runs first now so the parser path never sees a delta key in normal flow. Test: TestRedisDB_HashHandleHashMetaSkipsDeltaKey synthesises a realistic delta key and asserts both that HandleHashMeta returns nil and that parseHashMetaKey returns !ok directly.
|
@claude review Round 3 fix for Codex P1 round 14 on commit 6abf62f (commit The delta prefix Phase 0a's hash output emits an array of OBSERVED fields ( Caller audit per loop's rule (fail-closed → fail-open behavior):
Test:
|
|
Claude finished @bootjp's task in 1m 40s —— View job Review — Round 3 fix (commit
|
…lob layout
Codex flagged that the wide-column zset encoder skips the legacy
consolidated single-key blob layout the live store still writes.
A zset stored only as `!redis|zset|<userKey>` (with the magic-
prefixed pb.RedisZSetValue body) is silently dropped from backup
output and any inline TTL becomes an orphan — user-visible
sorted-set data loss.
Live-side references (adapter, not changed by this commit):
- adapter/redis_compat_types.go:82 — redisZSetPrefix
- adapter/redis_compat_commands.go:3495-3508 — writes the blob for
non-empty persisted zset updates
- adapter/redis_compat_helpers.go:610-631 — reads it as the
fallback when no wide-column members exist
Fix: new public RedisDB.HandleZSetLegacyBlob method that decodes
the magic-prefixed pb.RedisZSetValue and registers the same per-
member state HandleZSetMember would. The wide-column merge case
(mid-migration snapshot containing BOTH layouts for the same user
key) works because `!redis|zset|` sorts BEFORE `!zs|...` so the
blob arrives first and wide-column rows then update / add members
via the latest-wins map.
Inline TTL: `!redis|zset|<k>` sorts BEFORE `!redis|ttl|<k>`, so
HandleTTL after this handler sees redisKindZSet already and
folds via the case-redisKindZSet branch. No pendingTTL detour
needed for this ordering.
Fail-closed contract (matches existing wide-column path):
- Missing magic prefix → ErrRedisInvalidZSetLegacyBlob
- Unmarshal error → ErrRedisInvalidZSetLegacyBlob
- NaN score → ErrRedisInvalidZSetLegacyBlob (Redis ZADD
rejects NaN at wire level)
Caller audit (per /loop standing instruction): new public method
HandleZSetLegacyBlob has no external callers. Verified via
'grep -rn HandleZSetLegacyBlob --include=*.go' — all matches inside
the test file in this PR. The cmd/elastickv-snapshot-decode
dispatcher (Phase 0a follow-up, not yet built) will route the
`!redis|zset|` prefix to this handler.
Parallel bug class: the SAME issue exists for `!redis|hash|`,
`!redis|set|`, and `!redis|stream|` legacy blob prefixes. Those
encoders shipped in earlier PRs (#725, #758, #791). Each needs
its own legacy-blob handler in a follow-up PR; this commit fixes
only the zset case codex flagged on PR #790.
New tests:
- TestRedisDB_ZSetLegacyBlobRoundTrip — basic round-trip
- TestRedisDB_ZSetLegacyBlobThenWideColumnMerges — mid-migration
- TestRedisDB_ZSetLegacyBlobWithInlineTTL — TTL ordering
- TestRedisDB_ZSetLegacyBlobRejectsMissingMagic — fail-closed
- TestRedisDB_ZSetLegacyBlobRejectsNaNScore — fail-closed
- TestRedisDB_ZSetLegacyBlobRejectsMalformedKey — fail-closed
Self-review:
1. Data loss — exact opposite: this commit RECOVERS zsets that
were silently dropped. New fail-closed guards prevent silently
importing a corrupt blob.
2. Concurrency — no new shared state; per-DB sequential as before.
3. Performance — one protobuf Unmarshal per legacy zset key
(same as the live read path). Member map shares the same
latest-wins behavior as wide-column intake.
4. Consistency — merge order (blob first, wide-column second) is
determined by snapshot lex order; tested explicitly.
5. Coverage — 6 new tests. All 84 redis tests pass.
## Summary Adds the Redis sorted-set (zset) encoder for the Phase 0 logical snapshot decoder (`docs/design/2026_04_29_proposed_snapshot_logical_decoder.md`, lines 334-335). Mirrors the hash/list/set encoders shipped in #725/#755/#758. After this lands, Phase 0a's remaining Redis work is the stream encoder + HLL TTL sidecar; the per-adapter encoders themselves will be complete. Output shape per the design: ```json {"format_version": 1, "members": [{"member": "alice", "score": 100}], "expire_at_ms": null} ``` Members sorted by raw byte order (not by score) so `diff -r` between two dumps with the same logical contents stays line-stable across score-only mutations. Scores emit as JSON numbers for finite values, ZADD-conventional `"+inf"` / `"-inf"` strings for ±Inf (`json.Marshal` rejects non-finite floats). NaN scores fail closed at intake — Redis's ZADD rejects NaN at the wire level, so a NaN at backup time indicates corruption. Wire layout (mirrors `store/zset_helpers.go`): - `!zs|meta|<userKeyLen(4)><userKey>` → 8-byte BE Len - `!zs|mem|<userKeyLen(4)><userKey><member>` → 8-byte IEEE 754 score - `!zs|scr|<userKeyLen(4)><userKey><sortableScore(8)><member>` → silently discarded (secondary index; `!zs|mem|` is the source of truth) - `!zs|meta|d|<userKeyLen(4)><userKey><commitTS(8)><seqInTxn(4)>` → silently skipped (delta family; same policy as hash/set encoders) TTL routing: `!redis|ttl|<userKey>` for a registered zset folds into the JSON `expire_at_ms` field, matching the set/list/hash inlining, so a restorer replays ZADD + EXPIRE in one shot. ## Self-review (5 lenses) 1. **Data loss** — `!zs|mem|` is the source of truth; `!zs|scr|` and delta records are intentional silent skips with audit notes. Empty zsets (Len=0 but meta seen) still emit a file because `TYPE k → zset` is observable to clients. 2. **Concurrency / distributed** — `RedisDB` is sequential per scope (matches the existing per-DB encoder contract); no shared state. 3. **Performance** — per-zset state in a map, flushed once at Finalize. Bounded by `maxWideColumnItems` on the live side; sort is O(n log n) on member-name bytes. Identical cost shape to hash/set encoders. 4. **Data consistency** — JSON field order pinned via struct tags (not map). Inf score uses `json.RawMessage` so the same `score` key emits as either a number or a string. NaN fails closed. 5. **Test coverage** — 17 table-driven tests under `internal/backup/redis_zset_test.go`: - round-trip basic / empty / TTL inlining - binary member via base64 envelope - delta-key skip (both `HandleZSetMeta` entry + `parseZSetMetaKey` guard) - `HandleZSetMetaDelta` explicit entry - `HandleZSetScore` silent-discard - malformed meta length / overflow / MaxInt64 boundary - malformed member-value length / NaN rejection - ±Inf string-form serialization - members-without-meta still emits file - duplicate-members latest-wins (ZADD semantics) ## Caller audit (per `/loop` standing instruction) Semantics-changing edit: new `case redisKindZSet:` branch in `HandleTTL` (redis_string.go:310). Purely additive — the new branch fires only when `zsetState()` has previously registered the key. No existing handler maps to `redisKindZSet`, so no prior call site changes behavior. Verified: ``` grep -n 'redisKindZSet' internal/backup/ # internal/backup/redis_string.go:88 # internal/backup/redis_string.go:310 # internal/backup/redis_zset.go:166 ``` Three references, all new in this PR. No legacy caller assumes the prior behavior. ## Test plan - [x] `go test -race ./internal/backup/` → ok - [x] `golangci-lint run ./internal/backup/...` → 0 issues - [x] `go build ./...` → ok - [x] `go vet ./internal/backup/...` → ok <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added Redis sorted-set (ZSET) backup/restore with deterministic JSON output, proper ±Inf score serialization, and legacy-blob reconciliation. * TTLs for ZSETs and sets can be inlined so restored items retain expiry; pending TTLs are drained when a key is first typed. * Orphan-TTL buffering with configurable byte-cap and a fail-closed overflow behavior; option to disable buffering. * **Tests** * Extensive tests covering ZSET semantics, TTL buffering/drain, legacy-blob handling, byte-cap edge cases, and overflow behavior. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/bootjp/elastickv/pull/790?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
!hs|meta|+!hs|fld|records on the live store into onehashes/<encoded>.jsonfile per Redis hash, per the design atdocs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327.RedisDB, the SHA-fallback KEYMAP path, and the dirsCreated cache.expire_at_ms(not a sidecar) so a restorer can replay each hash in one shot.HandleTTLadds aredisKindHashroute.EncodeSegment) when not valid UTF-8 to keep JSON object keys as strings.meta.Lenand observed!hs|fld|keys surface asredis_hash_length_mismatchwarnings.Self-review (per CLAUDE.md "five lenses")
RedisDBis single-writer per the existing contract; the newhashesmap follows the same per-key state model used bykindByKey/inlineTTLEmitted.flushHashesruns once at Finalize. Sort isO(n log n)over user keys + per-hash field names — acceptable at dump cadence.kindByKeyregistration on both meta and field arrival prevents TTL mis-routing across snapshot record-ordering scenarios.Test plan
go test -race ./internal/backup/...— clean.GOOS=windows GOARCH=amd64 go build ./internal/backup/...— clean.GOOS=js GOARCH=wasm go build ./internal/backup/...— clean.golangci-lint run— clean.