Skip to content

fix: remaining adversarial-review findings (M4/M5/M6/M7, S2/S3, A4, F2, P2, AU6/AU7, R8, U6)#69

Open
russellromney wants to merge 7 commits into
mainfrom
fix/adversarial-remaining
Open

fix: remaining adversarial-review findings (M4/M5/M6/M7, S2/S3, A4, F2, P2, AU6/AU7, R8, U6)#69
russellromney wants to merge 7 commits into
mainfrom
fix/adversarial-remaining

Conversation

@russellromney

@russellromney russellromney commented Jun 26, 2026

Copy link
Copy Markdown
Owner

Works through the remaining tractable findings from ADVERSARIAL_REVIEW_2026-06-25.md, after #62#65 merged. Audited each against current main first (the report's line refs were two main-versions stale); all below were still open. Themed commits; full cargo test + clippy -D warnings + fmt green, with regression tests for the key fixes.

Fixed

  • M4 — order "newer never loses" by the commit's history depth (generation, from git rev-list --count) instead of the builder's wall clock, so cross-host clock skew can't reorder. Falls back to synced_at for refs without it. (Known, documented limit: a force-push rewind to an older commit lags until the next forward push.) (tests)
  • M5 — read cache served a ref that lost the durable ordering check for the TTL window; invalidate on save. (test)
  • P2HashingWriter re-hashed the remainder on a short write → bad pack trailer for >256 KiB-compressing blobs. (test)
  • S2 [High] — local retention could delete the only copy of a still-referenced artifact; now protects the ref-reachable set each run (walk shared with remote GC). (test)
  • A4 — a push during an in-flight (claimed) build was coalesced and dropped; coalesce only onto queued jobs, allow a queued job behind a claimed one. (test)
  • M6 — MySQL VARCHAR keys silently truncated/collided; reject over-long keys.
  • M7 — networked queue + per-host file metadata silently lost refs; warn loudly at startup (not refuse — single-host shared-filesystem farm-out is valid).
  • S3 — GC continued past a manifest read error (could delete live objects); fail the pass closed.
  • F2 — unbounded no-dictionary decompress; cap at the frame's raw length.
  • U6 — racy exists()-then-create_dir; use the atomic form.
  • AU6 — artifact redirect ignored private TTL; use the visibility-aware TTL.
  • R8 — HTTP client builder silently dropped auth headers on build error; fail fast.
  • AU7 — token file briefly world-readable + swallowed chmod error; create 0o600 up front, surface errors.

Deferred (with reasons)

  • U3 (fsync durability — opt-in, multi-site), U4 (io_uring batch chunking), U5 (close-after-submit race), U7 (dead-code cleanup), P6 (empty-repo clone), AU8 (configurable auth header) — the next mechanical batch; the io_uring ones are Linux-only (Docker-typechecked).
  • S4 (download range/resume), P4 (sha256 object format) — features.
  • R4 + R5–R12 (perf) — want before/after benchmarks (like perf(client/extract): cut hot-path copies + a per-file lock (R1/R2/R3) #65).
  • P3 (head-delta cold-base fallback), P5 (LSM ancestry) — deep build-path logic whose force-push/GC failure scenarios can't be reproduced here to verify.
  • The report's "areas needing deeper review" — investigations.

🤖 Generated with Claude Code

russellromney and others added 7 commits June 26, 2026 15:40
- The read cache wrote every ref into the cache even when the durable store
  skipped it as older, so a write that lost the ordering check could still be
  served from cache for the TTL window. Invalidate the cache entry on save
  instead; the next load reads the kept value through.
- The pack HashingWriter hashed the whole buffer before a possibly-short inner
  write, so write_all's retry re-hashed the unwritten remainder — a wrong pack
  trailer that index-pack rejects for blobs whose compressed form spans a short
  write. Hash only the bytes the inner writer accepted.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md M5, P2

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eject mismatch (S2, S3, M7)

- Local retention only protected a best-effort side list, and on a local-only
  backend the durability check is a no-op — so a stale/incomplete list could
  delete a still-referenced artifact (its only copy). Retention now also
  protects the set the live refs actually point at, computed each run from the
  ref store (the reachable walk is shared with remote GC).
- Remote GC continued past a manifest read error, leaving that manifest's
  chunks out of the reachable set — a brief failure could delete live objects.
  Fail the whole pass instead.
- A network queue paired with a per-host file metadata store silently lost
  refs (server and workers each read their own files). Refuse that combination
  at startup with a clear message.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md S2, S3, M7

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…A4, M6)

- A push that arrived while the prior build for the same key was already
  claimed (and had already fetched) was coalesced onto that in-flight build, so
  the newer commit wasn't built until the next push. Coalescing now targets
  only queued jobs, and the unique index is relaxed to one queued job per key
  so a claimed build and a fresh queued one can coexist — the newer commit gets
  its own job and builds next.
- MySQL key columns are VARCHAR (the composite PK can't be TEXT); an over-long
  repo/branch/queue key would silently truncate and collide. Reject it with a
  clear error instead.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md A4, M6

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (F2, U6)

- The no-dictionary frame decompress used unbounded `decode_all`, so a corrupt
  or hostile frame could expand without limit before the length check. Cap it
  at the frame's declared raw length, like the dictionary branch.
- The directory-creation helper probed with `exists()` before `create_dir`,
  which races concurrent producers (one wins, the rest hit EEXIST). Use the
  atomic create-then-tolerate-AlreadyExists form.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md F2, U6

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…0 token (AU6, R8, AU7)

- The direct-artifact redirect signed URLs with a flat 15-minute TTL, ignoring
  repo visibility. Use the same shorter TTL for private repos as the ref path.
- The HTTP client builder fell back to a header-less client on a build error,
  silently dropping the auth headers so every request went out unauthenticated.
  Fail fast instead.
- The token file was written with the default umask then chmod'd 0o600 (briefly
  world-readable) and the chmod error was ignored. Create it 0o600 up front and
  surface a permission error.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md AU6, R8, AU7

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (M7)

Refusing to start broke valid single-host farm-out (a networked queue with the
server and workers sharing one filesystem and its file metadata). We can't tell
shared-filesystem from multi-host here, so warn loudly instead of bailing — the
goal is to end the silent footgun, not block a working setup.

Refs: ADVERSARIAL_REVIEW_2026-06-25.md M7

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
"A newer sync never loses" ordered by synced_at — the builder's per-host
wall clock — so cross-host clock skew or two writes in the same second could
pick the wrong winner. Order by the commit's history depth instead (its
`generation`, from `git rev-list --count`): recency follows the commit's place
in git history, which is the same for every builder, so skew can't reorder it.

- RefInfo carries `generation`; the build computes it once via `git rev-list
  --count` (the commit-graph we write makes it cheap), set wherever synced_at is.
- The compare (shared `should_replace_ref` + every SQL `save_ordered`) now: same
  commit wins; else higher-or-equal generation wins; else (a side without a
  generation) fall back to synced_at. `refs` tables gain a `generation` column
  (with a best-effort migration); the comparison stays a single atomic upsert.
- synced_at is kept as the fallback for refs/repos without a generation.

Known limitation (documented at the compare): a force-push that rewinds a
branch to an older commit has a lower generation, so it lags until the next
forward push (a same-commit re-sync still updates). That is the cost of ordering
by history rather than by observation time.

Tests: higher generation wins over a newer wall clock (file store + the SQL
lifecycle across sqlite/pg/mysql).

Refs: ADVERSARIAL_REVIEW_2026-06-25.md M4

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@russellromney russellromney changed the title fix: remaining adversarial-review findings (M5, P2, S2, A4, M6, M7, S3, F2, AU6, AU7, R8, U6) fix: remaining adversarial-review findings (M4/M5/M6/M7, S2/S3, A4, F2, P2, AU6/AU7, R8, U6) Jun 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant