feat: add file_clone (reflink/CoW) mode for the local disk cache #2739

Open
Skory wants to merge 1 commit into mozilla:main from Skory:file-clone-reflink-cache

Skory commented Jun 16, 2026

Summary

Adds an opt-in file_clone mode to sccache's local disk cache. Instead of storing
entries compressed (zip + zstd) and decompressing a fresh, fully-allocated copy on every
cache hit, file_clone stores entries uncompressed and restores them with filesystem
reflinks (copy-on-write): FICLONE on Linux (Btrfs/XFS/ZFS/…), clonefile on macOS
(APFS), block-cloning on Windows (ReFS), with a transparent copy fallback on non-CoW
filesystems.
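
The clone-then-fall-back behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `clone_or_copy` and `Restore` are hypothetical names, the Linux branch issues the FICLONE ioctl directly (request value 0x40049409, which assumes the generic ioctl encoding used on x86-64 and aarch64), and every other platform simply copies, whereas the PR description says non-Linux platforms go through the reflink-copy crate.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// How a file reached its destination (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Restore {
    Reflinked,
    Copied,
}

/// Linux: try a whole-file clone first, then fall back to a byte copy.
#[cfg(target_os = "linux")]
fn clone_or_copy(src: &Path, dst: &Path) -> io::Result<Restore> {
    use std::fs::File;
    use std::os::raw::{c_int, c_ulong};
    use std::os::unix::io::AsRawFd;

    unsafe extern "C" {
        // Declared by hand so the sketch needs no external crate.
        fn ioctl(fd: c_int, request: c_ulong, ...) -> c_int;
    }
    // FICLONE = _IOW(0x94, 9, int) from <linux/fs.h>; assumes the generic
    // ioctl number encoding (x86-64, aarch64).
    const FICLONE: c_ulong = 0x4004_9409;

    let src_file = File::open(src)?;
    let dst_file = File::create(dst)?;
    // SAFETY: both descriptors are open and owned by this function.
    let rc = unsafe { ioctl(dst_file.as_raw_fd(), FICLONE, src_file.as_raw_fd()) };
    if rc == 0 {
        return Ok(Restore::Reflinked);
    }
    // Any failure (non-CoW filesystem, cross-device, ...) falls back to a copy.
    drop(dst_file);
    fs::copy(src, dst)?;
    Ok(Restore::Copied)
}

/// Other platforms: this sketch only copies (assumption, see lead-in).
#[cfg(not(target_os = "linux"))]
fn clone_or_copy(src: &Path, dst: &Path) -> io::Result<Restore> {
    fs::copy(src, dst)?;
    Ok(Restore::Copied)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();
    let src = dir.join(format!("clone-demo-src-{}", std::process::id()));
    let dst = dir.join(format!("clone-demo-dst-{}", std::process::id()));
    fs::write(&src, b"cached object bytes")?;

    // Which variant is printed depends on the filesystem backing the temp dir.
    let outcome = clone_or_copy(&src, &dst)?;
    println!("{:?}", outcome);

    fs::remove_file(&src)?;
    fs::remove_file(&dst)?;
    Ok(())
}
```

Returning the outcome rather than hiding the fallback is what would let a caller feed counters such as objects_reflinked / objects_copied_fallback.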

This is opt-in and off by default; the existing compressed cache path is byte-for-byte
unchanged.

This is a fresh implementation of the idea in #2640 (credit @quake) with the open gaps
closed and some performance optimisations. Refs #1053, #1174.

Motivation

On a cache hit the disk cache decompresses a freshly-allocated copy of every artifact
into the build tree. When the same outputs are restored over and over, that physical
duplication adds up:

  • Frequent cargo clean / branch-switch rebuilds re-extract the same artifacts again
    and again (the original ask in #1053, "Support reflinks & hardlinks", and #1174,
    "Disk cache and clean rebuild speed").
  • Many concurrent agents / developers working in parallel git worktrees. This is
    increasingly common: each worktree builds the same dependencies and restores the same
    outputs from the cache, so the identical bytes get written to disk once per worktree.
    N worktrees means N full copies of the same artifacts. On multi-worktree machines this
    is the dominant source of disk pressure.

With file_clone, restored files share underlying storage blocks with the cache entry
on a CoW filesystem, so:

  • each additional worktree restoring the same artifacts costs ~0 marginal disk for the
    cached bytes (blocks are shared, not duplicated);
  • warm restores issue far fewer physical writes (better SSD endurance);
  • there is no decompression on the hit path.

The trade-off is a larger, uncompressed cache (compression is given up to enable block
sharing). Because it is opt-in, nobody pays that cost unless they ask for it.

What's included

  • Config: file_clone under [cache.disk] and SCCACHE_FILE_CLONE (default false).
  • Reflink-on-write via a new Storage::put_objects (the default impl preserves today's
    zip+zstd behaviour for every backend; DiskCache overrides it to reflink the compiler's
    original outputs straight into the cache — no compress-then-decompress round trip).
  • Linux fd-level FICLONE fast path; non-Linux uses the reflink-copy crate, with an
    in-place-overwrite fallback for read-only destination dirs.
  • Uncompressed directory entries in LruDiskCache (gated behind a flag — the default
    cache and the preprocessor cache keep the existing single-file path unchanged), with
    atomic staging + rename, an out-of-band mode manifest, and 0600 cache files / 0700
    dirs.
  • New --show-stats counters: objects_reflinked / objects_copied_fallback, so the
    reflink-vs-copy ratio is visible.
  • Per-device memo so an unsupported destination filesystem warns once instead of retrying.
  • Docs (docs/FileClone.md, docs/Local.md, docs/Configuration.md), a manual multi-repo
    benchmark (scripts/bench-file-clone.sh, not wired into CI; uses compsize on btrfs to
    measure actual on-disk block sharing), plus integration, system, and unit tests.
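
As a rough illustration of the "atomic staging + rename" and 0600 items in the list above, a single-file version might look like this. It is a simplified sketch under stated assumptions: `write_atomically` and the `.staging.<pid>` suffix are hypothetical, and the PR stages whole directory entries rather than one file.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `final_path` via a staging file in the same directory,
/// so readers see either the old entry or the complete new one.
fn write_atomically(final_path: &Path, bytes: &[u8]) -> io::Result<()> {
    let name = final_path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "no file name"))?;
    // Stage next to the target so the rename never crosses filesystems.
    let mut staging_name = name.to_os_string();
    staging_name.push(format!(".staging.{}", std::process::id()));
    let staging = final_path.with_file_name(staging_name);

    let mut opts = OpenOptions::new();
    opts.write(true).create_new(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        opts.mode(0o600); // owner read/write only, still subject to the umask
    }

    // Write and flush the staging file; it is closed when the closure returns.
    let written = opts.open(&staging).and_then(|mut file| {
        file.write_all(bytes)?;
        file.sync_all()
    });
    match written.and_then(|()| fs::rename(&staging, final_path)) {
        Ok(()) => Ok(()),
        Err(err) => {
            // Best-effort cleanup of the partial staging file.
            let _ = fs::remove_file(&staging);
            Err(err)
        }
    }
}

fn main() -> io::Result<()> {
    let target = std::env::temp_dir().join(format!("entry-demo-{}", std::process::id()));
    write_atomically(&target, b"uncompressed cache entry")?;
    println!("wrote {} bytes", fs::metadata(&target)?.len());
    fs::remove_file(&target)
}
```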

Benchmarks

Two environments. Absolute times are not comparable across the two tables (different
machines); compare only within each table. In both, file_clone is comparable in build
time to the compressed cache and the win is disk sharing, not speed.

Linux / btrfs

On-disk usage measured with compsize (the authoritative btrfs tool; it counts
shared/reflinked extents once). restore marginal disk = compsize disk of (cache +
restored tree) − compsize disk of the cache alone, i.e. the new disk a warm restore
actually consumes. It is ~0 only when the restored artifacts reflink the cache, and it
cancels out btrfs transparent compression (which affects both terms equally).

| target | cold | warm (compressed) | warm (file_clone) | compressed cache | file_clone cache | restored (logical) | cache+restore on disk | restore marginal disk | reflink/copy |
|---|---|---|---|---|---|---|---|---|---|
| local-c | 2.33s | 0.23s | 0.22s | 2.3M | 8.0M | 6.7M | 2.0M | 0M | 120/0 |
| ripgrep | 6.77s | 4.76s | 4.33s | 28.7M | 108.7M | 340.8M | 148.5M | 115.6M | 75/0 |
| fd | 38.51s | 15.54s | 14.58s | 88.9M | 347.9M | 342.6M | 160.5M | 58.2M | 434/0 |
| bat | 12.35s | 10.63s | 7.81s | 140.6M | 502.3M | 903.6M | 309.7M | 151.6M | 758/0 |

The same restore marginal disk measured against the compressed cache is much higher —
local-c 1.9M, ripgrep 148.5M, fd 116.2M, bat 306.6M — because a compressed restore writes
fresh, unshared blocks. The gap is the disk the reflink sharing saves on every restore
(e.g. bat 151.6M vs 306.6M ≈ 155M saved per restore; local-c, a pure-compile workload
where every object is cached, shares 100% → 0 marginal). reflink/copy = N/0 confirms
every cached object was reflinked. (For the cargo targets the non-zero marginal is mostly
the freshly linked binary and cargo's incremental/fingerprint files, which sccache does not
cache.)
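
The per-restore saving quoted above is just the difference between the two marginal-disk figures. A small sketch that recomputes it for all four btrfs targets, using only the numbers stated in this section (stored in tenths of an M to avoid floating-point rounding; `saved_tenths` is a hypothetical helper name):

```rust
/// (target, marginal disk restoring from the compressed cache,
///  marginal disk restoring with file_clone), in tenths of an "M"
/// exactly as reported in the btrfs results above.
const BTRFS_MARGINAL: [(&str, u32, u32); 4] = [
    ("local-c", 19, 0),
    ("ripgrep", 1485, 1156),
    ("fd", 1162, 582),
    ("bat", 3066, 1516),
];

/// Disk saved by one warm restore, in tenths of an M.
fn saved_tenths(compressed: u32, file_clone: u32) -> u32 {
    compressed - file_clone
}

fn main() {
    for (target, compressed, file_clone) in BTRFS_MARGINAL {
        let saved = saved_tenths(compressed, file_clone);
        println!("{target}: {}.{}M saved per restore", saved / 10, saved % 10);
    }
}
```

For bat this gives 155.0M, matching the figure in the paragraph above; ripgrep and fd come out at 32.9M and 58.0M.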

macOS / APFS

APFS has no compsize equivalent, so sharing is measured with the volume free-space delta
(df) during the warm restore — clone Δdisk well below zip Δdisk means blocks were
shared. Δdisk is noisy and includes non-cached build outputs; the reflink/copy counter is
the authoritative proof.

| target | cold | warm (zip) | warm (clone) | zip cache | clone cache | restored | reflink/copy | clone Δdisk (vs zip) |
|---|---|---|---|---|---|---|---|---|
| local-c | 6.82s | 4.20s | 4.17s | 0.6M | 1.2M | 2.3M | 150/0 | ~0.0M (zip ~0.0M) |
| fd | 4.13s | 2.53s | 3.06s | 34.9M | 105.4M | 162.8M | 172/0 | ~35.8M (zip ~173.0M) |
| ripgrep | 5.99s | 3.22s | 3.06s | 26.5M | 83.6M | 252.0M | 75/0 | ~138.2M (zip ~232.4M) |
| bat | 15.04s | 8.38s | 12.34s | 140.4M | 406.4M | 631.0M | 761/0 | ~183.0M (zip ~561.0M) |

Takeaways

  • CoW engaged everywhere: reflink/copy = N/0 on every target on both filesystems —
    100% of cached objects restored via reflink, zero copy fallbacks.
  • Build time ≈ neutral: within run-to-run noise vs the compressed cache (sometimes a
    touch faster from skipping decompression, sometimes a touch slower from per-file clone
    syscalls). This is not a build-speed feature.
  • Disk is the win: a warm restore allocates far fewer new blocks because artifacts share
    storage with the cache (btrfs restore marginal disk: bat 151.6M vs 306.6M for the
    compressed cache, local-c 0M; APFS clone Δdisk: fd 35.8M vs 173.0M, bat 183.0M vs
    561.0M). Multiplied across many worktrees this is the headline benefit.
  • Cost: the uncompressed cache is ~2–4× larger than the compressed cache (compression
    traded for block sharing).
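
The ~2–4× cost figure can be checked against the cache sizes reported in the two benchmark tables. The sketch below recomputes the ratio per target; sizes are in tenths of an M as reported, and `ratio_x100` is a hypothetical helper that returns the ratio multiplied by 100 (integer division, so it truncates).

```rust
/// (environment/target, compressed cache size, file_clone cache size),
/// in tenths of an "M", copied from the benchmark tables above.
const CACHE_SIZES: [(&str, u32, u32); 8] = [
    ("btrfs/local-c", 23, 80),
    ("btrfs/ripgrep", 287, 1087),
    ("btrfs/fd", 889, 3479),
    ("btrfs/bat", 1406, 5023),
    ("apfs/local-c", 6, 12),
    ("apfs/fd", 349, 1054),
    ("apfs/ripgrep", 265, 836),
    ("apfs/bat", 1404, 4064),
];

/// Uncompressed-to-compressed size ratio, times 100 (truncated).
fn ratio_x100(compressed: u32, uncompressed: u32) -> u32 {
    uncompressed * 100 / compressed
}

fn main() {
    for (name, compressed, uncompressed) in CACHE_SIZES {
        let r = ratio_x100(compressed, uncompressed);
        println!("{name}: {}.{:02}x larger", r / 100, r % 100);
    }
}
```

Every target lands between 2.00× (apfs/local-c) and 3.91× (btrfs/fd), consistent with the range stated above.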

Notes / requirements

  • The disk/space win only materialises when SCCACHE_DIR and the build tree are on the
    same CoW filesystem; otherwise reflink transparently falls back to a full copy (and
    the objects_copied_fallback stat goes up).
  • Reflinks are independent CoW copies, not hardlinks — editing a restored file does not
    touch the cache entry.
  • Default behaviour (compressed cache) is unchanged; this only affects users who opt in.
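
One quick way to sanity-check the first requirement before opting in is to compare the device numbers of the cache directory and the build tree. This is only a heuristic sketch, not something the PR ships: `same_device` is a hypothetical helper, a matching device number does not prove the filesystem supports CoW, and some layouts (btrfs subvolumes, for example) may report different device numbers for paths on the same filesystem. The objects_reflinked / objects_copied_fallback counters remain the reliable signal.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// True when both paths report the same device number (Unix only).
#[cfg(unix)]
fn same_device(a: &Path, b: &Path) -> io::Result<bool> {
    use std::os::unix::fs::MetadataExt;
    Ok(fs::metadata(a)?.dev() == fs::metadata(b)?.dev())
}

/// Other platforms: this sketch has no equivalent check.
#[cfg(not(unix))]
fn same_device(_a: &Path, _b: &Path) -> io::Result<bool> {
    Err(io::Error::new(io::ErrorKind::Unsupported, "device check is Unix-only"))
}

fn main() {
    // Falls back to the current directory when SCCACHE_DIR is not set,
    // purely so the example runs anywhere.
    let cache_dir = std::env::var("SCCACHE_DIR").unwrap_or_else(|_| ".".to_string());
    let build_tree = ".";

    match same_device(Path::new(&cache_dir), Path::new(build_tree)) {
        Ok(true) => println!("same device: reflinks are at least possible"),
        Ok(false) => println!("different devices: expect copy fallbacks"),
        Err(err) => println!("could not compare devices: {err}"),
    }
}
```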

@Skory Skory force-pushed the file-clone-reflink-cache branch 2 times, most recently from 7bc864f to 7109e48 Compare June 17, 2026 09:28
@Skory Skory marked this pull request as ready for review June 17, 2026 11:53

codecov-commenter commented Jun 18, 2026

Codecov Report

❌ Patch coverage is 93.95002% with 138 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.62%. Comparing base (d8a93ad) to head (f59d7d8).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/lru_disk_cache/mod.rs | 92.13% | 42 Missing ⚠️ |
| src/cache/disk.rs | 94.68% | 36 Missing ⚠️ |
| src/cache/cache_io.rs | 95.62% | 18 Missing ⚠️ |
| src/reflink.rs | 94.23% | 17 Missing ⚠️ |
| src/cache/multilevel_test.rs | 82.08% | 12 Missing ⚠️ |
| src/compiler/compiler.rs | 95.07% | 7 Missing ⚠️ |
| src/config.rs | 95.31% | 3 Missing ⚠️ |
| tests/system.rs | 94.59% | 2 Missing ⚠️ |
| src/server.rs | 93.75% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2739      +/-   ##
==========================================
+ Coverage   74.51%   75.62%   +1.11%     
==========================================
  Files          70       71       +1     
  Lines       39654    42007    +2353     
==========================================
+ Hits        29549    31769    +2220     
- Misses      10105    10238     +133     

Opt-in file_clone stores disk-cache entries uncompressed and restores them with
filesystem reflinks (FICLONE on Linux, clonefile on macOS/APFS, ReFS on Windows),
falling back to plain copies on non-CoW filesystems. The default compressed
(zip+zstd) cache is unchanged.

- config: file_clone in [cache.disk] and SCCACHE_FILE_CLONE (default false)
- reflink-on-write via Storage::put_objects (no zip/zstd); fd-level FICLONE fast
  path on Linux, in-place overwrite fallback for read-only dirs on non-Linux
- uncompressed directory entries in LruDiskCache (gated; default path unchanged),
  atomic staging+rename, out-of-band mode manifest, 0600 cache files / 0700 dirs
- new --show-stats counters: objects_reflinked / objects_copied_fallback
- docs/FileClone.md, scripts/bench-file-clone.sh, integration + system + unit tests

Refs mozilla#1053, mozilla#1174; reimplements mozilla#2640 (credit @quake).
@Skory Skory force-pushed the file-clone-reflink-cache branch from 7109e48 to f59d7d8 Compare June 18, 2026 18:55
@sylvestre

Collaborator

In the future, please keep only the most important information in comment #0.

Skory commented Jun 22, 2026

Author

@sylvestre do you know who the best point of contact (PoC) would be to review this?
