
Add --external-only flag to cargo chef prepare to split third-party and workspace-internal cache layers #359

Description

@josecelano

Problem

In large Cargo workspaces the standard cargo-chef pattern produces a single cook layer that is invalidated by any structural change to the workspace manifests — not just changes to external dependencies.

RUN cargo chef prepare --recipe-path recipe.json   # captures all manifests
RUN cargo chef cook   --recipe-path recipe.json    # invalidated too often

recipe.json includes every [dependencies.my-crate] path = "../my-crate" entry from every workspace manifest. Any of the following changes forces a full recompile of all third-party crates:

| Change | Invalidates cook layer? |
|---|---|
| Adding a new workspace-internal crate | Yes (unexpected) |
| Splitting one internal crate into two | Yes (unexpected) |
| Adding a [[bin]] target to a workspace crate | Yes (unexpected) |
| Renaming a workspace crate | Yes (unexpected) |
| Adding a new external dep | Yes (expected) |

In the real-world project that motivated this issue (torrust-tracker, 26 workspace crates) almost every feature branch touches at least one manifest. The cook cache is effectively always cold, and all third-party crates are recompiled from source on every CI run.

This has been reported before: see #314 and #75.

Proposed solution

Add an --external-only flag to cargo chef prepare that strips all intra-workspace path = "..." dependencies before serialising the recipe, producing a recipe that is only invalidated when an external dependency changes.

# Stable layer — only invalidated when external deps change
RUN cargo chef prepare --external-only --recipe-path recipe-thirdparty.json
RUN cargo chef cook   --release       --recipe-path recipe-thirdparty.json

# Fast layer — invalidated on any manifest change, but third-party is already warm
RUN cargo chef prepare --recipe-path recipe.json
RUN cargo chef cook   --release      --recipe-path recipe.json

In a 3-crate example workspace the speedup is roughly 83×: a cook that takes 5.8 s cold takes 0.07 s once the third-party layer is warm.
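To make the stripping step concrete, here is a minimal Python sketch of the idea, not the Rust code from the planned PR. It assumes the manifest has already been parsed into a plain dict (for example with Python's tomllib) and treats any dependency table containing a `path` key as workspace-internal. The helper names `is_path_dep`, `strip_sections` and `strip_path_deps` are hypothetical.

```python
# Dependency sections that can hold path dependencies in a Cargo manifest.
DEP_SECTIONS = ("dependencies", "dev-dependencies", "build-dependencies")


def is_path_dep(spec):
    """Assumption: a dependency is local when its spec is a table with a `path` key."""
    return isinstance(spec, dict) and "path" in spec


def strip_sections(table):
    """Return a shallow copy of `table` with path deps removed from each dep section."""
    out = dict(table)
    for section in DEP_SECTIONS:
        if section in out:
            out[section] = {
                name: spec
                for name, spec in out[section].items()
                if not is_path_dep(spec)
            }
    return out


def strip_path_deps(manifest):
    """Strip path deps from top-level and [target.'cfg(...)'] dependency sections."""
    out = strip_sections(manifest)
    if "target" in out:
        out["target"] = {cfg: strip_sections(t) for cfg, t in out["target"].items()}
    return out
```

With a manifest whose `dependencies` table holds `serde = "1"` and `my-crate = { path = "../my-crate" }`, only the `serde` entry would survive, so adding or renaming an internal crate would leave the stripped manifest unchanged.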

Implementation

I have a working implementation ready:

  • New src/skeleton/external_only.rs module using toml::Value (already a dependency) for structurally-correct stripping — not regex
  • Handles: top-level dep sections, [target.'cfg(...)'.dependencies], [workspace.dependencies], and { workspace = true } references to path deps (two-pass approach)
  • Removes local (no-source) packages from the lock file so workspace-membership changes don't bust the recipe
  • 6 unit tests + 3 integration tests
  • An examples/workspace-split-cache/ directory that demonstrates the problem and both solutions (Python workaround + native flag), with a step-by-step README

I will open a PR shortly.
