concurrency crate rework#1534
Conversation
960e461 to
0ef85f9
Compare
There was a problem hiding this comment.
Pull request overview
Reorganizes the concurrency crate to absorb the former dataplane-quiescent crate and to present a unified, parking_lot-shaped synchronization surface that routes to either real parking_lot/std/crossbeam-utils (default), loom, or shuttle based on build-time concurrency = "..." cfg. Lock APIs (Mutex::lock, RwLock::read/write) now return guards directly, so call sites across NAT, flow tables, flow info, stats, and flow-filter drop their .unwrap()/.expect()/if let Ok(...) boilerplate.
Changes:
- Add
concurrency::sync(parking_lot-shaped facade with poison→panic for loom/shuttle),concurrency::crossbeam(real or fallbackWaitGroup/ShardedLock/Backoff/CachePadded),concurrency::slotandconcurrency::quiescentmodules; introducebuild.rsdrivingconcurrency/dataplane_concurrency_slotcfgs and renamesilence_clippy→_silence_clippy. - Delete the standalone
quiescentcrate and re-route all consumers todataplane_concurrency::quiescent. - Sweep through call sites to drop
LockResulthandling now that the facade panics on poison; add#[cfg(not(miri))]guards around heavyprintln!/traced_testmacros in tests; tweak nextest filter and CI miri/loom matrix to packageconcurrencyinstead ofquiescent.
Reviewed changes
Copilot reviewed 45 out of 47 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| Cargo.toml, Cargo.lock | Drop quiescent member, add crossbeam/crossbeam-utils deps, enable tokio parking_lot, register concurrency package metadata. |
| .github/workflows/dev.yml | Replace quiescent-only miri runs with broader "extra important" packages; loom job builds concurrency. |
| justfile | Adjust nextest loom binary filter to use a regex. |
| concurrency/Cargo.toml, build.rs | New features (arc-swap, parking_lot, _strict_provenance, _silence_clippy); build script emits backend cfgs. |
| concurrency/src/lib.rs, macros.rs | New module layout, _silence_clippy rename, doc-inline re-exports for thread. |
| concurrency/src/sync/{mod,default_backend,test_facade}.rs | Backend-routed sync facade; loom/shuttle backend panics on poison. |
| concurrency/src/crossbeam/{mod,standard,fallback}.rs | Real crossbeam-utils on default; local fallbacks (Backoff, CachePadded, WaitGroup, ShardedLock) for loom/shuttle. |
| concurrency/src/slot/{mod,standard,fallback}.rs | Single-slot publication moved out of quiescent. |
| concurrency/src/quiescent.rs, QUIESCENT.md | QSBR primitives moved in; long doc retained (with stale crate references). |
| concurrency/tests/quiescent_*.rs | Tests renamed/retargeted to dataplane_concurrency::quiescent. |
| quiescent/* | Crate deleted. |
| nat/src/{portfw,stateful,icmp_handler,...}, flow-entry/src/flow_table/*, flow-filter/src/{lib,tests}.rs, net/src/flows/{display,flow_info}.rs, stats/src/vpc_stats.rs | Drop LockResult plumbing now that facade returns guards directly; restructure if let Ok(...) chains; switch try_read/try_write consumers to Option. |
| flow-entry/src/flow_table/table.rs | Replace try_read+Duration sleep with tokio::task::yield_now() loop; reshard/insert/lookup/iter use new return types; bolero test reshaped via cloned(). |
| routing/src/fib/test.rs, flow-entry/src/flow_table/{display,nf_lookup}.rs, nat/src/stateful/test.rs | Add #[cfg(not(miri))] around println!/traced_test; reduce miri packet counts. |
1182578 to
f7558f9
Compare
| let guard = flow.locked.read(); | ||
| let Some(state) = guard | ||
| .port_fw_state | ||
| .as_ref() | ||
| .and_then(|s| s.extract_ref::<PortFwState>()) | ||
| else { | ||
| drop(guard); | ||
| debug!("Packet flow-info does not contain port-forwarding state"); | ||
| return None; | ||
| }; | ||
| let out = state.clone(); | ||
| drop(guard); | ||
| debug!("Packet hit entry with port-forwarding state: {flow}"); | ||
| Some(state.clone()) | ||
| Some(out) |
There was a problem hiding this comment.
adding a comment is a good idea. I don't think holding the lock for the log serves much point
79be559 to
80ca8dd
Compare
Introduces `concurrency::sync` as a backend-routed module that exposes a uniform `Mutex` / `RwLock` API regardless of whether the workspace is compiled against the default backend (`parking_lot` locks + `std::sync` for everything else), `loom` (model checking), or `shuttle` (randomized schedule exploration). The default backend is a pure re-export of `parking_lot`'s lock types, so production builds pay no wrapping cost. The test facade wraps the `std::sync`-shaped primitives that `loom` and `shuttle` expose: it strips the `LockResult` poison wrapper and panics on poison instead. This workspace treats a crashed thread as a crashed process, so silently inheriting state from a poisoned lock is wrong; surfacing it as a panic propagates the failure correctly under both model checkers. Also adds: * `concurrency::sync` `Arc<T>` / `Weak<T>` shims that fill the gap in loom 0.7 (no `Weak<T>`, no `Arc::downgrade`). The default and shuttle backends just re-export `std::sync::Arc` / `Weak`; under loom the shim wraps `loom::sync::Arc` to provide the same surface with one documented quirk: `Weak` holds a strong clone, so `weak.upgrade()` succeeds even after the last user-visible `Arc` drops. The quirk is documented at the module level and pinned by `tests/loom_arc_weak.rs`. * `concurrency::slot`: a cfg-gated single-slot atomic publication helper (`arc_swap::ArcSwap` on the default backend, `Mutex<Arc<T>>` fallback when loom/shuttle/`_strict_provenance` is on). * `build.rs` driving the backend and slot-selector cfgs. * Workspace-level wiring: `crossbeam` / `crossbeam-utils` deps, the `tokio` `parking_lot` feature, and a `[workspace.metadata.package .concurrency]` block marking the crate as miri-eligible. Call sites in `flow-entry`, `flow-filter`, `nat`, `net`, `routing`, `stats`, and `quiescent` are swept to drop the `.unwrap()` / `.expect()` noise that the std-shaped `LockResult` forced -- the new facade returns naked guards everywhere. `tests/loom_arc_weak.rs` provides direct coverage for the Arc/Weak shim: strong-count round trips, `ptr_eq`, `downgrade`/`upgrade`, `new_uninit` + `assume_init`, `into_raw`/`from_raw`, `Display` and `Pointer` formatting, and two multi-thread scenarios. The same source runs under default / loom / shuttle; two loom-only tests pin the documented quirks (`Weak` keeps strong; `weak_count` is `0`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Daniel Noland <daniel@githedgehog.com>
80ca8dd to
f0de528
Compare
| # `arc-swap` stays in the default feature set even though the model-checker | ||
| # (miri / loom / shuttle) builds route through `dataplane_concurrency_slot= | ||
| # "fallback"` and never touch the `arc_swap` code path. Keeping the dep | ||
| # enabled by default means production builds (which DO use it) don't pay a | ||
| # resolution-cache miss for swapping the feature on, and downstream `cargo | ||
| # metadata` consumers see a consistent dependency graph regardless of which | ||
| # slot variant is selected. The unused-under-fallback warning is acceptable. | ||
| default = ["arc-swap", "parking_lot"] | ||
|
|
||
| arc-swap = ["dep:arc-swap"] | ||
| loom = ["dep:loom", "concurrency-macros/loom", "crossbeam-utils/loom"] | ||
| parking_lot = ["dep:parking_lot"] |
There was a problem hiding this comment.
fixed — build.rs now emits a compile_error! naming the missing feature for both broken cases (no model checker + no parking_lot, default slot + no arc-swap). Verified: --no-default-features fails loudly; default / loom / shuttle / shuttle_pct / shuttle_dfs / _strict_provenance configs all build cleanly.
`quiescent` was a single-file QSBR primitive that depended on the
concurrency crate's `sync` module and on `arc-swap` for its publication
slot. Its slot implementation already duplicated what concurrency
now exposes under `concurrency::slot`, and the crate had no callers
outside its own test suite. Merge it in so it lives next to the
primitives it builds on.
Moves:
* `quiescent/src/lib.rs` -> `concurrency/src/quiescent.rs`, imports
rewritten to `crate::{sync,slot}`, duplicate `mod slot` dropped in
favour of the shared `concurrency::slot`.
* `quiescent/README.md` -> `concurrency/QUIESCENT.md`, surfaced as
the module-level rustdoc via `#![doc = include_str!]`.
* `quiescent/tests/{loom,shuttle,protocol,properties}.rs` ->
`concurrency/tests/quiescent_*.rs` (preserves the integration-test
boundary so each model checker stays in its own compilation unit).
* `quiescent/tests/__fuzz__/` is gitignored, so the bolero corpus is
not tracked.
Cargo:
* `static_assertions` becomes a concurrency dep (for the load-bearing
auto-trait assertions on `Publisher` / `Subscriber`).
* `bolero` becomes a concurrency dev-dep (used by the moved property
and shuttle tests).
* Workspace drops the `quiescent` member, the workspace `quiescent`
dependency entry, and the `[workspace.metadata.package.quiescent]`
block.
CI:
* The two `quiescent/permissive` and `quiescent/strict` miri steps are
replaced by a single "extra important" miri step that exercises the
concurrency-sensitive crates (concurrency, flow-entry, flow-filter,
nat, routing) under tighter scheduler parameters.
* Both the loom and shuttle feature matrix entries scope to
`test_package: "concurrency"`. The model-check features flip
`concurrency::sync` workspace-wide via Cargo feature unification, so
non-concurrency crates' regular `#[test]` functions that use the
facade end up calling shuttle/loom primitives outside a
`check_*`/`model` callback and panic. Only the concurrency crate's
own tests (`quiescent_*`, `loom_*`, `scope_*`, `stress_dispatch`)
are written against those backends.
* `justfile` drops the `binary(loom)` / `shuttle` nextest filters
entirely. Backend selection is driven by `#![cfg(...)]` headers in
each test file, which produce empty binaries for the wrong backend
-- nextest runs those with zero discoverable tests, so name-based
filtering on top is redundant and was already silently stale (the
filter pattern was tied to file names like `quiescent_loom.rs` that
this PR then renamed). Leaving the filter empty lets the source-level
cfg gates be the single source of truth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Adds `concurrency::crossbeam`, mirroring the parking_lot-style facade pattern of `concurrency::sync`: a backend-routed module that exposes a uniform surface for `Backoff`, `CachePadded`, `sync::ShardedLock`, and `sync::WaitGroup` regardless of which model checker (if any) is active. * Default backend re-exports `crossbeam_utils` directly -- zero wrapping cost. * Fallback (loom / shuttle* / miri) ships a hand-rolled implementation routed through `crate::sync` so the model checker observes the synchronization. `ShardedLock` collapses to a single `RwLock`; the sharding doesn't matter for model checking. `sync::mod.rs` routes its `ShardedLock` / `WaitGroup` re-exports through the new local `crate::crossbeam` module so callers continue to import them from `concurrency::sync`. `tests/crossbeam_facade.rs` exercises the surface under whichever backend is active (`Backoff` progression, `CachePadded` shape, `ShardedLock` reader/writer interleaving, `WaitGroup` rendezvous). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Replaces the earlier feature-flag soup (`feature = "loom"` / `feature = "shuttle"` / `_silence_clippy` combos) with a single `cfg(concurrency = ...)` key set by `build.rs` based on which feature is selected. Downstream code branches on one key instead of juggling multi-feature `#[cfg(...)]` expressions. `build.rs` defines five backend values: * `concurrency = "default"` -- production (parking_lot + std::sync) * `concurrency = "loom"` -- loom model checker * `concurrency = "shuttle"` -- shuttle randomized scheduler * `concurrency = "shuttle_pct"` -- shuttle PCT scheduler (NEW) * `concurrency = "shuttle_dfs"` -- shuttle DFS scheduler (NEW) When multiple backend features are selected at once, `loom > shuttle_dfs > shuttle_pct > shuttle` precedence applies. `--all-features` exercises this; downstream crates that conditionally pull us in shouldn't have to deduplicate. `src/lib.rs` and `src/macros.rs` collapse the old multi-cfg arms into single-key `cfg(concurrency = "...")` branches. `with_loom!`, `with_shuttle!`, and `with_std!` macros now gate on the cfg key rather than the feature names. `sync/mod.rs` and `sync/test_facade.rs` extend their loom/shuttle gating to cover `shuttle_pct` and `shuttle_dfs` too -- without this, those two backends fail to compile against the test facade. The `_strict_provenance` feature is retained from the pre-refactor state. It forces the `Mutex<Arc<T>>` slot fallback even when no model-checker feature is set, so the fallback slot can be exercised under miri (which by default now runs against `ArcSwap` -- the production path -- under permissive provenance). Without this opt-in, removing `miri` from the build.rs force-fallback list would drop fallback-slot coverage under miri entirely; the loom/shuttle suites exercise the fallback in other ways, but a separate miri-with-fallback run guards against strict-provenance regressions in the fallback path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Adds `concurrency::slot::SlotOption<T>`, the `Option<Arc<T>>` analogue
of `Slot<T>`. Same backend split as `Slot`:
* Default backend wraps `arc_swap::ArcSwapOption<T>` for the lock-free
read fast path.
* Fallback (loom / shuttle / miri) wraps `Mutex<Option<Arc<T>>>` so
the model checker can model the publication and miri can audit
provenance.
API: `empty()`, `from_pointee(Into<Option<T>>)`, `new(Option<Arc<T>>)`,
`load_full() -> Option<Arc<T>>`, `swap`, `store`. `Debug` and `Default`
impls on both shapes.
Unblocks migrating `ArcSwapOption` call sites
(`pipeline::sample_nfs::PacketDumper::filter`,
`nat::stateful::allocator_writer::NatAllocator{Writer,Reader}`) to the
facade so they can be model-checked or run under miri.
`common::cliprovider` gains `impl CliDataProvider for Slot<T>` and
`impl CliDataProvider for SlotOption<T>` so CLI-source registration
keeps working after those call-site migrations land. The existing
`ArcSwap`/`ArcSwapOption` impls stay (still used by `config::gwconfig`).
`common` picks up `concurrency` as a workspace dep for these impls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Adds two related pieces of test-time infrastructure so model-check
tests can be written once and run under any backend.
`concurrency::stress(body)` -- backend-routed runner that dispatches
to `loom::model`, `shuttle::check_random`/`_pct`/`_dfs`, or just calls
`body()` directly on the default backend.
`#[concurrency::test]` -- a `concurrency_macros` proc macro that
expands to `#[test]` plus `concurrency::stress(|| { body })`. One
source file, five execution strategies, no `loom::model(|| { ... })`
boilerplate at call sites. Rejects `async fn` and `fn(args)` at parse
time with a clear error. The macro path is resolved at expansion time
via `proc-macro-crate`, so consumers depending on `dataplane-concurrency`
under any alias work without extra setup; integration tests inside
this crate use `extern crate dataplane_concurrency as concurrency;`
since cargo rejects self-deps.
`concurrency::thread` becomes a module (was a `pub use` alias)
exposing the active backend's `thread::*` items. Under loom it also
includes a local `thread::scope` shim built on `loom::spawn` +
`park` + `unpark` and an `Arc<ScopeData>` counter -- the same shape
`std::thread::scope` uses internally. A narrow `mem::transmute` lifts
the spawned closure's `'scope` to `'static` (loom's `spawn` requires
`'static`); the join-before-return contract makes that sound. Loom's
`spawn` also requires `T: 'static`, so the spawned closure is wrapped
to write its result into an `Arc<Mutex<Option<T>>>` that
`ScopedJoinHandle::join` reads back.
`tests/quiescent_model.rs` replaces `tests/quiescent_loom.rs`. Each
test is now a plain function annotated with `#[concurrency::test]`;
the body uses `thread::scope` instead of the `Box::into_raw` +
`&'static` lifetime workaround the old loom-only file needed. Single
source runs under any backend, including the four-way schedule
exploration shuttle offers.
One test (`snapshot_after_publish_observes_published`) is single-
threaded by design; PCT panics on closures that don't exercise
concurrency, so it's `#[cfg(not(concurrency = "shuttle_pct"))]`-gated.
Direct coverage for the new infrastructure:
* `tests/loom_scope.rs` -- contract tests for `thread::scope`
(single/multi-spawn join, lifetime borrows, drop affinity,
nested-scope re-entry). Pins shim regressions at the right layer
rather than as misbehaving quiescent tests.
* `tests/scope_property.rs` -- bolero x shuttle property test
generating random spawn-count / per-spawn-ops plans; pins the
"scope conservation" invariant (counter at scope return equals
the sum of generated increments) across many shapes and
interleavings.
* `tests/stress_dispatch.rs` -- coarse dispatch-routing check:
the default backend invokes the body exactly once; the
model-check backends invoke it more than once.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Sweeps `std::sync` / `std::thread` / `arc_swap::ArcSwapOption` usage
in the hot data-path crates to `concurrency::*`. The migrated code is
now model-checkable: switch `dataplane-concurrency` to `--features
loom` or `--features shuttle*` and the same source paths exercise
under the chosen scheduler.
`net`:
* `flows::flow_info` -- atomics (`AtomicI64`/`AtomicU8`) onto
`concurrency::sync::atomic`.
* `flows::display`, `packet::stats`, `flows::atomic_instant` -- same.
`nat`:
* `common`, `portfw::nf`, `portfw::portfwtable::{objects,
portforwarder}`, `icmp_handler::nf` -- `Arc`/`Weak`/atomics onto
the facade.
* `stateful::apalloc::test_alloc` -- the `#[concurrency_mode(shuttle)]`
smoke test was importing `shuttle::sync::{Arc, Mutex}` directly;
now uses `concurrency::sync::{Arc, Mutex}` and `concurrency::thread`
via the facade. The `parking_lot`-shaped lock returns naked guards,
so `.lock().unwrap()` collapses to `.lock()`.
* `stateful::allocator_writer` -- `Arc<ArcSwapOption<NatAllocator>>`
becomes `Arc<SlotOption<NatAllocator>>`. `NatAllocatorReader::inner`
returns the `SlotOption` for `CliDataProvider` consumption.
`pipeline`:
* `pipeline`, `static_nf`, `sample_nfs` -- atomics + Arc routed
through the facade.
* `sample_nfs::PacketDumper` -- `filter: ArcSwapOption<...>` becomes
`filter: SlotOption<...>`.
* `Cargo.toml` -- swaps `arc-swap` for `concurrency`; the migration
fully covers the prior `arc_swap` call sites, so the direct dep
is no longer needed.
`routing`:
* `atable::resolver` -- atomics + `std::thread::spawn`/`JoinHandle`
via `concurrency::thread`.
* `fib::fibtable` -- `Arc` via the facade.
`concurrency` (own tests):
* `tests/quiescent_protocol.rs` and `tests/quiescent_properties.rs`
now use `dataplane_concurrency::sync` / `::thread` for consistency
with the rest of the crate.
The drop-recording slot in the quiescent loom test stays on
`std::sync::Mutex` deliberately (its existing comment explains the
model checker doesn't need to model contention on it).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
f0de528 to
32f90ca
Compare
Prep for DPDK thread rework.
I'm trying to organize the concurrency crate. I should have absorbed quiescent into it in the first place.