feature: improve provenance and make q2-preview editable #231
Draft
gordonwoodhull wants to merge 20 commits into
Audit and revise Plans 3-8 of the q2-preview series (now framed
internally as the provenance epic) after a design discussion that
followed the q2-preview pipeline and attribution work landing on main.
Major design changes folded into the plans:
- **Plan 4 unified Generated variant.** Collapse the earlier
`Synthetic` + `Derived` split into one `Generated { by, anchors: Vec<Anchor> }`
shape. Atomicity is per-`by.kind` (orthogonal to anchors); the
invocation source byte range is the first anchor with role
`AnchorRole::Invocation`. One wire-format code (4) instead of two.
- **Plan 4/5/6 typed anchors (Path C).** Instead of stuffing
source-info chain metadata into `by.data` (dynamic JSON), the chain
is a typed `Vec<Anchor>` where each `Anchor` carries an `Arc<SourceInfo>`
and a role-labeled `AnchorRole` (`Invocation`, `ValueSource`,
`Other(String)`). `by.data` shrinks to per-kind non-source-info
configuration. Two future-anchor roles flagged as follow-ups
contingent on metadata-loader and Lua-file-registration work.
- **Plan 6 uniform shortcode anchor stamping.** Single funnel covers
Rust built-ins, Lua-loaded extension handlers, and user-extension
shortcodes uniformly via a post-walk `stamp_shortcode_anchors` helper.
Enrichment-via-post-walk preserves Lua-attached `by.data` fields
(lua_path, lua_line) while promoting `by.kind` to `shortcode`.
Attribution interaction documented: multi-author shortcodes get
latest-wins via the existing `query_byte_range` max-time logic
composed with chain-walking through the `Invocation` anchor.
- **Plan 5 latent code-3 bug now reachable.** Plans 1-2 shipped the
q2-preview pipeline that runs filters whose output crosses the JSON
boundary; the FilterProvenance code-3 round-trip bug is no longer
latent in production. Added end-to-end production-reachability
regression test using the `{{< kbd Ctrl+C >}}` fixture (kbd.lua
constructs a Span that gets FilterProvenance-tagged and then
shortcode-stamped). Drops code 5 from the design.
- **Plan 7 SPA edit-back in scope.** The new q2 preview CLI command
serves a separate SPA from ts-packages/preview-renderer; both
hub-client and the SPA share the writer machinery via @quarto/preview-runtime.
Plan 7 now covers replacing `noopSetAst` in the SPA with a real
handler that routes through `incrementalWriteQmd` to
`syncClient.updateFileContent` and the ephemeral hub's automerge↔disk
bridge. Adds a small SPA-local `DiagnosticStrip` for Q-3-42/Q-3-43;
hub-client's existing diagnostics-banner handles the same warnings
there. Single-file mode (bd-tnm3k) works through the same automerge
stack — no special case.
- **Plan 8 wrapper stays Original.** Explicit reasoning added for
why `CustomNode("IncludeExpansion")` uses Original source_info
(CustomNode.type_name carries generator identity; the wrapper
substitutes 1:1 for the source-mapped Paragraph). HTML pipeline
resolve transform in the Normalization Phase (symmetric with
CalloutResolveTransform); HTML doesn't attribute the include line
because there's no DOM anchor for it — accepted v1 behavior.
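The unified `Generated` shape and typed anchors described in the Plan 4/5/6 bullets above can be sketched roughly as follows. This is a minimal illustration, not the plan's actual definitions: `SourceInfo` is reduced to a hypothetical byte-range struct, `By.data` is a plain optional string instead of dynamic JSON, the field is called `anchors` as in this commit, and `invocation_range` is an assumed helper name.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real `SourceInfo`; only a byte range here.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceInfo {
    pub start: usize,
    pub end: usize,
}

/// Role labels named in the plan text.
#[derive(Debug, Clone, PartialEq)]
pub enum AnchorRole {
    Invocation,
    ValueSource,
    Other(String),
}

/// One typed link back to source: a role plus shared source info.
#[derive(Debug, Clone)]
pub struct Anchor {
    pub role: AnchorRole,
    pub source: Arc<SourceInfo>,
}

/// Generator identity. `data` is simplified to a string in this sketch.
#[derive(Debug, Clone)]
pub struct By {
    pub kind: String,
    pub data: Option<String>,
}

/// Single variant replacing the earlier `Synthetic` + `Derived` split.
#[derive(Debug, Clone)]
pub struct Generated {
    pub by: By,
    pub anchors: Vec<Anchor>,
}

impl Generated {
    /// The invocation byte range is the first anchor with role `Invocation`.
    /// (Assumed helper name, for illustration only.)
    pub fn invocation_range(&self) -> Option<(usize, usize)> {
        self.anchors
            .iter()
            .find(|a| a.role == AnchorRole::Invocation)
            .map(|a| (a.source.start, a.source.end))
    }
}
```

With this shape, a node with no `Invocation` anchor simply yields `None`, which is why atomicity can stay a per-`by.kind` property independent of the anchor list.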
Mechanical changes also folded in:
- Rename `Synthetic` → `Generated` throughout the type vocabulary in
all plans.
- Update JS-side hand-mirror file paths (`hub-client/src/utils/...`
→ `ts-packages/preview-renderer/src/utils/...`) to reflect the
Phase-D package split.
- Each plan's intro reframed as part of the provenance epic; file
names keep the q2-preview-plan-N form for continuity.
File renames for clarity about which filters each plan covers:
- `…plan-3-filter-idempotence.md` → `…plan-3-builtin-filter-idempotence.md`
- `…plan-7a-filter-idempotence.md` → `…plan-7a-user-filter-idempotence.md`
Plans 3-8 remain in design state on this branch; no code changes yet.
Audit pass over the provenance epic's idempotence story, scoping Plan 3 to pipeline non-determinism only and propagating the consequences to the neighbouring plans.
Plan 3 (builtin transform and filter idempotence):
- Retitle to "Built-in transform and filter idempotence verification", symmetric across Rust transforms and Lua filters (prior framing was too narrow).
- Enumerate the actual universe under test: 36 Rust transforms in build_q2_preview_transform_pipeline (4 excluded, named with reasons), ~20 stage-level items in build_q2_preview_pipeline_stages, and the one Lua filter under resources/extensions/ (video-filter.lua). The prior "~10-20 filters" estimate misread shortcodes as filters.
- Drop the "Plan 3 strengthening" round-trip amendment that was added alongside Plan 7a in commit 2129d35. Round-trip non-idempotence is not exercised by today's pipeline; CI-time round-trip testing conflates writer-lossiness with filter-non-idempotence; 7a's runtime check is the better home for the property when Plan 7's writer ships. Trim "Two flavors" section to a pointer at 7a.
- Add compute_meta_hash_fresh / compute_meta_hash_fresh_excluding_rendered as a new helper in quarto-ast-reconcile, parallel to the existing block hasher. Hash covers blocks + meta (excluding rendered.*).
- Rewrite test pseudocode against the real run_pipeline API at pipeline.rs:626.
- Add fixture-format constraint: no executable engine cells (CI has no kernels).
- Coverage gap audit: ~25 fixtures across the document-level, Lua shortcode, website-project, attribution, and resource categories. Includes lua-shortcode-version, lua-shortcode-lipsum-fixed (non-random path), and video-filter-header for the one built-in Lua filter.
- Convert to a development-plan format with a seven-phase work-items checklist.
- Close the engine-staleness open question via filter.rs:158 (fresh Lua::new() per invocation).
- Clarify the lua-filter-pipeline reference as TypeScript Quarto porting material, not the Rust inventory.
Plan 6 (provenance audit):
- Add a §Test plan bullet for source_info determinism: Plan 3's hashes exclude source_info by design, so a per-fixture source_info-equality check is Plan 6's own responsibility.
Plan 7 (incremental writer):
- Add a writer-lossless baseline test as the first §Test plan bullet, prerequisite for the reconciler tests. Reuses Plan 3's fixture set.
- Add Plan 3 to §References and §Dependencies (soft-depends-on via compute_meta_hash_fresh).
Plan 7a (runtime user-filter idempotence):
- Remove all references to the now-deleted "Plan 3 strengthening" section (five locations including a full subsection).
- Reframe the out-of-scope bullet from "Strengthening Plan 3" to "Extending the runtime round-trip check to built-in filters," with three-point v1-acceptance reasoning in §Notes.
- Update §Design decisions, §Dependencies, and §References to reflect the new shape and the shared compute_meta_hash_fresh helper.
- Add the meta-hash comparison to step 4 of the round-trip check.
No code changes; design state only.
…ailure policy
Hash helper: `merge_op` participates (verified `MergeOp::default() =
Concat` is a stable compile-time constant); `Map` entries hashed in
insertion order, no sort (an idempotence test should *catch* the kind
of HashMap-iteration-order non-determinism a sort would mask). Adds
regression-guard unit tests for both choices.
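The no-sort choice can be illustrated with a small sketch. It assumes a reduced, hypothetical `Value` type standing in for the real metadata value (no `merge_op`, no Pandoc inlines or blocks) and uses the standard library's `DefaultHasher`; the function names are illustrative, not the helper's real API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, reduced stand-in for a metadata value.
pub enum Value {
    Str(String),
    /// Entries are kept in insertion order.
    Map(Vec<(String, Value)>),
}

fn hash_value(v: &Value, h: &mut DefaultHasher) {
    match v {
        Value::Str(s) => {
            0u8.hash(h); // variant tag
            s.hash(h);
        }
        Value::Map(entries) => {
            1u8.hash(h); // variant tag
            entries.len().hash(h);
            // Deliberately no sort: if a transform emits keys in an
            // unstable order, the hash must change so the test catches it.
            for (key, child) in entries {
                key.hash(h);
                hash_value(child, h);
            }
        }
    }
}

/// Structural hash that is sensitive to map insertion order.
pub fn meta_hash(v: &Value) -> u64 {
    let mut h = DefaultHasher::new();
    hash_value(v, &mut h);
    h.finish()
}
```

Two maps with the same entries in a different order hash differently under this scheme, which is the property the regression-guard tests pin down.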
Test runner: drives every fixture through both `DriveMode::SingleFile`
(direct `run_pipeline`) and `DriveMode::ProjectOrchestrator`
(`ProjectPipeline<RenderToPreviewAstRenderer>`) so orchestrator-only
non-determinism (project discovery, ProjectIndex assembly, file-iteration
order) is also under test. Website/chrome fixtures are
orchestrator-only by design.
Failure policy: failing fixtures stay **failing** — no auto-`#[ignore]`.
Each failure files a beads issue whose description doubles as a
sub-agent investigation prompt. The integration branch holds the
queue; merge to main waits until drained or the user explicitly opts
to ignore.
New helper `find_first_divergence` (alongside the hashers) returns
`DivergencePoint::{Block { index }, MetaKey { path }, None}` so the
test driver's panic message — and therefore the sub-agent prompt —
arrives with a concrete starting point instead of just "hash diverged."
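A rough sketch of what such a localization helper might look like. To keep it self-contained it assumes the caller has already computed per-block hashes and per-key `(path, hash)` pairs in insertion order; the real helper works on the AST and metadata types directly, so the signature below is hypothetical.

```rust
/// Where two runs first disagree.
#[derive(Debug, PartialEq)]
pub enum DivergencePoint {
    Block { index: usize },
    MetaKey { path: String },
    None,
}

/// Hypothetical simplification: inputs are precomputed hashes, not ASTs.
pub fn find_first_divergence(
    blocks_a: &[u64],
    blocks_b: &[u64],
    meta_a: &[(String, u64)],
    meta_b: &[(String, u64)],
) -> DivergencePoint {
    // Blocks first: a missing block on either side also counts as divergence.
    let max_blocks = blocks_a.len().max(blocks_b.len());
    for i in 0..max_blocks {
        if blocks_a.get(i) != blocks_b.get(i) {
            return DivergencePoint::Block { index: i };
        }
    }
    // Then meta keys, compared position by position in insertion order.
    let max_keys = meta_a.len().max(meta_b.len());
    for i in 0..max_keys {
        match (meta_a.get(i), meta_b.get(i)) {
            (Some(a), Some(b)) if a == b => continue,
            (Some((path, _)), _) | (None, Some((path, _))) => {
                return DivergencePoint::MetaKey { path: path.clone() };
            }
            (None, None) => unreachable!("i is below the longer length"),
        }
    }
    DivergencePoint::None
}
```

A test driver can format the returned value straight into its panic message, so the failure report names a block index or meta key path.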
Orchestrator-mode `DocumentAst` extraction: researched the data flow;
the typed AST is materialized inside `render_qmd_to_preview_ast` but
discarded after JSON serialization. Plan recommends adding `pub ast:
DocumentAst` to `PreviewAstOutput` and forwarding through
`WasmPassTwoOutput`; alternatives (JSON re-parse, test-only hook)
documented with their costs.
Fixture rules: no absolute process paths in fixture content (built-in
extensions extract to a `temp_dir` whose path differs across CI runs;
stable within a single process — fine for two-runs-compare, but a
latent issue for future stored-snapshot variants).
Smaller corrections: `Format::from_format_string("q2-preview")` (no
`Format::q2_preview()` constructor exists); `apply_lua_filter`
(singular) is the per-filter Lua-state-creation site, with the plural
loop calling it once per filter; `LuaShortcodeEngine::new` is the
shortcode-side analogue; `quarto/video` filter extension is built-in
via `include_dir!(resources/extensions)` and auto-discovered by
`StageContext::new`, so fixtures need no scaffolding beyond `filters:
[video]` in YAML; `meta.rendered.includes.*` is the actual path
(not `meta.includes.*`) and includes contributions from
`IncludeResolveStage`, chrome render transforms, `attribution_viewer`,
and Bootstrap/clipboard injection — all skipped by
`compute_meta_hash_fresh_excluding_rendered`.
Stage-inventory clarifications: `MathJsStage` is excluded from
q2-preview; `BootstrapJsStage` and `ClipboardJsStage` write only to
`ctx.artifacts` (not to `meta` or `blocks`), so they don't affect the
hash — but their q2-preview inclusion is questionable and is filed
separately as bd-2ag1c.
Notes for the next traversal: `CodeHighlightStage`'s native disk scan
for user grammars is OS-order-dependent (not exercised today;
fixtures don't supply user grammars); lipsum's module-load
`math.randomseed(os.time())` is harmless on the non-random code path
the fixture exercises but should be reverified if a future variant
routes through `math.random`.
Estimated scope: ~760 → ~980 lines.
…branch policy
Audit pass against current source. Settles every open question that
remained in the prior revision and corrects factual drift.
Reuse over rebuild
- `DriveMode::ProjectOrchestrator` now delegates to the existing
`render_active_page_preview` helper at
`crates/quarto-core/tests/render_page_in_project.rs:660`. No fresh
orchestrator wiring; no `make_website_project_ctx(...)` builder.
- `DocumentAst` extraction settled on option (a): re-parse the JSON
via `pampa::readers::json::read`. source_info round-trips but the
hash excludes it, so no stripping pass and no production plumbing
change is required. Earlier option (b) (typed-AST plumbing through
`PreviewAstOutput` / `WasmPassTwoOutput`) abandoned.
- `run_orchestrator` code sample updated: real body in place of the
prior `unimplemented!("see Open questions")` stub.
Test crate location pinned
- File: `crates/quarto-core/tests/idempotence.rs`.
- Fixtures: `crates/quarto-core/tests/fixtures/idempotence/`.
- Cargo invocation in the sub-agent prompt template updated to
`--test idempotence`.
Long-lived branch policy made explicit
- New `## Long-lived branch policy` section at the top.
- `## Goal` clarifies that "CI-enforced" applies when the plan lands
on `main`; until then `feature/provenance` is allowed to be red
while the failure queue drains.
- `### Phase 5 — Failure triage` opens with the same constraint.
Factual fixes against current source
- Transform count corrected from 36 to 37; missing
`table-bootstrap-class` added to Finalization, with a fixture
entry in the gap audit and Phase 4 checklist.
- `Q2_PREVIEW_STAGE_EXCLUDED` corrected to list all three exclusions
(`math-js`, `render-html-body`, `apply-template`).
- `CodeHighlightStage` user-grammar scan citation moved from
`pipeline.rs:644-650` to
`crates/quarto-core/src/transforms/code_highlight.rs:126-129`.
- Stale line numbers refreshed throughout (pipeline.rs 1181→1198,
1220→1237, 379→380, 355→356, 626→627, 855→859, 663→664;
render_page_in_project.rs 653→660; Pass2Payload::AstJson 256→254;
stage/context.rs 220→221; ShortcodeResolveTransform::transform
257→513 with the correct file path).
- bd-2ag1c ordering pinned: Plan 3 lands first; bd-2ag1c follows
with Plan 3's measurements in hand.
Section rename: "Open questions for implementation" →
"Decisions (was: open questions)" + a `### CI failure policy &
sub-agent prompt template` subsection. All internal cross-refs
updated.
Estimate revised
- Scaffolding line item: ~260 → ~100 lines (reuse, not rebuild).
- `PreviewAstOutput::ast` plumbing (~20 lines) removed entirely.
- Total: ~980 → ~800 lines.
- Session count revised 2 → 2-3 with the third explicitly allocated
to Phase 5 triage.
Adds the structural-hash infrastructure that Plan 3's q2-preview idempotence gate (and Plan 7a's runtime user-filter check) will sit on:
- compute_meta_hash_fresh: source-info-agnostic ConfigValue hasher. Insertion-order Map keys (no sort, so HashMap-iteration-order bugs in transforms remain detectable). MergeOp participates via its enum discriminant. Recurses into PandocInlines/PandocBlocks via the existing inline/block hashers (which already exclude source_info).
- compute_meta_hash_fresh_excluding_rendered: same, but skips the top-level `rendered` map entry. The exclusion is intentionally not propagated into recursion: a nested `rendered` key is content.
- find_first_divergence + DivergencePoint: returns the first block index whose per-block fresh hash differs, or the first insertion-order meta key path whose subtree hash differs (with the same rendered.* exclusion). The plan-sketch signature took &DocumentAst, but quarto-ast-reconcile cannot depend on quarto-core; the helper takes &[Block] + &ConfigValue and the test driver projects from DocumentAst.
- 11 new unit tests cover: same/different content, source_info/key_source agnosticism, top-level rendered exclusion, nested rendered participation, Map insertion-order sensitivity (no-sort regression guard), MergeOp sensitivity; identical/Block-mismatch/MetaKey-path/rendered-skip divergence localization.
Verification: `cargo nextest run --workspace`: 9321 passed, 196 skipped. `cargo xtask verify --skip-hub-build` steps 1–5 green (lint, fmt, Rust build with -D warnings, tree-sitter, Rust tests with -D warnings). Steps 7/10 fail with the known --skip-hub-build artifact (`wasm-quarto-hub-client` unbuilt), unrelated to these additive Rust changes.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
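The "top-level only" exclusion rule can be sketched as below. This assumes a hypothetical reduced `Meta` type in place of `ConfigValue` and an illustrative function name; it shows only how the skip flag is dropped once recursion starts.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, reduced stand-in for the metadata value type.
pub enum Meta {
    Str(String),
    Map(Vec<(String, Meta)>),
}

fn hash_meta(v: &Meta, h: &mut DefaultHasher, skip_rendered: bool) {
    match v {
        Meta::Str(s) => {
            0u8.hash(h);
            s.hash(h);
        }
        Meta::Map(entries) => {
            1u8.hash(h);
            for (key, child) in entries {
                if skip_rendered && key.as_str() == "rendered" {
                    continue; // top-level `rendered` is pipeline output, not content
                }
                key.hash(h);
                // The exclusion is not propagated: a nested `rendered` key is content.
                hash_meta(child, h, false);
            }
        }
    }
}

/// Hash the metadata, ignoring only the top-level `rendered` entry.
pub fn hash_excluding_rendered(meta: &Meta) -> u64 {
    let mut h = DefaultHasher::new();
    hash_meta(meta, &mut h, true);
    h.finish()
}
```

Changing anything under a top-level `rendered` entry leaves the hash unchanged, while changing a `rendered` key nested inside another map does alter it.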
Adds the test driver that Phases 3-4 will hang ~25 fixtures off.
Self-contained at `crates/quarto-core/tests/idempotence.rs`.
- `DriveMode { SingleFile, ProjectOrchestrator }`. Single-file calls
`run_pipeline` with `build_q2_preview_pipeline_stages`. Orchestrator
drives `ProjectPipeline<RenderToPreviewAstRenderer>` via the existing
`render_active_page_preview` body (copied inline because each
`tests/*.rs` is its own binary).
- `Fixture { name, setup, active, modes }` + `run_fixture` runs the
pipeline twice per (fixture, mode), hashes blocks via
`compute_blocks_hash_fresh` and meta via
`compute_meta_hash_fresh_excluding_rendered`, and on divergence
panics with `find_first_divergence`'s `DivergencePoint` embedded so
the panic message itself fills the plan's sub-agent investigation
prompt template.
- `pandoc_to_document_ast` is the small field-shuffle that the plan
identifies: orchestrator mode emits `Pass2Payload::AstJson`, which
`pampa::readers::json::read` re-parses into `(Pandoc, ASTContext)`;
the hasher only reads `ast.blocks` + `ast.meta` so the other
`DocumentAst` fields get defaults.
- `tests/fixtures/idempotence/README.md` documents the fixture-format
rules (no engine cells, no absolute paths, per-fixture mode mapping).
- `smoke_plain_paragraph` smoke fixture drives a single-paragraph
document through both modes. Passing this proves the harness works
end-to-end before Phases 3-4 land the real fixtures.
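The run-twice-and-compare loop described above can be sketched as follows. This is an illustrative reduction: the `render` closure is a hypothetical stand-in for driving the pipeline and returning `(blocks_hash, meta_hash)`, the `Fixture` struct omits `setup` and `active`, and the function returns a `Result` where the real driver panics.

```rust
/// The two ways a fixture is driven through the pipeline.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DriveMode {
    SingleFile,
    ProjectOrchestrator,
}

/// Reduced fixture description (the real one also has `setup` and `active`).
pub struct Fixture {
    pub name: &'static str,
    pub modes: &'static [DriveMode],
}

/// Run the pipeline twice per (fixture, mode) and compare the hash pairs.
/// `render` is a hypothetical stand-in returning (blocks_hash, meta_hash).
pub fn run_fixture<F>(fixture: &Fixture, mut render: F) -> Result<(), String>
where
    F: FnMut(DriveMode) -> (u64, u64),
{
    for &mode in fixture.modes {
        let first = render(mode);
        let second = render(mode);
        if first != second {
            // The message names the fixture and mode so a failure report
            // can be pasted directly into an investigation prompt.
            return Err(format!(
                "fixture `{}` diverged in {:?}: {:?} vs {:?}",
                fixture.name, mode, first, second
            ));
        }
    }
    Ok(())
}
```

A test would typically call `run_fixture(...).unwrap()` so that a divergence fails loudly with the formatted message.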
Verification: `cargo nextest run -p quarto-core --test idempotence`
runs the new smoke test (PASS). `cargo xtask verify
--skip-hub-build --skip-hub-tests` steps 1-9 green; the Phase-1
idempotence tests and this Phase-2 smoke test ran inside Step 5.
Step 10 (preview-renderer integration tests in
`ts-packages/preview-renderer/`) fails with the same WASM-import
artifact as Step 7 — both depend on `wasm-quarto-hub-client` which
`--skip-hub-build` skips. Unrelated to these Rust-only additions.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
Adds the existing-fixture batch the plan calls "carry-forward from prior plan draft": one fixture per Rust transform / feature that was already exercised in earlier idempotence drafts, scoped to single-file document fixtures that run in both DriveMode variants.
Coverage:
- meta-single, meta-markdown: shortcode-resolve + metadata-normalize (string and PandocInlines branches).
- include-trivial: include-expansion stage + shortcode-resolve.
- callout-warning: CalloutTransform (callout-resolve is excluded from q2-preview, so the CustomNode survives).
- theorem: TheoremSugarTransform.
- figure-ref-target: FloatRefTargetSugarTransform.
- crossref-to-theorem: crossref-index + crossref-resolve.
- sectionize-multi: SectionizeTransform across nested headers.
- footnotes-mixed: FootnotesTransform on inline + reference forms.
- appendix-license: AppendixStructureTransform with license/copyright meta and a footnote interaction.
- combined-stress: sectionize + callouts + shortcodes interacting.
A `doc_fixture(name, content)` helper collapses each single-file fixture to a one-liner; `include-trivial` keeps an inline closure because it writes two files.
All 12 idempotence tests (smoke + 11 new) pass: `cargo nextest run -p quarto-core --test idempotence` → 12 passed. No queue entries for Phase 5 from this batch; the carry-forward fixtures are all clean on first run.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
npm install (from repo root) and npm run build:wasm (from hub-client) updated package-lock.json and crates/wasm-quarto-hub-client/Cargo.lock on this branch. Committed so subsequent fresh checkouts of feature/provenance can build WASM from the same dependency set.
Adds the batch of Phase-4 fixtures that need no scaffolding beyond a single-file `setup`. Per the long-lived-integration-branch policy, fixtures that surface non-idempotence stay in the suite as the triage queue.
Pass on first run (both DriveModes):
- code-block-fenced: code-block-generate / -render / code-highlight.
- proof: ProofSugarTransform.
- equation-labeled: EquationLabelTransform + crossref-resolve (eq).
- toc-on: toc-generate, toc-render.
- video-filter-header: built-in Lua filter under `resources/extensions/quarto/video/`.
- theme-bootstrap: compile-theme-css stage.
- table-bootstrap-class: TableBootstrapClassTransform.
- lua-shortcode-version: Lua-loaded shortcode handler (returns `quarto.version`).
In the queue:
- **lua-shortcode-lipsum-fixed**: `SingleFile` passes; the pipeline itself is idempotent. `ProjectOrchestrator` panics with `MalformedSourceInfoPool` re-parsing the AST JSON the orchestrator emitted. This is a JSON writer/reader round-trip bug specific to lipsum-shortcode-generated inlines, not a transform-determinism finding. Filed as **bd-3odjm**. The test stays red per the plan's "do not #[ignore]" rule; the integration branch is allowed to carry the failure until the queue is drained.
Verification: `cargo nextest run -p quarto-core --test idempotence` → 20 passed, 1 failed (bd-3odjm). Plan-1 unit tests and Phase-3 fixtures all green.
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-3odjm
Both pass on first run in both DriveMode variants.
- include-in-header writes a tiny header.html and references it from front matter; exercises IncludeResolveStage.
- resource-image writes a 67-byte minimal PNG and references it via inline image syntax; exercises ResourceCollectorTransform. Adds a write_bytes helper for the binary stub. Per the fixtures README rule the PNG sits at the project root and is referenced relatively (`./local.png`).
Verification: `cargo nextest run -p quarto-core --test idempotence` → 22 passed, 1 failed (bd-3odjm).
Three orchestrator-only website fixtures. Two pass, one in queue.
Pass:
- website-chrome: navbar + sidebar + page-navigation + page-footer + favicon + bootstrap-icons + canonical-url + title-prefix. Two pages (index, other), tiny favicon stub.
- website-listing: listing with categories enabled and feed: true, two posts under posts/, each with categories. Exercises listing-generate / -render, categories-sidebar, listing-feed-link, listing-feed-stage, listing-item-info.
In the queue:
- website-links: internal cross-page `.qmd` body links. Filed as bd-rz2we. Block 0 hash diverges across runs while meta hash is stable, so the divergence is genuinely in the AST blocks (not in rendered chrome). Hypothesis: link-rewrite or link-resolution is capturing the absolute project root (or canonicalized tempdir path) into the AST when it should emit a path-independent relative URL.
Verification: `cargo nextest run -p quarto-core --test idempotence` → 24 passed, 2 failed (bd-3odjm, bd-rz2we).
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-rz2we
Extends Fixture with an optional attribution_json: Option<&'static str>. When present:
- SingleFile installs PreBuiltAttributionProvider on RenderContext.attribution_provider before run_pipeline.
- ProjectOrchestrator forwards the JSON via RenderToPreviewAstRenderer::with_attribution; the renderer installs the same provider type on the per-page RenderContext it constructs internally.
Stub JSON has one actor + one run covering bytes 0..1024 (a wider range than the fixture body actually uses) so the attribution map overlaps the entire document and AttributionGenerateStage + AttributionRenderTransform have something to write into the AST.
`cargo nextest run -p quarto-core --test idempotence` → 25 passed, 2 failed (bd-3odjm, bd-rz2we; both pre-existing). attribution_basic passes on first run in both DriveModes, so the deterministic provider + generate + render stack is genuinely idempotent.
This completes the Phase 4 fixture set. The Plan-3 gate now covers:
- 1 smoke fixture
- 11 carry-forward (Phase 3, all green)
- 9 Phase-4a doc fixtures (8 green, 1 in queue)
- 2 Phase-4b multi-file (both green)
- 3 Phase-4c website (2 green, 1 in queue)
- 1 Phase-4d attribution (green)
Total: 27 fixtures, 25 green, 2 in queue.
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-3odjm (Plan 5 will fix), bd-rz2we
Adds claude-notes/instructions/idempotence-contract.md, the author-facing summary of the contract Plan 3 enforces. Covers:
- what the hash includes and excludes (source-info blind, insertion-order maps, merge_op participates, rendered.* excluded at top level only);
- what new transforms must NOT do (undefined iteration order, process-local state, absolute paths, engine cells);
- the fresh-Lua-state-per-run rule for Lua filters / shortcodes;
- how to add a fixture (doc_fixture for trivial, inline closure for multi-file, ORCHESTRATOR_ONLY for chrome, attribution_json for attribution exercises);
- the long-lived-integration-branch policy: don't #[ignore] a failing fixture without explicit user approval.
Cross-linked from:
- crates/quarto-core/tests/fixtures/idempotence/README.md (existing pointer expanded to point at the contract doc and the plan).
- claude-notes/plans/2026-05-04-q2-preview-plan-7a-user-filter-idempotence.md (References section; authors looking at the runtime user-filter check find the CI contract too).
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
cargo nextest run --workspace: 9346/9348 pass. The 2 failures are the documented queue items (bd-3odjm, bd-rz2we); every other workspace test is green, including the 25 passing idempotence fixtures.
cargo xtask verify (full WASM stack): Steps 1-4 green; Step 5 fails on the same 2 fixtures. That's the expected long-lived-integration-branch state per the plan's §Long-lived branch policy: the gate is allowed to be red until the queue is drained.
Plan 3 is complete as a deliverable: gate + hashing infrastructure + 27 fixtures + author-facing docs + filed queue. Merge to main gated on draining the queue (bd-3odjm via Plan 5; bd-rz2we via a follow-up).
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
The Work-items section under Phase 1-7 was fully checked, but the parallel "Coverage gaps to address during implementation" inventory (per-fixture bullets, line ~560+) still showed unchecked boxes even though every fixture in that list now ships in idempotence.rs. Marked all 26 inventory items as landed. Annotated the two that are in the Phase-5 triage queue (lipsum-fixed → bd-3odjm, website-links → bd-rz2we) so the queue state is also visible from the inventory, not just from the Phase-5 work-items block. Plan checklist is now fully consistent: 54 checked, 0 unchecked.
…erContext
Plan 3's website_links fixture was non-idempotent: rendered AST link
URLs captured the absolute tempdir path of the per-run TempDir,
causing block-0 hash divergence across two runs with different
tempdirs. Root cause: `ResourceResolverContext::vfs_root_mode`
played two roles via a single PathBuf — disk-write root (where
runtime.file_write puts theme CSS / copied resources) and URL
prefix (what gets embedded in HTML link/asset URLs). In production
WASM these are intentionally identical; on native they have to
diverge so writes hit a real tempdir but URLs stay path-independent.
Split the field into `{ write_root, url_root }` and add a two-arg
`vfs_root_with_url_root` constructor plus per-renderer
`with_url_root` builder. Single-arg `vfs_root(...)` constructor
preserves the WASM identity contract by construction (write_root ==
url_root). Native test helpers in tests/idempotence.rs and
tests/render_page_in_project.rs now pass
`.with_url_root("/.quarto/project-artifacts")`, so rendered URLs
embed the synthetic prefix while disk writes still land in the
tempdir.
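The field split can be pictured with the following sketch. It is a simplified, hypothetical stand-in for `ResourceResolverContext`: the struct name `ResolverRoots` and the `disk_path` / `url_for` helpers are assumptions for illustration, and only the two constructor names come from the commit message.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the resolver's root configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolverRoots {
    /// Where files are physically written (theme CSS, copied resources).
    write_root: PathBuf,
    /// Prefix embedded in rendered link and asset URLs.
    url_root: String,
}

impl ResolverRoots {
    /// Single-argument form: both roles share one root (the WASM contract).
    pub fn vfs_root(root: impl Into<PathBuf>) -> Self {
        let write_root: PathBuf = root.into();
        let url_root = write_root.to_string_lossy().into_owned();
        Self { write_root, url_root }
    }

    /// Two-argument form: writes go to `write_root`, URLs use `url_root`.
    pub fn vfs_root_with_url_root(
        write_root: impl Into<PathBuf>,
        url_root: impl Into<String>,
    ) -> Self {
        Self { write_root: write_root.into(), url_root: url_root.into() }
    }

    /// Assumed helper: on-disk location for a relative artifact path.
    pub fn disk_path(&self, rel: &str) -> PathBuf {
        self.write_root.join(rel)
    }

    /// Assumed helper: URL for the same artifact, independent of `write_root`.
    pub fn url_for(&self, rel: &str) -> String {
        format!("{}/{}", self.url_root.trim_end_matches('/'), rel)
    }
}
```

Two runs with different tempdirs but the same `url_root` then produce identical URLs, which is what removes the tempdir path from the hashed AST.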
website_links now passes; 25/26 idempotence fixtures pass. The
remaining lipsum failure is bd-3odjm (FilterProvenance wire
format), owned by Plan 5 and out of scope here. Workspace nextest:
9347/9348. cargo xtask verify (Rust leg) clean for lint/fmt/build
with -D warnings.
Plan: claude-notes/plans/2026-05-21-vfs-url-write-root-split.md
Plan 4 (SourceInfo provenance types) finalized for development:
- 7-phase work-items checklist (types → constructors → accessor updates → Lua serde → migration → tests → verification gate)
- field renamed `anchors` → `from` (typed `SmallVec<[Anchor; 1]>` from day 1; serde feature required on smallvec)
- accessor semantics for `Generated` pinned: length/start_offset/end_offset → 0, map_offset → None, resolve_byte_range / remap_file_ids / extract_file_id delegate to invocation_anchor
- required-Invocation-anchor invariant on `shortcode` kind documented with `By::shortcode` doc-comment requirement; enforcement split across Plan 6 audit test and Plan 7 debug_assert
- Lua-table discriminant pinned to `t = "Generated"`
- §Test plan and Phase 6 expanded to cover every accessor + mutator + the `combine()` × Generated corner
- migration scope corrected (15 files, 27 occurrences); references and line ranges verified against the worktree source
- §Open questions section removed (no open questions remain)
Cross-plan `from` rename swept across Plans 3, 5, 6, 7, 8.
Plan 5 JSON wire format (option D):
- outer JSON key `anchors` → `from` (matches Rust field name)
- inner anchor pool reference `from` → `si_id` (distinctive; avoids the `parent_id` tree-structure mental model that fits Substring's chain but not anchor references)
- Reader/writer code samples updated; TS-side `SourceInfoEntry` shape note updated
Plan 6 + Plan 7 hand-offs for the required-anchor invariant added.
Deferred follow-ups (Dispatch anchor, ValueSource anchor) cross-referenced as bd-36fr9 and bd-129m3 (committed separately to main).
Plan 4 work happens on top of an integration branch carrying exactly one failing test (lua_shortcode_lipsum_fixed orchestrator mode, filed as bd-3odjm). That test's root cause is the wire-format code-3 collision Plan 5 owns, so Plan 4 must not try to fix it locally.
Plan 4:
- New §"Inherited pre-existing failure (bd-3odjm)" section between Out of scope and Work items. Explains the test, the panic shape, the root cause, and that any *other* failure in the idempotence suite is a Plan-4 regression.
- Phase 7 verification gate updated: cargo nextest expects exactly one failure (bd-3odjm); cargo xtask verify trips on the same one.
Plan 5:
- New §"Inherited failure that must close on Plan 5's first reader change (bd-3odjm)" section. Spells out the contract: Plan 5's first reader change must turn lua_shortcode_lipsum_fixed green. If it doesn't, the Plan-5 author has an immediate signal that either the reader discrimination is wrong or the lipsum path produces a code-3 shape neither arm handles; stop and focus on it before moving on.
- Test plan now cites bd-3odjm as the live first-iteration smoke check, ahead of the hand-constructed tests.
Both plans now read consistently with the state of feature/provenance.
Plan 4 committed `from: SmallVec<[Anchor; 1]>` as the field type, but Plan 5's reader/writer + Plan 6's stamper code samples still used the `vec![]` macro to construct it. Those samples would not compile if taken literally: `vec!` produces a `Vec`, not a `SmallVec`.
Switch to `smallvec![]` everywhere `Generated.from` is constructed:
- Plan 5: 4 occurrences (legacy-Transformed code-3 reader; Anchor dedup test description; forward-compat test description; round-trip test description).
- Plan 6: 14 occurrences across §"Per-transform fixes", §"Lua-shortcode enrichment", §"The post-walk helper", §"Variant semantics summary" etc.
No semantic change: same constructions, just the macro that actually returns the field type.
Plan 4 + Plan 5: change Generated.from's inline capacity from SmallVec<[Anchor; 1]> to SmallVec<[Anchor; 2]> so the steady-state post-follow-up shape (Invocation + ValueSource on meta/var; Invocation + Dispatch on Lua-handler shortcodes) stays heap-free. Cost is +16 bytes per empty Generated; saves a heap allocation on every multi-anchor shortcode resolution.
Also folds in research findings that were tacit in the previous draft:
- Phase 1 smallvec line: replace "or verify present" hedge with the concrete two-file Cargo.toml edit (workspace + quarto-source-map), noting verified-absent.
- skip_serializing_if path: use the fully-qualified serde_json::Value::is_null (the short form is a frequent gotcha).
- By::raw policy: accept-all; forgery caught by Plan 6 audit + Plan 7 debug_assert, not by constructor rejection.
- Anchor ordering: append order, stable across serde, at most one anchor per known role.
- extract_file_id: empty-from Generated returns None, matching FilterProvenance's behavior; both call sites in to_ariadne_report already tolerate None. Stays a private fn on DiagnosticMessage.
- Lua serde Concat recursion: legacy "FilterProvenance" inside a Concat piece is handled automatically; no .snap/.json fixtures contain the legacy tag.
- Default risk: no struct holding SourceInfo derives Default in quarto-pandoc-types; Default for SourceInfo itself stays unchanged.
- combine() × Generated: verified unreachable today (all 17 call sites combine Original/Substring shapes); the Phase 6 test documents intent for any future caller.
- PartialEq: no production call site compares SourceInfo today; the derive is required by Block/Inline but not load-bearing.
This is very early. Creating a draft PR for CI and in case anyone is curious.
Will push as things progress. The provenance epic is Plans 3-8 of the q2-preview sequence.
Current status: first plan proving idempotence of all built-in transforms and shortcodes.
One expected failure until Plan 5 is complete, due to a latent bug in the wire format.