feature: improve provenance and make q2-preview editable by gordonwoodhull · Pull Request #231 · quarto-dev/q2

gordonwoodhull · 2026-05-22T00:00:25Z

Draft PR for CI, not working yet.

The provenance epic is Plans 3-8 of the q2-preview sequence.

Current status: plans 3-5 complete

Next up: Audit every transform that emits SourceInfo::default() (a meaningless zero-range Original) and fix it to emit correct provenance.

The hub-client-e2e.yml `paths:` filter only fires the workflow when a commit touches `hub-client/**` or the workflow file itself. It does not follow transitive Rust deps, so PRs that modify upstream crates the WASM bundle depends on (`quarto-core`, `quarto-pandoc-types`, `quarto-source-map`, `pampa`, `quarto-ast-reconcile`, `wasm-quarto-hub-client`, etc.) silently skip e2e.
Two recent misses:
- f96f56d (Carlos, 5/22): WASM-incompatible `Instant::now()` and `pollster::block_on` introduced in `quarto-core` broke 8 hub-client WASM tests on main. e2e never ran because the change was under `crates/`, not `hub-client/`.
- PR #231 (feature/provenance, this branch): 57 files modified across `crates/` and `ts-packages/`, zero under `hub-client/`. e2e silently skipped on every push despite the PR materially changing the WASM bundle's behavior.
Fix: drop the `paths:` filter outright and match the trigger shape of the sibling heavy workflows (`test-suite.yml`, `ts-test-suite.yml`). Also adds a `concurrency:` block (lifted from `test-suite.yml`) so superseded runs on a PR get cancelled in flight, which keeps the runner cost from compounding.
Closes bd-izh3. The original ask there was to add a PR trigger with a *broader* path filter; that approach still wouldn't catch the upstream-crate case, so we go the coarser route the issue's spirit calls for. The runner-sizing open question in bd-izh3 is also resolved: ae8274a confirmed `ubuntu-latest` (2 cores, 2 Playwright workers) handles the full suite in 5.3-8.1 min.
`kyoto` deliberately omitted from the branch list: `origin/kyoto` last moved 2026-02-02 and is 825 commits behind main; the sibling workflows still reference it but that's cargo-cult.

…nce) bd-izh3 closed by 016894a on feature/provenance (PR #231). The patch drops the hub-client-e2e.yml path filter outright so the workflow fires on every PR like the sibling heavy workflows — strictly broader than the original 'add PR trigger with broader filter' proposal, since path filters can never follow transitive Rust deps. Incidental: bd-cxara has its 'source_repo_path' field stripped (was a stale absolute path from shikokuchuo's local clone; harmless flush).

Audit and revise Plans 3-8 of the q2-preview series (now framed internally as the provenance epic) after a design discussion that followed the q2-preview pipeline and attribution work landing on main.
Major design changes folded into the plans:
- **Plan 4 unified Generated variant.** Collapse the earlier `Synthetic` + `Derived` split into one `Generated { by, anchors: Vec<Anchor> }` shape. Atomicity is per-`by.kind` (orthogonal to anchors); the invocation source byte range is the first anchor with role `AnchorRole::Invocation`. One wire-format code (4) instead of two.
- **Plan 4/5/6 typed anchors (Path C).** Instead of stuffing source-info chain metadata into `by.data` (dynamic JSON), the chain is a typed `Vec<Anchor>` where each `Anchor` carries an `Arc<SourceInfo>` and a role-labeled `AnchorRole` (`Invocation`, `ValueSource`, `Other(String)`). `by.data` shrinks to per-kind non-source-info configuration. Two future-anchor roles flagged as follow-ups contingent on metadata-loader and Lua-file-registration work.
- **Plan 6 uniform shortcode anchor stamping.** Single funnel covers Rust built-ins, Lua-loaded extension handlers, and user-extension shortcodes uniformly via a post-walk `stamp_shortcode_anchors` helper. Enrichment-via-post-walk preserves Lua-attached `by.data` fields (lua_path, lua_line) while promoting `by.kind` to `shortcode`. Attribution interaction documented: multi-author shortcodes get latest-wins via the existing `query_byte_range` max-time logic composed with chain-walking through the `Invocation` anchor.
- **Plan 5 latent code-3 bug now reachable.** Plans 1-2 shipped the q2-preview pipeline that runs filters whose output crosses the JSON boundary; the FilterProvenance code-3 round-trip bug is no longer latent in production. Added end-to-end production-reachability regression test using the `{{< kbd Ctrl+C >}}` fixture (kbd.lua constructs a Span that gets FilterProvenance-tagged and then shortcode-stamped). Drops code 5 from the design.
- **Plan 7 SPA edit-back in scope.** The new q2 preview CLI command serves a separate SPA from ts-packages/preview-renderer; both hub-client and the SPA share the writer machinery via @quarto/preview-runtime. Plan 7 now covers replacing `noopSetAst` in the SPA with a real handler that routes through `incrementalWriteQmd` to `syncClient.updateFileContent` and the ephemeral hub's automerge↔disk bridge. Adds a small SPA-local `DiagnosticStrip` for Q-3-42/Q-3-43; hub-client's existing diagnostics-banner handles the same warnings there. Single-file mode (bd-tnm3k) works through the same automerge stack, with no special case.
- **Plan 8 wrapper stays Original.** Explicit reasoning added for why `CustomNode("IncludeExpansion")` uses Original source_info (CustomNode.type_name carries generator identity; the wrapper substitutes 1:1 for the source-mapped Paragraph). HTML pipeline resolve transform in the Normalization Phase (symmetric with CalloutResolveTransform); HTML doesn't attribute the include line because there's no DOM anchor for it, which is accepted v1 behavior.
Mechanical changes also folded in:
- Rename `Synthetic` → `Generated` throughout the type vocabulary in all plans.
- Update JS-side hand-mirror file paths (`hub-client/src/utils/...` → `ts-packages/preview-renderer/src/utils/...`) to reflect the Phase-D package split.
- Each plan's intro reframed as part of the provenance epic; file names keep the q2-preview-plan-N form for continuity.
File renames for clarity about which filters each plan covers:
- `…plan-3-filter-idempotence.md` → `…plan-3-builtin-filter-idempotence.md`
- `…plan-7a-filter-idempotence.md` → `…plan-7a-user-filter-idempotence.md`
Plans 3-8 remain in design state on this branch; no code changes yet.
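A minimal sketch of the unified `Generated` shape described above, assuming pared-down types: `By::data` is a plain `Option<String>` standing in for the dynamic JSON, the `Original` variant carries only a file id and byte range, and the helper names `invocation_anchor` / `invocation_range` are hypothetical. It only illustrates the "first anchor with role `Invocation`" rule, not the real `quarto-source-map` definitions.

```rust
use std::sync::Arc;

/// Role label carried by each anchor (variant names taken from the plan text).
#[derive(Debug, Clone, PartialEq)]
pub enum AnchorRole {
    Invocation,
    ValueSource,
    Other(String),
}

/// Generator identity. `data` is a hypothetical stand-in for per-kind JSON config.
#[derive(Debug, Clone, PartialEq)]
pub struct By {
    pub kind: String,
    pub data: Option<String>,
}

/// One typed link back into source, labeled with its role.
#[derive(Debug, Clone, PartialEq)]
pub struct Anchor {
    pub role: AnchorRole,
    pub source_info: Arc<SourceInfo>,
}

/// Simplified provenance enum: only the two variants needed for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum SourceInfo {
    Original { file_id: usize, start: usize, end: usize },
    Generated { by: By, anchors: Vec<Anchor> },
}

impl SourceInfo {
    /// First anchor whose role is `Invocation`, if this is a `Generated` node.
    pub fn invocation_anchor(&self) -> Option<&Anchor> {
        match self {
            SourceInfo::Generated { anchors, .. } => {
                anchors.iter().find(|a| a.role == AnchorRole::Invocation)
            }
            SourceInfo::Original { .. } => None,
        }
    }

    /// Byte range of the invocation site, when the anchor points at `Original`.
    pub fn invocation_range(&self) -> Option<(usize, usize)> {
        match self.invocation_anchor()?.source_info.as_ref() {
            SourceInfo::Original { start, end, .. } => Some((*start, *end)),
            SourceInfo::Generated { .. } => None,
        }
    }
}
```

Because the role lives on the anchor rather than in `by.data`, a lookup like this needs no JSON inspection, and anchor order only matters among anchors sharing a role.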

Audit pass over the provenance epic's idempotence story, scoping Plan 3 to pipeline non-determinism only and propagating the consequences to the neighbouring plans.
Plan 3 (builtin transform and filter idempotence):
- Retitle to "Built-in transform and filter idempotence verification": symmetric across Rust transforms and Lua filters (prior framing was too narrow).
- Enumerate the actual universe under test: 36 Rust transforms in build_q2_preview_transform_pipeline (4 excluded, named with reasons), ~20 stage-level items in build_q2_preview_pipeline_stages, and the one Lua filter under resources/extensions/ (video-filter.lua). The prior "~10-20 filters" estimate misread shortcodes as filters.
- Drop the "Plan 3 strengthening" round-trip amendment that was added alongside Plan 7a in commit 2129d35. Round-trip non-idempotence is not exercised by today's pipeline; CI-time round-trip testing conflates writer-lossiness with filter-non-idempotence; 7a's runtime check is the better home for the property when Plan 7's writer ships. Trim "Two flavors" section to a pointer at 7a.
- Add compute_meta_hash_fresh / compute_meta_hash_fresh_excluding_rendered as a new helper in quarto-ast-reconcile, parallel to the existing block hasher. Hash covers blocks + meta (excluding rendered.*).
- Rewrite test pseudocode against the real run_pipeline API at pipeline.rs:626.
- Add fixture-format constraint: no executable engine cells (CI has no kernels).
- Coverage gap audit: ~25 fixtures across the document-level, Lua shortcode, website-project, attribution, and resource categories. Includes lua-shortcode-version, lua-shortcode-lipsum-fixed (non-random path), and video-filter-header for the one built-in Lua filter.
- Convert to a development-plan format with a seven-phase work-items checklist.
- Close the engine-staleness open question via filter.rs:158 (fresh Lua::new() per invocation).
- Clarify the lua-filter-pipeline reference as TypeScript Quarto porting material, not the Rust inventory.
Plan 6 (provenance audit):
- Add a §Test plan bullet for source_info determinism: Plan 3's hashes exclude source_info by design, so a per-fixture source_info-equality check is Plan 6's own responsibility.
Plan 7 (incremental writer):
- Add a writer-lossless baseline test as the first §Test plan bullet, prerequisite for the reconciler tests. Reuses Plan 3's fixture set.
- Add Plan 3 to §References and §Dependencies (soft-depends-on via compute_meta_hash_fresh).
Plan 7a (runtime user-filter idempotence):
- Remove all references to the now-deleted "Plan 3 strengthening" section (five locations including a full subsection).
- Reframe the out-of-scope bullet from "Strengthening Plan 3" to "Extending the runtime round-trip check to built-in filters," with three-point v1-acceptance reasoning in §Notes.
- Update §Design decisions, §Dependencies, and §References to reflect the new shape and the shared compute_meta_hash_fresh helper.
- Add the meta-hash comparison to step 4 of the round-trip check.
No code changes; design state only.

…ailure policy
Hash helper: `merge_op` participates (verified `MergeOp::default() = Concat` is a stable compile-time constant); `Map` entries hashed in insertion order, no sort (an idempotence test should *catch* the kind of HashMap-iteration-order non-determinism a sort would mask). Adds regression-guard unit tests for both choices.
Test runner: drives every fixture through both `DriveMode::SingleFile` (direct `run_pipeline`) and `DriveMode::ProjectOrchestrator` (`ProjectPipeline<RenderToPreviewAstRenderer>`) so orchestrator-only non-determinism (project discovery, ProjectIndex assembly, file-iteration order) is also under test. Website/chrome fixtures are orchestrator-only by design.
Failure policy: failing fixtures stay **failing**, with no auto-`#[ignore]`. Each failure files a beads issue whose description doubles as a sub-agent investigation prompt. The integration branch holds the queue; merge to main waits until drained or the user explicitly opts to ignore.
New helper `find_first_divergence` (alongside the hashers) returns `DivergencePoint::{Block { index }, MetaKey { path }, None}` so the test driver's panic message (and therefore the sub-agent prompt) arrives with a concrete starting point instead of just "hash diverged."
Orchestrator-mode `DocumentAst` extraction: researched the data flow; the typed AST is materialized inside `render_qmd_to_preview_ast` but discarded after JSON serialization. Plan recommends adding `pub ast: DocumentAst` to `PreviewAstOutput` and forwarding through `WasmPassTwoOutput`; alternatives (JSON re-parse, test-only hook) documented with their costs.
Fixture rules: no absolute process paths in fixture content (built-in extensions extract to a `temp_dir` whose path differs across CI runs; stable within a single process, which is fine for two-runs-compare but a latent issue for future stored-snapshot variants).
Smaller corrections:
- `Format::from_format_string("q2-preview")` (no `Format::q2_preview()` constructor exists);
- `apply_lua_filter` (singular) is the per-filter Lua-state-creation site, with the plural loop calling it once per filter;
- `LuaShortcodeEngine::new` is the shortcode-side analogue;
- `quarto/video` filter extension is built-in via `include_dir!(resources/extensions)` and auto-discovered by `StageContext::new`, so fixtures need no scaffolding beyond `filters: [video]` in YAML;
- `meta.rendered.includes.*` is the actual path (not `meta.includes.*`) and includes contributions from `IncludeResolveStage`, chrome render transforms, `attribution_viewer`, and Bootstrap/clipboard injection, all skipped by `compute_meta_hash_fresh_excluding_rendered`.
Stage-inventory clarifications: `MathJsStage` is excluded from q2-preview; `BootstrapJsStage` and `ClipboardJsStage` write only to `ctx.artifacts` (not to `meta` or `blocks`), so they don't affect the hash, but their q2-preview inclusion is questionable and is filed separately as bd-2ag1c.
Notes for the next traversal: `CodeHighlightStage`'s native disk scan for user grammars is OS-order-dependent (not exercised today; fixtures don't supply user grammars); lipsum's module-load `math.randomseed(os.time())` is harmless on the non-random code path the fixture exercises but should be reverified if a future variant routes through `math.random`.
Estimated scope: ~760 → ~980 lines.
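The divergence helper described in this commit can be sketched as follows, under loud simplifications: blocks are plain `String`s, meta is a flat insertion-ordered list of `(key, value)` pairs, and hashing uses the standard library's `DefaultHasher`. The real helper works on `Block` and `ConfigValue` and reports nested key paths; only the `DivergencePoint` variant names come from the plan, and `hash_of` / `not_rendered` are hypothetical helpers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Where two runs first disagree (variant names from the plan text).
#[derive(Debug, PartialEq)]
pub enum DivergencePoint {
    Block { index: usize },
    MetaKey { path: String },
    None,
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Top-level `rendered` is excluded from the comparison.
fn not_rendered(entry: &&(String, String)) -> bool {
    entry.0 != "rendered"
}

/// Simplified stand-in: blocks are strings, meta is an ordered key/value list.
pub fn find_first_divergence(
    blocks_a: &[String],
    blocks_b: &[String],
    meta_a: &[(String, String)],
    meta_b: &[(String, String)],
) -> DivergencePoint {
    // Blocks first: report the first index whose per-block hash differs.
    let shared = blocks_a.len().min(blocks_b.len());
    for index in 0..shared {
        if hash_of(&blocks_a[index]) != hash_of(&blocks_b[index]) {
            return DivergencePoint::Block { index };
        }
    }
    if blocks_a.len() != blocks_b.len() {
        return DivergencePoint::Block { index: shared };
    }
    // Then meta, walked in insertion order so key reordering is reported too.
    let mut rest_b = meta_b.iter().filter(not_rendered);
    for (key, value) in meta_a.iter().filter(not_rendered) {
        match rest_b.next() {
            Some((key_b, value_b)) if key_b == key && hash_of(value_b) == hash_of(value) => {}
            _ => return DivergencePoint::MetaKey { path: key.clone() },
        }
    }
    if let Some((key_b, _)) = rest_b.next() {
        return DivergencePoint::MetaKey { path: key_b.clone() };
    }
    DivergencePoint::None
}
```

Returning a location rather than a boolean is what lets a failing fixture's panic message name "block 0" or a meta key as the place to start investigating.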

…branch policy
Audit pass against current source. Settles every open question that remained in the prior revision and corrects factual drift.
Reuse over rebuild
- `DriveMode::ProjectOrchestrator` now delegates to the existing `render_active_page_preview` helper at `crates/quarto-core/tests/render_page_in_project.rs:660`. No fresh orchestrator wiring; no `make_website_project_ctx(...)` builder.
- `DocumentAst` extraction settled on option (a): re-parse the JSON via `pampa::readers::json::read`. source_info round-trips but the hash excludes it, so no stripping pass and no production plumbing change is required. Earlier option (b) (typed-AST plumbing through `PreviewAstOutput` / `WasmPassTwoOutput`) abandoned.
- `run_orchestrator` code sample updated: real body in place of the prior `unimplemented!("see Open questions")` stub.
Test crate location pinned
- File: `crates/quarto-core/tests/idempotence.rs`.
- Fixtures: `crates/quarto-core/tests/fixtures/idempotence/`.
- Cargo invocation in the sub-agent prompt template updated to `--test idempotence`.
Long-lived branch policy made explicit
- New `## Long-lived branch policy` section at the top.
- `## Goal` clarifies that "CI-enforced" applies when the plan lands on `main`; until then `feature/provenance` is allowed to be red while the failure queue drains.
- `### Phase 5 — Failure triage` opens with the same constraint.
Factual fixes against current source
- Transform count corrected from 36 to 37; missing `table-bootstrap-class` added to Finalization, with a fixture entry in the gap audit and Phase 4 checklist.
- `Q2_PREVIEW_STAGE_EXCLUDED` corrected to list all three exclusions (`math-js`, `render-html-body`, `apply-template`).
- `CodeHighlightStage` user-grammar scan citation moved from `pipeline.rs:644-650` to `crates/quarto-core/src/transforms/code_highlight.rs:126-129`.
- Stale line numbers refreshed throughout (pipeline.rs 1181→1198, 1220→1237, 379→380, 355→356, 626→627, 855→859, 663→664; render_page_in_project.rs 653→660; Pass2Payload::AstJson 256→254; stage/context.rs 220→221; ShortcodeResolveTransform::transform 257→513 with the correct file path).
- bd-2ag1c ordering pinned: Plan 3 lands first; bd-2ag1c follows with Plan 3's measurements in hand.
Section rename: "Open questions for implementation" → "Decisions (was: open questions)" + a `### CI failure policy & sub-agent prompt template` subsection. All internal cross-refs updated.
Estimate revised
- Scaffolding line item: ~260 → ~100 lines (reuse, not rebuild).
- `PreviewAstOutput::ast` plumbing (~20 lines) removed entirely.
- Total: ~980 → ~800 lines.
- Session count revised 2 → 2-3 with the third explicitly allocated to Phase 5 triage.

Adds the structural-hash infrastructure that Plan 3's q2-preview idempotence gate (and Plan 7a's runtime user-filter check) will sit on:
- compute_meta_hash_fresh: source-info-agnostic ConfigValue hasher. Insertion-order Map keys (no sort, so HashMap-iteration-order bugs in transforms remain detectable). MergeOp participates via its enum discriminant. Recurses into PandocInlines/PandocBlocks via the existing inline/block hashers (which already exclude source_info).
- compute_meta_hash_fresh_excluding_rendered: same, but skips the top-level `rendered` map entry. The exclusion is intentionally not propagated into recursion: a nested `rendered` key is content.
- find_first_divergence + DivergencePoint: returns the first block index whose per-block fresh hash differs, or the first insertion-order meta key path whose subtree hash differs (with the same rendered.* exclusion). The plan-sketch signature took &DocumentAst, but quarto-ast-reconcile cannot depend on quarto-core; the helper takes &[Block] + &ConfigValue and the test driver projects from DocumentAst.
- 11 new unit tests cover: same/different content, source_info/key_source agnosticism, top-level rendered exclusion, nested rendered participation, Map insertion-order sensitivity (no-sort regression guard), MergeOp sensitivity; identical/Block-mismatch/MetaKey-path/rendered-skip divergence localization.
Verification: `cargo nextest run --workspace`: 9321 passed, 196 skipped. `cargo xtask verify --skip-hub-build` steps 1–5 green (lint, fmt, Rust build with -D warnings, tree-sitter, Rust tests with -D warnings). Steps 7/10 fail with the known --skip-hub-build artifact (`wasm-quarto-hub-client` unbuilt), unrelated to these additive Rust changes.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
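To make the two hashing rules concrete (insertion-order map keys, and `rendered` skipped at the top level only), here is a hedged sketch over a hypothetical two-variant `Value` enum standing in for `ConfigValue`. It omits `MergeOp`, source info, and the Pandoc inline/block recursion, and the function names `meta_hash` / `meta_hash_excluding_rendered` are illustrative rather than the real signatures.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, pared-down stand-in for `ConfigValue`.
#[derive(Debug, Clone)]
pub enum Value {
    Scalar(String),
    /// Entries are kept, and hashed, in insertion order.
    Map(Vec<(String, Value)>),
}

fn hash_value(value: &Value, state: &mut DefaultHasher) {
    match value {
        Value::Scalar(text) => {
            0u8.hash(state); // variant tag
            text.hash(state);
        }
        Value::Map(entries) => {
            1u8.hash(state); // variant tag
            entries.len().hash(state);
            // Deliberately no sort: reordered keys must change the hash.
            for (key, child) in entries {
                key.hash(state);
                hash_value(child, state);
            }
        }
    }
}

/// Hash of the whole meta tree.
pub fn meta_hash(meta: &Value) -> u64 {
    let mut state = DefaultHasher::new();
    hash_value(meta, &mut state);
    state.finish()
}

/// Same hash, but the top-level `rendered` entry is dropped first.
/// Nested `rendered` keys still participate, since recursion is unfiltered.
pub fn meta_hash_excluding_rendered(meta: &Value) -> u64 {
    match meta {
        Value::Map(entries) => {
            let kept: Vec<(String, Value)> = entries
                .iter()
                .filter(|entry| entry.0 != "rendered")
                .cloned()
                .collect();
            meta_hash(&Value::Map(kept))
        }
        other => meta_hash(other),
    }
}
```

Sorting keys before hashing would make the hash more stable, but it would also hide exactly the iteration-order bugs the gate is meant to surface, which is why the sketch hashes entries as stored.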

Adds the test driver that Phases 3-4 will hang ~25 fixtures off. Self-contained at `crates/quarto-core/tests/idempotence.rs`.
- `DriveMode { SingleFile, ProjectOrchestrator }`. Single-file calls `run_pipeline` with `build_q2_preview_pipeline_stages`. Orchestrator drives `ProjectPipeline<RenderToPreviewAstRenderer>` via the existing `render_active_page_preview` body (copied inline because each `tests/*.rs` is its own binary).
- `Fixture { name, setup, active, modes }` + `run_fixture` runs the pipeline twice per (fixture, mode), hashes blocks via `compute_blocks_hash_fresh` and meta via `compute_meta_hash_fresh_excluding_rendered`, and on divergence panics with `find_first_divergence`'s `DivergencePoint` embedded so the panic message itself fills the plan's sub-agent investigation prompt template.
- `pandoc_to_document_ast` is the small field-shuffle that the plan identifies: orchestrator mode emits `Pass2Payload::AstJson`, which `pampa::readers::json::read` re-parses into `(Pandoc, ASTContext)`; the hasher only reads `ast.blocks` + `ast.meta` so the other `DocumentAst` fields get defaults.
- `tests/fixtures/idempotence/README.md` documents the fixture-format rules (no engine cells, no absolute paths, per-fixture mode mapping).
- `smoke_plain_paragraph` smoke fixture drives a single-paragraph document through both modes. Passing this proves the harness works end-to-end before Phases 3-4 land the real fixtures.
Verification: `cargo nextest run -p quarto-core --test idempotence` runs the new smoke test (PASS). `cargo xtask verify --skip-hub-build --skip-hub-tests` steps 1-9 green; the Phase-1 idempotence tests and this Phase-2 smoke test ran inside Step 5. Step 10 (preview-renderer integration tests in `ts-packages/preview-renderer/`) fails with the same WASM-import artifact as Step 7: both depend on `wasm-quarto-hub-client`, which `--skip-hub-build` skips. Unrelated to these Rust-only additions.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
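The run-twice shape of the driver can be sketched generically. This is not the code in `idempotence.rs`: the pipeline is a caller-supplied closure returning blocks as strings, the hypothetical `check_idempotent` returns a `Result` instead of panicking so it can be exercised in isolation, and only the `DriveMode` variant names are taken from the commit.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The two ways a fixture is driven (names from the commit message).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DriveMode {
    SingleFile,
    ProjectOrchestrator,
}

fn fresh_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Runs `pipeline` twice in `mode` and compares the hashes of both outputs.
/// On mismatch, the error names the first differing block so a reader has a
/// concrete starting point.
pub fn check_idempotent<F>(name: &str, mode: DriveMode, mut pipeline: F) -> Result<(), String>
where
    F: FnMut(DriveMode) -> Vec<String>,
{
    let first = pipeline(mode);
    let second = pipeline(mode);
    if fresh_hash(&first) == fresh_hash(&second) {
        return Ok(());
    }
    // Locate the first block that differs, or the first missing index.
    let index = first
        .iter()
        .zip(&second)
        .position(|(a, b)| a != b)
        .unwrap_or_else(|| first.len().min(second.len()));
    Err(format!(
        "fixture `{}` ({:?}): block {} diverged between runs",
        name, mode, index
    ))
}
```

A pipeline that leaks per-run state, such as a fresh temporary directory path, fails this check even though each individual run looks correct.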

Adds the existing-fixture batch the plan calls "carry-forward from prior plan draft": one fixture per Rust transform / feature that was already exercised in earlier idempotence drafts, scoped to single-file document fixtures that run in both DriveMode variants.
Coverage:
- meta-single, meta-markdown: shortcode-resolve + metadata-normalize (string and PandocInlines branches).
- include-trivial: include-expansion stage + shortcode-resolve.
- callout-warning: CalloutTransform (callout-resolve is excluded from q2-preview, so the CustomNode survives).
- theorem: TheoremSugarTransform.
- figure-ref-target: FloatRefTargetSugarTransform.
- crossref-to-theorem: crossref-index + crossref-resolve.
- sectionize-multi: SectionizeTransform across nested headers.
- footnotes-mixed: FootnotesTransform on inline + reference forms.
- appendix-license: AppendixStructureTransform with license/copyright meta and a footnote interaction.
- combined-stress: sectionize + callouts + shortcodes interacting.
A `doc_fixture(name, content)` helper collapses each single-file fixture to a one-liner; `include-trivial` keeps an inline closure because it writes two files.
All 12 idempotence tests (smoke + 11 new) pass: `cargo nextest run -p quarto-core --test idempotence` → 12 passed. No queue entries for Phase 5 from this batch; the carry-forward fixtures are all clean on first run.
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md

npm install (from repo root) and npm run build:wasm (from hub-client) updated package-lock.json and crates/wasm-quarto-hub-client/Cargo.lock on this branch. Committed so subsequent fresh checkouts of feature/provenance can build WASM from the same dependency set.

Adds the batch of Phase-4 fixtures that need no scaffolding beyond a single-file `setup`. Per the long-lived-integration-branch policy, fixtures that surface non-idempotence stay in the suite as the triage queue.
Pass on first run (both DriveModes):
- code-block-fenced: code-block-generate / -render / code-highlight.
- proof: ProofSugarTransform.
- equation-labeled: EquationLabelTransform + crossref-resolve (eq).
- toc-on: toc-generate, toc-render.
- video-filter-header: built-in Lua filter under `resources/extensions/quarto/video/`.
- theme-bootstrap: compile-theme-css stage.
- table-bootstrap-class: TableBootstrapClassTransform.
- lua-shortcode-version: Lua-loaded shortcode handler (returns `quarto.version`).
In the queue:
- **lua-shortcode-lipsum-fixed**: `SingleFile` passes; the pipeline itself is idempotent. `ProjectOrchestrator` panics with `MalformedSourceInfoPool` re-parsing the AST JSON the orchestrator emitted. This is a JSON writer/reader round-trip bug specific to lipsum-shortcode-generated inlines, not a transform-determinism finding. Filed as **bd-3odjm**. The test stays red per the plan's "do not #[ignore]" rule; the integration branch is allowed to carry the failure until the queue is drained.
Verification: `cargo nextest run -p quarto-core --test idempotence` → 20 passed, 1 failed (bd-3odjm). Plan-1 unit tests and Phase-3 fixtures all green.
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-3odjm

Both pass on first run in both DriveMode variants.
- include-in-header writes a tiny header.html and references it from front matter; exercises IncludeResolveStage.
- resource-image writes a 67-byte minimal PNG and references it via inline image syntax; exercises ResourceCollectorTransform. Adds a write_bytes helper for the binary stub. Per the fixtures README rule the PNG sits at the project root and is referenced relatively (`./local.png`).
Verification: `cargo nextest run -p quarto-core --test idempotence` → 22 passed, 1 failed (bd-3odjm).

Three orchestrator-only website fixtures. Two pass, one in queue.
Pass:
- website-chrome: navbar + sidebar + page-navigation + page-footer + favicon + bootstrap-icons + canonical-url + title-prefix. Two pages (index, other), tiny favicon stub.
- website-listing: listing with categories enabled and feed: true, two posts under posts/, each with categories. Exercises listing-generate / -render, categories-sidebar, listing-feed-link, listing-feed-stage, listing-item-info.
In the queue:
- website-links: internal cross-page `.qmd` body links. Filed as bd-rz2we. Block 0 hash diverges across runs while meta hash is stable, so the divergence is genuinely in the AST blocks (not in rendered chrome). Hypothesis: link-rewrite or link-resolution is capturing the absolute project root (or canonicalized tempdir path) into the AST when it should emit a path-independent relative URL.
Verification: `cargo nextest run -p quarto-core --test idempotence` → 24 passed, 2 failed (bd-3odjm, bd-rz2we).
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-rz2we

Extends Fixture with an optional attribution_json: Option<&'static str>. When present:
- SingleFile installs PreBuiltAttributionProvider on RenderContext.attribution_provider before run_pipeline.
- ProjectOrchestrator forwards the JSON via RenderToPreviewAstRenderer::with_attribution; the renderer installs the same provider type on the per-page RenderContext it constructs internally.
Stub JSON has one actor + one run covering bytes 0..1024 (a wider range than the fixture body actually uses) so the attribution map overlaps the entire document and AttributionGenerateStage + AttributionRenderTransform have something to write into the AST.
`cargo nextest run -p quarto-core --test idempotence` → 25 passed, 2 failed (bd-3odjm, bd-rz2we; both pre-existing). attribution_basic passes on first run in both DriveModes, so the deterministic provider + generate + render stack is genuinely idempotent.
This completes the Phase 4 fixture set. The Plan-3 gate now covers:
- 1 smoke fixture
- 11 carry-forward (Phase 3, all green)
- 9 Phase-4a doc fixtures (8 green, 1 in queue)
- 2 Phase-4b multi-file (both green)
- 3 Phase-4c website (2 green, 1 in queue)
- 1 Phase-4d attribution (green)
Total: 27 fixtures, 25 green, 2 in queue.
Refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md
- bd-3odjm (Plan 5 will fix), bd-rz2we

Adds claude-notes/instructions/idempotence-contract.md, the author-facing summary of the contract Plan 3 enforces. Covers:
- what the hash includes and excludes (source-info blind, insertion-order maps, merge_op participates, rendered.* excluded at top level only);
- what new transforms must NOT do (undefined iteration order, process-local state, absolute paths, engine cells);
- the fresh-Lua-state-per-run rule for Lua filters / shortcodes;
- how to add a fixture (doc_fixture for trivial, inline closure for multi-file, ORCHESTRATOR_ONLY for chrome, attribution_json for attribution exercises);
- the long-lived-integration-branch policy: don't #[ignore] a failing fixture without explicit user approval.
Cross-linked from:
- crates/quarto-core/tests/fixtures/idempotence/README.md (existing pointer expanded to point at the contract doc and the plan).
- claude-notes/plans/2026-05-04-q2-preview-plan-7a-user-filter-idempotence.md (References section: authors looking at the runtime user-filter check find the CI contract too).
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md

cargo nextest run --workspace: 9346/9348 pass. The 2 failures are the documented queue items (bd-3odjm, bd-rz2we); every other workspace test is green, including the 25 passing idempotence fixtures.
cargo xtask verify (full WASM stack): Steps 1-4 green; Step 5 fails on the same 2 fixtures. That's the expected long-lived-integration-branch state per the plan's §Long-lived branch policy: the gate is allowed to be red until the queue is drained.
Plan 3 is complete as a deliverable: gate + hashing infrastructure + 27 fixtures + author-facing docs + filed queue. Merge to main gated on draining the queue (bd-3odjm via Plan 5; bd-rz2we via a follow-up).
Refs: claude-notes/plans/2026-05-04-q2-preview-plan-3-builtin-filter-idempotence.md

The Work-items section under Phase 1-7 was fully checked, but the parallel "Coverage gaps to address during implementation" inventory (per-fixture bullets, line ~560+) still showed unchecked boxes even though every fixture in that list now ships in idempotence.rs. Marked all 26 inventory items as landed. Annotated the two that are in the Phase-5 triage queue (lipsum-fixed → bd-3odjm, website-links → bd-rz2we) so the queue state is also visible from the inventory, not just from the Phase-5 work-items block. Plan checklist is now fully consistent: 54 checked, 0 unchecked.

…erContext
Plan 3's website_links fixture was non-idempotent: rendered AST link URLs captured the absolute tempdir path of the per-run TempDir, causing block-0 hash divergence across two runs with different tempdirs.
Root cause: `ResourceResolverContext::vfs_root_mode` played two roles via a single PathBuf: disk-write root (where runtime.file_write puts theme CSS / copied resources) and URL prefix (what gets embedded in HTML link/asset URLs). In production WASM these are intentionally identical; on native they have to diverge so writes hit a real tempdir but URLs stay path-independent.
Split the field into `{ write_root, url_root }` and add a two-arg `vfs_root_with_url_root` constructor plus per-renderer `with_url_root` builder. Single-arg `vfs_root(...)` constructor preserves the WASM identity contract by construction (write_root == url_root). Native test helpers in tests/idempotence.rs and tests/render_page_in_project.rs now pass `.with_url_root("/.quarto/project-artifacts")`, so rendered URLs embed the synthetic prefix while disk writes still land in the tempdir.
website_links now passes; 25/26 idempotence fixtures pass. The remaining lipsum failure is bd-3odjm (FilterProvenance wire format), owned by Plan 5 and out of scope here. Workspace nextest: 9347/9348. cargo xtask verify (Rust leg) clean for lint/fmt/build with -D warnings.
Plan: claude-notes/plans/2026-05-21-vfs-url-write-root-split.md
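A hedged sketch of the write-root / URL-root split, assuming a standalone struct named `VfsRoots` (hypothetical; the real fields live on `ResourceResolverContext`) and two illustrative helpers, `write_path` and `url_for`, that are not taken from the source. The constructor names mirror the ones the commit describes.

```rust
use std::path::PathBuf;

/// Hypothetical standalone version of the split `{ write_root, url_root }` pair.
#[derive(Debug, Clone, PartialEq)]
pub struct VfsRoots {
    /// Where files are physically written (a tempdir on native).
    pub write_root: PathBuf,
    /// Prefix embedded in rendered link and asset URLs.
    pub url_root: PathBuf,
}

impl VfsRoots {
    /// Single-arg form: both roots are identical by construction (WASM case).
    pub fn vfs_root(root: impl Into<PathBuf>) -> Self {
        let root = root.into();
        Self { write_root: root.clone(), url_root: root }
    }

    /// Two-arg form: writes and URLs may diverge (native tests).
    pub fn vfs_root_with_url_root(
        write_root: impl Into<PathBuf>,
        url_root: impl Into<PathBuf>,
    ) -> Self {
        Self { write_root: write_root.into(), url_root: url_root.into() }
    }

    /// Illustrative helper: on-disk location for a relative artifact path.
    pub fn write_path(&self, relative: &str) -> PathBuf {
        self.write_root.join(relative)
    }

    /// Illustrative helper: URL embedded in rendered output for the same artifact.
    pub fn url_for(&self, relative: &str) -> String {
        let prefix = self.url_root.display().to_string();
        format!("{}/{}", prefix.trim_end_matches('/'), relative)
    }
}
```

With two runs using different tempdirs as `write_root` but the same synthetic `url_root`, the disk paths differ while every rendered URL stays byte-identical, which is the property the website_links fixture checks.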

Plan 4 (SourceInfo provenance types) finalized for development:
- 7-phase work-items checklist (types → constructors → accessor updates → Lua serde → migration → tests → verification gate)
- field renamed `anchors` → `from` (typed `SmallVec<[Anchor; 1]>` from day 1; serde feature required on smallvec)
- accessor semantics for `Generated` pinned: length/start_offset/end_offset → 0, map_offset → None, resolve_byte_range / remap_file_ids / extract_file_id delegate to invocation_anchor
- required-Invocation-anchor invariant on `shortcode` kind documented with `By::shortcode` doc-comment requirement; enforcement split across Plan 6 audit test and Plan 7 debug_assert
- Lua-table discriminant pinned to `t = "Generated"`
- §Test plan and Phase 6 expanded to cover every accessor + mutator + the `combine()` × Generated corner
- migration scope corrected (15 files, 27 occurrences); references and line ranges verified against the worktree source
- §Open questions section removed (no open questions remain)
Cross-plan `from` rename swept across Plans 3, 5, 6, 7, 8.
Plan 5 JSON wire format (option D):
- outer JSON key `anchors` → `from` (matches Rust field name)
- inner anchor pool reference `from` → `si_id` (distinctive; avoids the `parent_id` tree-structure mental model that fits Substring's chain but not anchor references)
- Reader/writer code samples updated; TS-side `SourceInfoEntry` shape note updated
Plan 6 + Plan 7 hand-offs for the required-anchor invariant added. Deferred follow-ups (Dispatch anchor, ValueSource anchor) cross-referenced as bd-36fr9 and bd-129m3 (committed separately to main).
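The pinned accessor semantics can be illustrated with a small sketch. The types are simplified assumptions: `from` is a plain `Vec` rather than the `SmallVec` the plan specifies (to stay dependency-free), `kind` replaces the full `By` struct, and the accessor signatures are guesses at a reasonable shape rather than the real ones in `quarto-source-map`.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, PartialEq)]
pub enum AnchorRole {
    Invocation,
    ValueSource,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Anchor {
    pub role: AnchorRole,
    pub source_info: Arc<SourceInfo>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum SourceInfo {
    Original { file_id: usize, start_offset: usize, end_offset: usize },
    // The plan uses SmallVec here; a Vec keeps the sketch self-contained.
    Generated { kind: String, from: Vec<Anchor> },
}

impl SourceInfo {
    /// First `Invocation` anchor of a `Generated` node, if any.
    pub fn invocation_anchor(&self) -> Option<&Anchor> {
        match self {
            SourceInfo::Generated { from, .. } => {
                from.iter().find(|a| a.role == AnchorRole::Invocation)
            }
            SourceInfo::Original { .. } => None,
        }
    }

    /// `Generated` has no extent of its own, so its length is 0.
    pub fn length(&self) -> usize {
        match self {
            SourceInfo::Original { start_offset, end_offset, .. } => end_offset - start_offset,
            SourceInfo::Generated { .. } => 0,
        }
    }

    /// Offsets inside generated content do not map back to source.
    pub fn map_offset(&self, offset: usize) -> Option<usize> {
        match self {
            SourceInfo::Original { start_offset, end_offset, .. } => {
                let mapped = start_offset + offset;
                if mapped <= *end_offset { Some(mapped) } else { None }
            }
            SourceInfo::Generated { .. } => None,
        }
    }

    /// Delegates to the invocation anchor; an empty `from` yields `None`.
    pub fn extract_file_id(&self) -> Option<usize> {
        match self {
            SourceInfo::Original { file_id, .. } => Some(*file_id),
            SourceInfo::Generated { .. } => {
                self.invocation_anchor()?.source_info.extract_file_id()
            }
        }
    }
}
```

The split between "own extent is zero" and "location questions delegate to the invocation anchor" means diagnostics can still point at the shortcode call site without pretending the generated content occupies source bytes.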

Plan 4 work happens on top of an integration branch carrying exactly one failing test (lua_shortcode_lipsum_fixed orchestrator mode, filed as bd-3odjm). That test's root cause is the wire-format code-3 collision Plan 5 owns, so Plan 4 must not try to fix it locally.
Plan 4:
- New §"Inherited pre-existing failure (bd-3odjm)" section between Out of scope and Work items. Explains the test, the panic shape, the root cause, and that any *other* failure in the idempotence suite is a Plan-4 regression.
- Phase 7 verification gate updated: cargo nextest expects exactly one failure (bd-3odjm); cargo xtask verify trips on the same one.
Plan 5:
- New §"Inherited failure that must close on Plan 5's first reader change (bd-3odjm)" section. Spells out the contract: Plan 5's first reader change must turn lua_shortcode_lipsum_fixed green. If it doesn't, the Plan-5 author has an immediate signal that either the reader discrimination is wrong or the lipsum path produces a code-3 shape neither arm handles; stop and focus on it before moving on.
- Test plan now cites bd-3odjm as the live first-iteration smoke check, ahead of the hand-constructed tests.
Both plans now read consistently with the state of feature/provenance.

Plan 4 committed `from: SmallVec<[Anchor; 1]>` as the field type, but Plan 5's reader/writer + Plan 6's stamper code samples still used the `vec![]` macro to construct it. Those samples would not compile if taken literally: `vec!` produces a `Vec`, not a `SmallVec`. Switch to `smallvec![]` everywhere `Generated.from` is constructed:
- Plan 5: 4 occurrences (legacy-Transformed code-3 reader; Anchor dedup test description; forward-compat test description; round-trip test description).
- Plan 6: 14 occurrences across §"Per-transform fixes", §"Lua-shortcode enrichment", §"The post-walk helper", §"Variant semantics summary" etc.
No semantic change: same constructions, just the macro that actually returns the field type.

Plan 4 + Plan 5: change Generated.from's inline capacity from SmallVec<[Anchor; 1]> to SmallVec<[Anchor; 2]> so the steady-state post-follow-up shape (Invocation + ValueSource on meta/var; Invocation + Dispatch on Lua-handler shortcodes) stays heap-free. Cost is +16 bytes per empty Generated; saves a heap allocation on every multi-anchor shortcode resolution. Also folds in research findings that were tacit in the previous draft: - Phase 1 smallvec line: replace "or verify present" hedge with the concrete two-file Cargo.toml edit (workspace + quarto-source-map), noting verified-absent. - skip_serializing_if path: use the fully-qualified serde_json::Value::is_null (the short form is a frequent gotcha). - By::raw policy: accept-all; forgery caught by Plan 6 audit + Plan 7 debug_assert, not by constructor rejection. - Anchor ordering: append order, stable across serde, at most one anchor per known role. - extract_file_id: empty-from Generated returns None, matching FilterProvenance's behavior; both call sites in to_ariadne_report already tolerate None. Stays a private fn on DiagnosticMessage. - Lua serde Concat recursion: legacy "FilterProvenance" inside a Concat piece is handled automatically; no .snap/.json fixtures contain the legacy tag. - Default risk: no struct holding SourceInfo derives Default in quarto-pandoc-types; Default for SourceInfo itself stays unchanged. - combine() × Generated: verified unreachable today (all 17 call sites combine Original/Substring shapes); the Phase 6 test documents intent for any future caller. - PartialEq: no production call site compares SourceInfo today; the derive is required by Block/Inline but not load-bearing.

The previous "+16 bytes per Generated" note understated the cost by ~2.5x. Actual delta:
- Anchor = AnchorRole (32 bytes; the String-bearing Other variant dominates) + Arc<SourceInfo> (8) = ~40 bytes.
- SmallVec<[Anchor; 1]> ≈ 48 bytes; SmallVec<[Anchor; 2]> ≈ 88 bytes on the stack, a 40-byte delta per SmallVec field.
- Since SourceInfo is an enum, its stack size is dictated by the largest variant, so every SourceInfo (Original/Substring/Concat too) grows by 40 bytes, not just Generated instances. Block/Inline carry SourceInfo by value, so the cost multiplies across the AST (tens-to-hundreds of KB on a large doc).
Plan keeps cap=2 (the trade is still defensible) but documents the real cost honestly and notes Arc-boxing Generated as the next lever if memory-per-node ever bites the q2-preview editor.
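
The "largest variant dictates the enum's size" point can be checked with a small std-only sketch. Everything here is a hypothetical stand-in: `FakeAnchor` is five machine words to mirror the ~40-byte figure quoted above, `InlineBuf<N>` imitates inline small-vector storage, and `Info<N>` is not the real `SourceInfo`.

```rust
use std::mem::size_of;

/// Stand-in for one anchor: 40 bytes, mirroring the figure quoted above.
/// Hypothetical; the real `Anchor` holds a role and an `Arc<SourceInfo>`.
#[allow(dead_code)]
struct FakeAnchor([u64; 5]);

/// Stand-in for a small vector with `N` inline slots plus a length word.
#[allow(dead_code)]
struct InlineBuf<const N: usize> {
    slots: [FakeAnchor; N],
    len: usize,
}

/// Simplified provenance enum: one small variant, one variant holding the buffer.
#[allow(dead_code)]
enum Info<const N: usize> {
    Original { file: usize, start: usize, end: usize },
    Generated { from: InlineBuf<N> },
}

/// Bytes added to every `Info` value (including `Original` ones)
/// when the inline capacity goes from 1 to 2.
fn growth_per_value() -> usize {
    size_of::<Info<2>>() - size_of::<Info<1>>()
}
```

Because the enum reserves room for its widest variant, `growth_per_value()` reports the extra slot even for values that are `Original`, which is why the cost multiplies across every node that stores the enum by value.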

Review of Plan 5 against `feature/provenance` @ 5bea4d0. The plan was implementable but underspecified for an unsupervised pass; this commit folds in the gaps surfaced by the review. Plan structure - Phase-ordered Work items checklist (Phase 0 start gate → Phase 7 verification gate). Phase 1 lands the bd-3odjm fix on its own; the rest of the phases compile cleanly between steps. - Phase 2 corrected: Plan 4's interim writer arm keeps `SerializableSourceMapping::FilterProvenance` and routes Generated through it via `by.as_filter().expect(...)`. Plan 5 Phase 2 removes both the variant and the interim arm together — the workspace stays buildable across the handoff. Wire-format clarifications - §"TypeScript wire-format definitions" added with explicit before/after. Code 4's `d` becomes `{ by; from? }`; code 5 (Synthetic/Derived churn) is removed entirely; code 3's `d` widens to a union for the dual-shape legacy reader. - Renamed `anchor_pool_ids` → `from` in writer pseudocode so the field name is consistent across the user-facing, writer-internal, and wire layers. Three-forms-one-name callout added to Design decisions. - Verbose-key trade-off justified explicitly. Implementation-precision adds - `arc_parent_ids` cache reuse for anchor dedup (same key shape as `Substring.parent`). - Topological intern order pinned: anchors interned before their parent Generated entry (mirrors today's Concat/Substring arms; reader's `si_id < current_index` guard requires it). - `r: [0, 0]` rule made explicit at the writer-side intern tuple, not just in JSON examples. - Anchor dedup test clarified as hand-constructed (does not depend on Plan 6 shipping the resolver). - Streaming-writer test coverage explicit (Phase 6). - bd-3odjm reopen policy: new failure modes file fresh issues. Fixes - Risk areas: the streaming-writer paragraph named the wrong functions (`write_custom_block` etc. handle CustomNodes, not the pool). Now names `to_json` + `stream_write_source_info_pool` correctly. 
- References: line numbers refreshed against the current branch; added pointer to `stream_write_source_info_pool`. - `SmallVec::<[Anchor; 1]>::new()` in reader pseudocode bumped to cap=2 to match Plan 4's recent capacity bump.
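
A minimal sketch of the topological intern order and the reader-side `si_id < current_index` guard described above. `PoolEntry`, `intern_generated`, and `first_forward_reference` are hypothetical names for illustration; the real pool entries carry more than reference indices.

```rust
/// Hypothetical, simplified pool entry: only the indices of the entries it references.
struct PoolEntry {
    refs: Vec<usize>,
}

/// Writer side: intern the anchors first, then the parent that references them.
/// Returns the parent's pool index.
fn intern_generated(pool: &mut Vec<PoolEntry>, anchor_count: usize) -> usize {
    let first_anchor = pool.len();
    for _ in 0..anchor_count {
        pool.push(PoolEntry { refs: Vec::new() });
    }
    pool.push(PoolEntry {
        refs: (first_anchor..first_anchor + anchor_count).collect(),
    });
    pool.len() - 1
}

/// Reader side: return the index of the first entry that references itself or a
/// later entry, i.e. the first violation of `si_id < current_index`.
fn first_forward_reference(pool: &[PoolEntry]) -> Option<usize> {
    pool.iter()
        .enumerate()
        .find(|(idx, entry)| entry.refs.iter().any(|&si_id| si_id >= *idx))
        .map(|(idx, _)| idx)
}
```

With this ordering a single forward pass can resolve every reference, since each referenced entry has already been decoded by the time its parent is read.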

Folds in research findings from Plan 4's pre-execution review: * Adds Phase 3 work to introduce SourceInfo::root_file_id() and SourceInfo::collect_file_ids() accessors, retiring six ad-hoc walkers across diagnostic.rs, location.rs, pipe_table.rs, section.rs, apply_template.rs, engine_execution.rs. Also fixes a latent nested-Substring bug in pipe_table.rs / section.rs that silently fell back to FileId(0). * Drops the deprecated SourceInfo::filter_provenance alias from Phase 5 — only 4 callers exist, all migrated inline in one PR. * Specifies the JSON writer's transitional Generated arm so the writer stays exhaustive while emitting the legacy code-3 payload until Plan 5 takes over wire-code 4. * Adds the test-pattern guard-form template for the empty-bind pattern-match sites in lua/diagnostics.rs and filter_tests.rs. * Updates §Risk areas, §Estimated scope, §References, §Dependencies, and Phase 7 verification gate to reflect the consolidation. Downstream plans (5, 6, 7, 7a, 8) reference only the published Plan 4 surface (Generated, By, invocation_anchor, is_atomic_kind); none touched. Status line drops "(open questions named)" — all resolved.
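
As an illustration of why a shared accessor avoids the nested-Substring fall-through, here is a std-only sketch. The enum is heavily simplified (no offsets, no `Generated`), and the method signatures are assumptions; only the names `root_file_id` and `collect_file_ids` come from the plan.

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct FileId(usize);

/// Simplified stand-in for the real enum: offsets and `Generated` are omitted.
enum SourceInfo {
    Original { file_id: FileId },
    Substring { parent: Arc<SourceInfo> },
    Concat { pieces: Vec<SourceInfo> },
}

impl SourceInfo {
    /// Recurse through any depth of `Substring` instead of inspecting one level
    /// and falling back to `FileId(0)`.
    fn root_file_id(&self) -> Option<FileId> {
        match self {
            SourceInfo::Original { file_id } => Some(*file_id),
            SourceInfo::Substring { parent } => parent.root_file_id(),
            SourceInfo::Concat { pieces } => pieces.iter().find_map(|p| p.root_file_id()),
        }
    }

    /// Gather every file id reachable from this value.
    fn collect_file_ids(&self, out: &mut BTreeSet<FileId>) {
        match self {
            SourceInfo::Original { file_id } => {
                out.insert(*file_id);
            }
            SourceInfo::Substring { parent } => parent.collect_file_ids(out),
            SourceInfo::Concat { pieces } => {
                for piece in pieces {
                    piece.collect_file_ids(out);
                }
            }
        }
    }
}
```

A walker that only matched `Substring { parent: Original }` would miss the doubly nested case; the recursive form has no such blind spot.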

Folds review findings into the Plan-5 doc: - Phase-ordered Work items checklist (Phase 0 start gate → Phase 7 verification gate). - TypeScript wire-format definitions §section with explicit before/after (code 4 narrows to `{by; from?}`, code 5 removed, code 3 widens to the dual-shape union). - Phase 2 corrected to match Plan 4's interim writer arm: Plan 5 Phase 2 removes both `SerializableSourceMapping::FilterProvenance` and Plan 4's `by.as_filter().expect(...)` arm together. - Three-forms-one-name callout for `from` across user-facing / writer-internal / wire layers. - arc_parent_ids cache reuse and topological intern order pinned explicitly in Phase 2. - Anchor dedup test clarified as hand-constructed (no Plan 6 dependency). - Streaming-writer test coverage made explicit. - bd-3odjm reopen policy: new failure modes file fresh issues. - Risk areas: fixed wrong function names (was naming CustomNode emit paths, not pool emit paths). - References: line numbers refreshed against current branch.

Adds `SourceInfo::Generated { by: By, from: SmallVec<[Anchor; 2]> }` as the unified provenance variant covering filter constructions, shortcode resolutions, sectionize/footnotes/appendix wrappers, title-block h1, and tree-sitter postprocess spaces. Removes the `FilterProvenance` variant. Type surface in `quarto-source-map`: - `By { kind: String, data: serde_json::Value }` with builders for the ten known kinds (`filter`, `sectionize`, `user_edit`, `shortcode`, `include`, `title_block`, `footnotes`, `appendix`, `tree_sitter_postprocess`, `raw`) plus `is_atomic_kind`, `is_kind`, `as_filter`. - `Anchor { role: AnchorRole, source_info: Arc<SourceInfo> }` with `Invocation` / `ValueSource` / `Other(String)` roles. - `SourceInfo` accessors: `generated`, `invocation_anchor`, `value_source_anchor`, `anchors_with_role`, `append_anchor`, `root_file_id`, `collect_file_ids`. - `resolve_byte_range` for `Generated` delegates to the first `Invocation` anchor and recurses; `remap_file_ids` walks every anchor's source_info via `Arc::make_mut`. File-id walker consolidation: - Six ad-hoc walkers retired (`diagnostic.rs`, `location.rs`, `pipe_table.rs`, `section.rs`, `apply_template.rs` test, `engine_execution.rs` test) onto the two new `SourceInfo` methods. Fixes a latent nested-Substring `FileId(0)` fall-through in `pipe_table.rs` and `section.rs` along the way. Lua serde extension: - `source_info_to_lua_table` / `_from_lua_table` gain a `Generated` arm with `t = "Generated"`, `by` sub-table (`data` JSON-encoded for Lua transit) and `from` array. The legacy `"FilterProvenance"` tag is still accepted by the reader and folds to `Generated { by: filter, from: [] }`; writers never emit it. Migration: - All 27 `SourceInfo::FilterProvenance` pattern sites and 4 non-source-map `filter_provenance(...)` constructor callers migrated. Renamed `test_filter_provenance_tracking` → `test_filter_generated_tracking` with updated assertions. 
- JSON writer emits legacy code-3 from filter-kind `Generated` so bd-3odjm's expected failure mode is preserved until Plan 5 ships wire-code 4. Verification: - `cargo build --workspace`: clean. - `cargo nextest run --workspace --no-fail-fast`: 9370 passed, 1 failed (`quarto-core::idempotence::lua_shortcode_lipsum_fixed` = bd-3odjm Plan-5 baseline). No other regressions. - `cargo xtask verify --skip-rust-tests`: all 12 steps green (Rust build + hub-client npm install/build/WASM/tests + q2-preview SPA build). - Grep gates green: zero hits for `SourceInfo::FilterProvenance`, `SourceInfo::filter_provenance`, or the retired walker functions. `"FilterProvenance"` string appears only in the legacy Lua reader arm. Dependencies: - Adds `smallvec = "1.13"` with `serde` feature to the workspace, consumed by `quarto-source-map` and `pampa`. - Adds `serde_json` to `quarto-source-map`'s regular deps (was dev-only). Adds ~30 unit tests in `quarto-source-map/src/source_info.rs` covering the new By/Anchor surface, every `Generated` accessor arm, JSON round-trips, and the `combine() × Generated` structural case. Adds a back-compat regression test in `pampa/src/lua/diagnostics.rs` for the legacy `"FilterProvenance"` Lua tag. Plan: claude-notes/plans/2026-05-04-q2-preview-plan-4-source-info-types.md
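
The type surface and the `Generated` accessor behaviour described in this commit can be sketched roughly as follows. This is a simplified model under stated assumptions: `Vec` stands in for `SmallVec<[Anchor; 2]>`, a `String` stands in for `serde_json::Value`, file ids are bare `usize`, and the return type of `resolve_byte_range` is invented for the example.

```rust
use std::sync::Arc;

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AnchorRole {
    Invocation,
    ValueSource,
    Other(String),
}

#[derive(Debug, Clone, PartialEq)]
struct Anchor {
    role: AnchorRole,
    source_info: Arc<SourceInfo>,
}

/// Assumption: `data` is a plain string here; the real field is a JSON value.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct By {
    kind: String,
    data: String,
}

#[derive(Debug, Clone, PartialEq)]
enum SourceInfo {
    Original { file_id: usize, start: usize, end: usize },
    /// Assumption: `Vec` instead of an inline small vector.
    Generated { by: By, from: Vec<Anchor> },
}

impl SourceInfo {
    /// First anchor whose role is `Invocation`, if any.
    fn invocation_anchor(&self) -> Option<&Anchor> {
        match self {
            SourceInfo::Generated { from, .. } => {
                from.iter().find(|a| a.role == AnchorRole::Invocation)
            }
            SourceInfo::Original { .. } => None,
        }
    }

    /// `Generated` has no extent of its own, so its length is 0.
    fn length(&self) -> usize {
        match self {
            SourceInfo::Original { start, end, .. } => end - start,
            SourceInfo::Generated { .. } => 0,
        }
    }

    /// `Generated` delegates to its Invocation anchor and recurses.
    /// Returns (file_id, start, end); this tuple shape is hypothetical.
    fn resolve_byte_range(&self) -> Option<(usize, usize, usize)> {
        match self {
            SourceInfo::Original { file_id, start, end } => Some((*file_id, *start, *end)),
            SourceInfo::Generated { .. } => {
                self.invocation_anchor()?.source_info.resolve_byte_range()
            }
        }
    }
}
```

A shortcode-kind value with one Invocation anchor resolves to the token's bytes, while a filter-kind value with an empty `from` resolves to nothing, which matches the "empty-from returns None" behaviour noted for `extract_file_id`.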

Adds an "Implementation surprises" section to Plan 4 capturing the six divergences from plan-as-written that came up during landing:
- `gen` is a reserved keyword in the Rust 2024 edition (affects Plan 7's `preimage_in` pseudocode too; flagged for amendment)
- Phase 1's "compiles cleanly" applies to quarto-source-map only; workspace stays red until Phase 5
- `extract_filename_index` was tests-only, so it was deleted entirely rather than kept as a shim
- `anchors_with_role` had to use `Box<dyn Iterator>` instead of `impl Iterator` (mismatched concrete iterator types per arm)
- `cargo xtask verify` also dirties `crates/wasm-quarto-hub-client/Cargo.lock`
- bd-3odjm behaved exactly as the plan predicted (non-surprise worth recording as a positive datapoint for plan accuracy)
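
The `Box<dyn Iterator>` surprise is easy to reproduce in isolation: each `match` arm yields a different concrete iterator type, so a single `impl Iterator` return type cannot cover both. The sketch below uses simplified, hypothetical types (a `label` field instead of a real `source_info`) and an assumed signature.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AnchorRole {
    Invocation,
    ValueSource,
    Other(String),
}

struct Anchor {
    role: AnchorRole,
    /// Stand-in payload so the example has something to observe.
    label: &'static str,
}

enum SourceInfo {
    Original,
    Generated { from: Vec<Anchor> },
}

impl SourceInfo {
    /// One arm returns `Filter<slice::Iter<..>, _>`, the other `iter::Empty<..>`.
    /// Boxing erases the difference so both arms share one return type.
    fn anchors_with_role<'a>(
        &'a self,
        role: &'a AnchorRole,
    ) -> Box<dyn Iterator<Item = &'a Anchor> + 'a> {
        match self {
            SourceInfo::Generated { from } => {
                Box::new(from.iter().filter(move |a| &a.role == role))
            }
            SourceInfo::Original => Box::new(std::iter::empty()),
        }
    }
}
```

The allocation is one small box per call; an alternative that keeps `impl Iterator` would be to iterate over an empty slice in the non-`Generated` arm so both arms produce the same type.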

Replace SourceInfo::default() at each enumerated synthesizer site with the appropriate Generated { by: By::<kind>(), from: [] } shape (or threaded Original for theorem/proof title-attr).
- title_block.rs:create_title_header → Generated { by: title_block() } on both the Header and inner Str.
- pampa sectionize.rs → Generated { by: sectionize() } on the Section Div (both close-on-stack and end-of-input sites).
- footnotes.rs:create_footnotes_section → Generated { by: footnotes() } on the container Div, HorizontalRule, and OrderedList.
- appendix.rs → Generated { by: appendix() } on the container Div plus the four structurally-identical helpers (wrap_bibliography, create_license_section, create_copyright_section, create_citation_section) that the plan body didn't enumerate but which are mechanically the same fix.
- theorem.rs / proof.rs:extract_name_attr → thread &div.attr_source through; index before kvs.remove("name"); use attr_source.attributes[idx].1. Plan-suggested debug_assert_eq! is too strict (fires on the common AttrSourceInfo::empty() test pattern with non-empty kvs); relaxed to "empty OR equal" so empty AttrSourceInfo signals "no provenance" rather than a bug.
- pampa postprocess.rs:1348 synthetic Space → Generated { by: tree_sitter_postprocess() }.
All 9448 workspace tests pass.
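
The `extract_name_attr` ordering ("index before remove") and the relaxed "empty OR equal" assertion can be sketched as below. The container types are assumptions made for the example: `kvs` is modelled as a `Vec` of pairs and `AttrSource` as a parallel list of optional (key span, value span) pairs; the real `AttrSourceInfo` layout may differ.

```rust
/// Hypothetical byte span (start, end).
type Span = (usize, usize);

/// Hypothetical stand-in for `AttrSourceInfo`: one (key span, value span) pair per attribute.
struct AttrSource {
    attributes: Vec<(Option<Span>, Option<Span>)>,
}

/// Remove the `name` attribute and return its value plus the value's span, if known.
fn extract_name_attr(
    kvs: &mut Vec<(String, String)>,
    attr_source: &AttrSource,
) -> Option<(String, Option<Span>)> {
    // Relaxed invariant: an empty attribute source means "no provenance",
    // otherwise it must line up one-to-one with the key/value list.
    debug_assert!(
        attr_source.attributes.is_empty() || attr_source.attributes.len() == kvs.len()
    );
    // Find the index first, while it still lines up with `attr_source`.
    let idx = kvs.iter().position(|(key, _)| key == "name")?;
    let span = attr_source.attributes.get(idx).and_then(|pair| pair.1);
    // Only now remove the entry.
    let (_, value) = kvs.remove(idx);
    Some((value, span))
}
```

Looking up the span after the removal would shift every later index by one, so the lookup has to happen first.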

…h anchor Two new research plans extending the provenance epic past Plan 8. Adopt the `provenance-plan-N-<slug>.md` naming convention (drop the `q2-preview-` prefix); the epic has outgrown the original framing. Plan 9 — ValueSource threading for metadata-derived content (~860 LOC) - Phase 1: `config_value_to_inlines_with_provenance` + `DocumentProfile.title_source_info` + `AppendixSection` enum - Phase 2: meta/var shortcode two-anchor shape (closes bd-129m3) - Phase 3: DocumentProfile.title → nav-text ValueSource (closes bd-8pmq3) - Phase 4: appendix per-section sub-Div ValueSource (option A; `By::appendix(AppendixSection)` typed enum) - Phase 5: Plan-7 deferred invariant tests (preimage_in role-asymmetry; appendix-license e2e round-trip) - Phase 6: Plan 7 §asymmetry wording cleanup Plan 10 — Dispatch anchor + Lua source registration (~1100 LOC) - AnchorRole::Dispatch (diagnostic-only; inherits Plan 9's `AnchorRole::Other` policy) - SourceContext extension for Lua filter / handler files (FileIds + content) - Lua engine bridge: thread FileId into closure context; resolve `debug.getinfo()` to typed source_info - `By::filter` signature shrinks to nullary; path/line move to Dispatch anchor - Lua-handler shortcode gets `from: [Invocation, Dispatch]` - Wire-format migration with dual-reader window - Cache-key extension; coordinates with Plan 7a's filter_sources_hash - Closes bd-36fr9 Both marked as research plans pending API-surface finalization; subsequent review pass converts to development plans with checklisted phases. No code changes.

Adds 12 new tests covering Plan 6's invariants: Per-transform shape tests (one per fixed synthesizer): - sectionize: synthesized Div carries Generated{by:sectionize, from:[]} on both close-paths; wrapped Header keeps its original source_info. - title_block: synthesized Header + inner Str both carry Generated{by:title-block}. - footnotes: synthesized container Div (and its HorizontalRule chrome) carry Generated{by:footnotes}. - appendix: synthesized container Div carries Generated{by:appendix}. Shortcode tests (shortcode_resolve.rs): - shortcode_resolution_has_generated_with_invocation_anchor: resolved Str shape, including by.data.name and Invocation source. - multi_inline_shortcode_resolution_shares_invocation_source: multi- inline + nested Strong[Str] all share the same Invocation source_info. - escaped_shortcode_keeps_original_source_info: token's Original source_info, not Generated. - unknown_shortcode_error_uses_token_source_info: both Strong + inner Str carry the token's Original. - shortcode_resolution_required_anchor_invariant: no Generated{by: shortcode, from: []} survives. - shortcode_resolution_is_deterministic: two runs produce ==-equal ASTs (every Generated.by, every Generated.from[], every Original). Lua enrichment test (lua_integration sub-module, native-only): - lua_shortcode_typed_return_enriched_to_shortcode_kind: a typed pandoc.Str return that started life as Generated{by:filter, ...} via filter_source_info gets promoted to Generated{by:shortcode, data: {name, lua_path, lua_line}, from:[Invocation]}. Deferred tests documented in the plan checklist (attribution, pipeline- level audit, include composition, writer round-trip — owned by Plans 7/8 or the e2e test crate). All 9460 workspace tests pass.

…pl checklist Thorough rewrite incorporating all decisions from the Plan 7 review session. Removes inconsistencies left by multiple revisions; presents the plan as a single coherent design. Adds an 83-item implementation checklist across 9 phases. Key changes from the previous version: API decomposition - Writer is pipeline-agnostic; caller supplies baseline AST. - `incremental_write_qmd` signature becomes `(original_qmd, baseline_ast_json, new_ast_json) → { qmd, warnings }`. - No `pipeline_kind` parameter. Pipeline tier is implicit in whichever baseline AST the caller passes. - Decomposition framed as parse / transform / reconcile / write primitives; writer is just the last one. Coarsen pseudo-code fixes - KeepBefore catch-all falls through to Rewrite, not Omit (data-loss-shaped fallback removed). - Inline-level soft-drop substitutes via `before_idx` from the alignment, not anchor-matching (which would fail for user-edit inlines that lack Invocation anchors). - Multi-inline dedupe equality criterion stated as PartialEq on the Invocation anchor's source_info. Unified editability predicate - `is_editable_inside(node, target)` consulted by Plan 2A (React-side read-only check) AND the writer's coarsen (soft-drop logic). Three reasons content is uneditable: atomic CustomNode, atomic-kind Generated, or no preimage in target file. - Non-atomic synthesized containers (sectionize, footnotes, appendix) gain read-only treatment via the no-preimage clause. Soft-drop catalog expansion - Adds two new soft-drop cases for no-preimage Generated containers: RecurseIntoContainer and UseAfter both substitute Omit + Q-3-43 warning. Let-user-win stays for atomic CustomNodes only (they have menu-driven replacement affordance + preimage). - Q-3-43 widens to "Generated content edit dropped" — three emission paths (include, metadata-derived RecurseIntoContainer, metadata-derived UseAfter), one diagnostic code, structured body that names the source. 
Migration plan (new section) - WASM signature, TS wrapper, sync-client interface, three consumer sites enumerated with before/after. - q2-demos require no state changes: sync-client's `astCache` already maintains both `source` and `ast`; baseline supplied from `cached.ast`. - All in one PR; no back-compat shim (3 in-repo consumers). SPA integration - Content-match echo-prevention specified: hash emitted qmd, compare in onFileContent, suppress local echoes without losing unrelated file updates. - `pipelineKindForFormat` moves from hub-client to `ts-packages/preview-runtime/src/pipelineKind.ts` (the SPA needs it for the display path). Tests deferred to Plan 9 - `preimage_in` role-asymmetry end-to-end test and appendix-license round-trip test land in Plan 9 Phase 5 (need a real ValueSource consumer). - Plan 7 keeps unit-level `preimage_in` tests that don't depend on Plan 9's stamping. `AnchorRole::Other` policy stated explicitly: future anchor roles default to non-walked by the writer. Inherited by Plans 9 and 10. `AtomicViolation` removed entirely (soft-drop replaces it). Stale line references corrected (`block_source_span` 447→448; `inline_source_span` named at 800). Earlier-draft historical framing edited out; the plan now reads as a single coherent design. Implementation checklist - Phase 1: foundation primitives (quarto-source-map, quarto-core) - Phase 2: writer internals (pampa::writers::incremental) - Phase 3: diagnostic catalog (Q-3-42, Q-3-43) - Phase 4: WASM bridge signature change - Phase 5: TypeScript wrapper + sync-client interface - Phase 6: consumer migrations (ReactPreview + 2 demos) - Phase 7: q2-preview SPA integration - Phase 8: end-to-end tests - Phase 9: verification + cleanup Estimated scope: ~1390 LOC (up from 1310; consumer migration and SPA echo-prevention added a bit despite pipeline-plumbing removal). No code changes.

- cargo xtask verify: all 12 steps green (build, tests, lint, fmt, WASM, hub-client, q2-preview-spa). 9460 workspace tests pass.
- End-to-end exercise: target/debug/q2 render on a fixture exercising title-block / sectionize / footnotes / appendix / meta shortcode. Generated HTML inspected; synthesizers and shortcode resolver produce correct output. Recorded in the plan body.
- crates/wasm-quarto-hub-client/Cargo.lock picks up the new smallvec dep on quarto-core (transitive through the WASM build).

…rch plans Plan 7 rewrite incorporates all decisions from the review session: - API decomposition: writer is pipeline-agnostic; caller supplies baseline AST. `incremental_write_qmd` signature becomes `(original_qmd, baseline_ast_json, new_ast_json) → { qmd, warnings }`. No `pipeline_kind` parameter. - Unified `is_editable_inside` predicate consulted by both Plan 2A (React-side read-only check) and the writer's coarsen (soft-drop). Non-atomic synthesized containers gain read-only treatment via the no-preimage clause. - Soft-drop catalog expanded: no-preimage Generated containers soft-drop on both RecurseIntoContainer and UseAfter; Q-3-43 widens to "Generated content edit dropped" with three emission paths. - Coarsen KeepBefore catch-all falls to Rewrite (not Omit) — data-loss-shaped fallback removed. - Multi-inline dedupe via PartialEq on Invocation anchor source_info. - Inline-level soft-drop substitutes via `before_idx` from the alignment (not anchor-matching, which would fail for user-edit inlines). - AtomicViolation variant removed entirely. - SPA integration: content-match echo-prevention; `DiagnosticStrip` component; `pipelineKindForFormat` moves to ts-packages/preview-runtime. - Migration plan covers all 3 in-repo consumers (ReactPreview, kanban, hub-react-todo) + sync-client interface. q2-demos require no state changes — sync-client's astCache already holds the baseline. - 83-item implementation checklist across 9 phases. Plans 9 and 10 (new research plans, `provenance-plan-N-<slug>.md` naming convention): - Plan 9 — ValueSource threading for metadata-derived content; closes bd-129m3, bd-8pmq3, and the unowned appendix-license obligation. Owns Plan 7's deferred role-asymmetry e2e test. - Plan 10 — Dispatch anchor + Lua source registration; closes bd-36fr9. Inherits Plan 9's `AnchorRole::Other` policy. `By::filter` signature shrinks (path/line move to Dispatch anchor). 
Both Plans 9 and 10 marked as research plans; API surface settled, implementation order to be pinned in subsequent review pass. No code changes.

Five refinements surfaced in the post-merge "any further thoughts" review and now folded back into the plans. Plan 7 (q2-preview-plan-7-incremental-writer.md): - **Scope clarification — first-demo UX**: lifting the coarse read-only guard exposes the writer's soft-drop warnings as the primary safeguard. A fine-grained React-side editability gate (per-region greying via a TS `is_editable_inside`-equivalent) is deferred to a future frontend pass. For the first demo, "you can type, but it doesn't take, and you see a warning" is the deliverable. Plan 2A's existing atomic-CustomNode gate continues to prevent the most surprising cases without further work. - **Q-3-43 catalog mechanics verified**: catalog entries carry one static `message_template`; per-call-site body text uses the existing `DiagnosticMessageBuilder` builder pattern. No template-able-body infrastructure needed. Phase 3 ships one catalog entry per code + three builder helper functions. - **Phase 9 `q2 preview` WASM rebuild chain**: explicit three-step refresh (`npm run build:wasm` → `cargo xtask build-q2-preview-spa` → `cargo build --bin q2`) added as sub-bullets. References the 2026-05-20 stale-WASM incident and the canonical instructions in `CLAUDE.md` §"Verifying Rust changes in `q2 preview`". - **Coordination posture**: the 83-item checklist is sized for serial implementation in a single fresh 1M-context session. No beads-per-phase split needed; follow-ups only for surprises. Plan 10 (provenance-plan-10-dispatch-anchor.md): - **Wire format: clean break, not dual reader**. Per "emphasis on clean design" guidance — the codebase is workspace-internal Rust with no on-disk artifacts holding the old shape. Phase 6 becomes a one-PR break (writers emit the new shape; old shape removed entirely). Same rationale as `By::appendix` (Plan 9) and `By::filter` (Phase 4 above). Removed the related open question about legacy-shape-decoder behavior. Estimated scope for Phase 6 drops from ~150 to ~80 LOC. 
- **Phase 7 reuses Plan 7a's `filter_sources_hash`**: Plan 7a lands first per user direction. Plan 10's Phase 7 reduces to a smoke test confirming the existing hash field invalidates on Lua filter file changes. Estimated scope drops from ~80 to ~30 LOC. Plan-7a-vs-Plan-10 coordination friction is now resolved (in the risk-areas section).
- **Total Plan 10 estimate drops from ~1100 to ~980 LOC** as a result of these two simplifications.
Plan 7a (q2-preview-plan-7a-user-filter-idempotence.md):
- **Plan 10 cross-reference refreshed**: the "Plan 4 / Plan 6's Dispatch follow-up" reference becomes "Plan 10 (`claude-notes/plans/2026-05-22-provenance-plan-10-dispatch-anchor.md`)". Acknowledges Plan 10 now exists as a numbered plan.
- **Coordination note added**: `filter_sources_hash` is Plan 7a's field; Plan 10 reuses it. Migration to Dispatch-anchor-based per-Lua-line attribution is purely additive when Plan 10 lands.
- **Structural independence noted**: Plan 7a reads filter path from `FilterMetadata.spec`, not from any Generated node's `by.data`, so it's structurally independent of `By`'s data shape. Plan 10's clean break on `By::filter` doesn't affect Plan 7a's diagnostic emission.
No code changes.

…y, editability gate) Phase 1 lands the writer-side primitives Plan 7's coarsen will consume. All purely additive: no existing API changes; no Generated nodes produced anywhere yet — coarsen still emits the pre-Plan-7 shape. `SourceInfo::preimage_in(target: FileId) -> Option<Range<usize>>` - Original: Some(range) iff file matches target. - Substring: recurse parent; offsets compose additively. - Concat: every piece must resolve into target AND be byte-contiguous; gappy or mixed-file Concats return None. - Generated: walk Invocation anchor only via invocation_anchor(). ValueSource (Plan 9), future Dispatch (Plan 10), and Other roles are diagnostic-only — never consulted by the writer's byte-copying path. Documented on both preimage_in and AnchorRole::Other doc-comments. Atomic CustomNode registry — moved to `quarto-pandoc-types` - Plan 7 originally placed `ATOMIC_CUSTOM_NODES` / `is_atomic_custom_node` in `quarto-core`, but `quarto-core` depends on `pampa` and the writer is the consumer — that direction would cycle. Moved down to `quarto-pandoc-types` (the home of `CustomNode` itself). - Lockstep cross-check test in `quarto-core::crossref` pins `CROSSREF_RESOLVED_REF` against the registry literal. Editability gate — `pampa::writers::incremental` - `is_editable_inside_block(block, target_file_id) -> bool` - `is_editable_inside_inline(inline, target_file_id) -> bool` - Shared private `is_editable_inside_source_info` core. - Three uneditable reasons: atomic CustomNode, atomic-kind Generated, no-preimage Generated (covers ValueSource-only Generated as a consequence of preimage_in's Invocation-only walk). Reconciler source-info-blindness — extended `quarto-ast-reconcile` - Five new tests covering the Plan-4/6 Generated shapes: Generated-with-different-By, Generated-with-different-anchor-lists, mixed Invocation/ValueSource, CustomNode wrapper Generated-vs-Original, CustomNode slot-child Generated-vs-Original. 
Existing Original-only blindness tests already covered the pre-Plan-6 cases. Tests added (all green): - 16 preimage_in tests in quarto-source-map - 3 atomic-registry tests in quarto-pandoc-types + 1 lockstep in quarto-core - 12 editability tests in pampa::writers::incremental - 5 reconciler-blindness tests in quarto-ast-reconcile Verification: `cargo nextest run --workspace` (9509 tests) and `cargo xtask verify` (full 12-step chain including WASM build + hub-client tests) both green. Plan refs: - claude-notes/plans/2026-05-04-q2-preview-plan-7-incremental-writer.md §"`preimage_in` semantics", §"Unified editability predicate", §"`is_atomic_custom_node` registry", Phase 1 checklist. Snapshot test changes: none.
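
The `preimage_in` rules listed in this commit (Original, Substring, Concat, Generated) can be sketched as a std-only model. The enum shapes are simplified assumptions: `Substring` offsets are taken to be relative to the parent, `Vec` replaces the small vector, `By` is omitted, and an empty `Concat` is assumed to have no preimage.

```rust
use std::ops::Range;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileId(usize);

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AnchorRole {
    Invocation,
    ValueSource,
}

#[derive(Debug, Clone)]
struct Anchor {
    role: AnchorRole,
    source_info: Arc<SourceInfo>,
}

#[derive(Debug, Clone)]
enum SourceInfo {
    Original { file_id: FileId, start: usize, end: usize },
    /// Assumption: `start`/`end` are offsets relative to the parent.
    Substring { parent: Arc<SourceInfo>, start: usize, end: usize },
    Concat { pieces: Vec<SourceInfo> },
    Generated { from: Vec<Anchor> },
}

impl SourceInfo {
    fn preimage_in(&self, target: FileId) -> Option<Range<usize>> {
        match self {
            // Original: a range only when the file matches.
            SourceInfo::Original { file_id, start, end } => {
                (*file_id == target).then(|| *start..*end)
            }
            // Substring: offsets compose additively on top of the parent's start.
            SourceInfo::Substring { parent, start, end } => {
                let base = parent.preimage_in(target)?.start;
                Some(base + start..base + end)
            }
            // Concat: every piece must resolve and abut the previous one.
            SourceInfo::Concat { pieces } => {
                let mut iter = pieces.iter();
                let mut acc = iter.next()?.preimage_in(target)?;
                for piece in iter {
                    let next = piece.preimage_in(target)?;
                    if next.start != acc.end {
                        return None; // gap or overlap: not byte-contiguous
                    }
                    acc.end = next.end;
                }
                Some(acc)
            }
            // Generated: only the Invocation anchor is walked.
            SourceInfo::Generated { from } => from
                .iter()
                .find(|a| a.role == AnchorRole::Invocation)?
                .source_info
                .preimage_in(target),
        }
    }
}
```

Because only the Invocation role is consulted, a `Generated` value carrying just a ValueSource anchor has no preimage, which is what makes it uneditable under the no-preimage clause.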

…lti-inline dedupe

Coarsen's classification rewires per Plan 7's cascade and gains soft-drop substitutions. Bad edits no longer abort the write; they substitute a safe alignment and push a Q-3-42 / Q-3-43 warning.

CoarsenedEntry variants (Phase 2a)
- `Transparent { child_entries }`: non-atomic Generated wrappers with source-bearing children. Wrapper contributes nothing; children emit through a recursive `emit_entries` helper that shares `prev_entry` state across the wrapper boundary so separators compose naturally.
- `Omit`: atomic-kind Generated with no Invocation anchor (filter constructions, title-block synthesis, tree-sitter postprocess space), and soft-drop substitutions for no-preimage Generated containers.
- `Verbatim::orig_idx` / `InlineSplice::orig_idx` widened to `Option<usize>` so Transparent's children opt out of the original-gap separator optimization (they aren't top-level blocks).

Signature changes (Phase 2b)
- `incremental_write` and `compute_incremental_edits` return `Result<(String, Vec<DiagnosticMessage>), Vec<DiagnosticMessage>>`. Warnings on Ok (soft-drop); Err reserved for genuine structural failures from `assemble_inline_splice`.
- `coarsen` accepts `&mut Vec<DiagnosticMessage>` warning sink.
- WASM bridge `incremental_write_qmd` threads warnings into the existing `AstResponse.warnings` channel via `diagnostics_to_json`.

KeepBefore cascade (Phase 2c)
- `preimage_in(target)` present → Verbatim (covers Original, Substring, contiguous Concat, Generated-via-Invocation).
- Atomic-kind Generated with no Invocation → Omit + debug_assert against shortcode-with-empty-from (Plan 6 stamper invariant).
- Non-atomic Generated with source-bearing children → Transparent recurse over the wrapper's children.
- Catch-all → Rewrite. Cross-file Original, gappy Concat, Generated wrapper without source-bearing children.

UseAfter + RecurseIntoContainer soft-drops (Phase 2d)
- UseAfter on atomic CustomNode: let-user-win Rewrite (no warning); the qmd writer's CustomNode arm reads `plain_data`.
- UseAfter on no-preimage Generated: Omit + Q-3-43; no source position to anchor a Rewrite at; original container regenerates next run.
- RecurseIntoContainer with `!is_editable_inside_block` → Verbatim wrapper bytes (if preimage) or Omit (no preimage) + Q-3-43.

Inline-level soft-drop + multi-inline dedupe (Phase 2e)
- Two-phase `assemble_inline_content`: first applies soft-drop substitutions (UseAfter / RecurseIntoContainer targeting a non-editable original inline → KeepBefore at the positional proxy + Q-3-42); second emits, with consecutive KeepBefore entries whose Invocation anchors are PartialEq-equal collapsing to a single emission of the anchor's preimage bytes.
- Singleton-KeepBefore inline emit path updated to use `preimage_in(target_file_id)` (with `inline_source_span` fallback). Original-SI inlines are byte-identical to the old behavior; Generated-SI inlines now emit the Invocation anchor's preimage instead of an empty range, which fixes a latent zero-length bug in the pre-Plan-7 inline-splice path.

Diagnostic catalog (Phase 3a)
- `Q-3-42` "Shortcode edit dropped": inline-level soft-drop on atomic-Generated content. Source location: the Invocation anchor's source_info (the token bytes).
- `Q-3-43` "Generated content edit dropped": block-level soft-drop. Single catalog `message_template`; the three emission paths supply distinct body text via the builder API (per Plan 7 §"Catalog mechanics").
- Builder helpers `diagnostic_q3_42_inline(inline)` and `diagnostic_q3_43_block(block)` live in `pampa::writers::incremental` (not `quarto-error-reporting`, which doesn't depend on `quarto-pandoc-types`).

Tests added
- 12 Plan-7-specific coarsen unit tests in `coarsen_plan7_tests`: KeepBefore cascade variants (Verbatim, Omit, Transparent, Rewrite catch-all), UseAfter let-user-win + soft-drop, RecurseIntoContainer soft-drop variants, multi-inline dedupe positive / negative / ValueSource cross-talk, inline UseAfter Q-3-42 path.

Test-caller migrations
- All `.expect("incremental_write failed")` and `.unwrap()` callsites in `inline_splice_*` and `incremental_writer_tests.rs` updated to destructure the new `(qmd, warnings)` tuple.

Verification
- `cargo nextest run --workspace`: 9535 passed.
- `cargo xtask verify`: all 12 steps green (lint, fmt, build, Rust tests, hub-client WASM build + tests, trace-viewer, q2-preview-spa, shared packages).

Deferred follow-ups
- Writer-lossless baseline test for each Plan-6/Plan-7 Generated shape (needs crafted fixtures; Plan 7 checklist item).
- Soft-drop interaction test (shortcode + non-atomic edit in same Para).
- Filter-construction soft-drop-on-edit test.

Snapshot test changes: none.

Plan refs:
- claude-notes/plans/2026-05-04-q2-preview-plan-7-incremental-writer.md §"Coarsen pseudo-code", §"Inline-level soft-drop", §"Multi-inline dedupe", §"Diagnostic codes", §"Catalog mechanics".
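The KeepBefore cascade from Phase 2c can be read as a first-match-wins decision function. The sketch below is only an illustration under simplifying assumptions: `BlockFacts`, `Coarsened`, and `classify_keep_before` are hypothetical names, and the boolean fields stand in for checks the real code performs on `SourceInfo` (such as `preimage_in(target)`).

```rust
/// Hypothetical summary of what the cascade inspects on an original block.
/// These fields are stand-ins, not the real pampa types.
#[derive(Debug, Clone, Copy)]
struct BlockFacts {
    has_preimage_in_target: bool,
    is_generated: bool,
    is_atomic_kind: bool,
    has_invocation_anchor: bool,
    has_source_bearing_children: bool,
}

/// Simplified outcome of the cascade (payloads omitted).
#[derive(Debug, PartialEq, Eq)]
enum Coarsened {
    Verbatim,
    Omit,
    Transparent,
    Rewrite,
}

/// First matching rule wins, in the order the commit message lists them.
fn classify_keep_before(f: BlockFacts) -> Coarsened {
    if f.has_preimage_in_target {
        // Original bytes exist in the target file: copy them through.
        return Coarsened::Verbatim;
    }
    if f.is_generated && f.is_atomic_kind && !f.has_invocation_anchor {
        // Synthesized atomic content with nothing to anchor to.
        return Coarsened::Omit;
    }
    if f.is_generated && !f.is_atomic_kind && f.has_source_bearing_children {
        // Wrapper contributes nothing; recurse into its children.
        return Coarsened::Transparent;
    }
    // Cross-file Original, gappy Concat, childless Generated wrapper, etc.
    Coarsened::Rewrite
}
```

Ordering matters in this sketch: a Generated block reached via an Invocation anchor has a preimage, so it is caught by the first rule and never falls through to `Omit`.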

Repo-level facts that bit during Phase 1-3 implementation, surfaced in the Phase 4 handoff prompt and folded back into the plan so future readers don't re-discover them:
- Phase 4 §Repo facts: wasm-quarto-hub-client is NOT in the cargo workspace; build via `cd hub-client && npm run build:wasm` or via `cargo xtask verify` step 6. `AstResponse.warnings` is `Option<Vec<JsonDiagnostic>>`; convert via `diagnostics_to_json` taking `&SourceContext` (access `ASTContext.source_context`).
- Phase 2 §Repo facts for test fixtures: `AttrSourceInfo` doesn't implement `Default` (use `AttrSourceInfo::empty()`). `gen` is a Rust 2024 reserved keyword; don't name a `SourceInfo::Generated` variable `gen`.

These are workspace-internal facts that survive context boundaries and help any future implementer in this codebase.

The hub-client-e2e.yml `paths:` filter only fires the workflow when a commit touches `hub-client/**` or the workflow file itself. It does not follow transitive Rust deps, so PRs that modify upstream crates the WASM bundle depends on (`quarto-core`, `quarto-pandoc-types`, `quarto-source-map`, `pampa`, `quarto-ast-reconcile`, `wasm-quarto-hub-client`, etc.) silently skip e2e.

Two recent misses:
- f96f56d (Carlos, 5/22): WASM-incompatible `Instant::now()` and `pollster::block_on` introduced in `quarto-core` broke 8 hub-client WASM tests on main. e2e never ran because the change was under `crates/`, not `hub-client/`.
- PR #231 (feature/provenance, this branch): 57 files modified across `crates/` and `ts-packages/`, zero under `hub-client/`. e2e silently skipped on every push despite the PR materially changing the WASM bundle's behavior.

Fix: drop the `paths:` filter outright and match the trigger shape of the sibling heavy workflows (`test-suite.yml`, `ts-test-suite.yml`). Also adds a `concurrency:` block (lifted from `test-suite.yml`) so superseded runs on a PR get cancelled in flight, which keeps the runner cost from compounding.

Closes bd-izh3. The original ask there was to add a PR trigger with a *broader* path filter; that approach still wouldn't catch the upstream-crate case, so we go the coarser route the issue's spirit calls for.

The runner-sizing open question in bd-izh3 is also resolved: ae8274a confirmed `ubuntu-latest` (2 cores, 2 Playwright workers) handles the full suite in 5.3-8.1 min.

`kyoto` deliberately omitted from the branch list: `origin/kyoto` last moved 2026-02-02 and is 825 commits behind main; the sibling workflows still reference it but that's cargo-cult.

Phase 4, WASM bridge (`incremental_write_qmd`):
- Signature now `(original_qmd, baseline_ast_json, new_ast_json)`.
- Internal `qmd_to_pandoc` re-parse removed; baseline is deserialized via `pampa::readers::json::read`, preserving any host-side provenance the caller attached (e.g. `preimage_in` from a prior incremental edit).
- `AstResponse.warnings` populated via `diagnostics_to_json(&warnings, &baseline_context.source_context)`.

Phase 5, TS wrappers + sync-client:
- `wasmRenderer.ts` `incrementalWriteQmd` returns `{ qmd, warnings? }`; baseline accepted as parsed AST or pre-serialized JSON string.
- `.d.ts` files updated (preview-runtime + hub-client); add `warnings?` to `AstResponse`.
- `quarto-sync-client` `astOptions.incrementalWriteQmd` widened; `client.ts` passes `cached.ast` as baseline and reads `result.qmd`.
- `pipelineKind.ts` (+ test) moved to `ts-packages/preview-runtime/` and re-exported; `ReactPreview.tsx` imports it from the package. SPA does not import it yet (Phase 7).

Phase 6, Consumers:
- `ReactPreview.handleSetAst`: read-only guard removed; passes the displayed `ast` state as baseline; soft-drop warnings (Q-3-42 / Q-3-43) are stashed in a ref and merged into the next `onDiagnosticsChange` push so they surface in the existing diagnostics panel.
- Both demos (`kanban`, `hub-react-todo`): `wasm.ts`, `useSyncedAst.ts`, and the local `.d.ts` shim updated to the new signature/return shape.

Verification: `cargo xtask verify` green (all 12 steps including WASM rebuild and hub-client tests); 9535 Rust tests pass.

Plan 7 (`claude-notes/plans/2026-05-04-q2-preview-plan-7-incremental-writer.md`) phases 4-6 checkboxes flipped to done.

…trip

`PreviewApp.tsx`:
- `noopSetAst` replaced with `handleSetAst` that calls `incrementalWriteQmd(originalQmd, baselineJson, newAst)` using the active page's qmd + the current `astJson` as the baseline. Stable callback identity via `activeFileRef` / `astJsonRef` so the iframe's postMessage listener doesn't re-bind on every render.
- Content-match echo-prevention: hash the emitted qmd with FNV-1a, stash `(path, hash)` in `lastEmittedRef`, and the next `onFileContent` for that path whose content hashes equal is silently dropped (consumes the ref). Avoids the SPA re-rendering off its own write and racing follow-up edits. Hash-algorithm rationale is in the `fnv1aHex` docstring.
- Soft-drop warnings (Q-3-42 / Q-3-43) accumulate into `writeWarnings` state and surface via `<DiagnosticStrip>`.

`components/DiagnosticStrip.tsx` (new):
- Inline-styled fixed strip in the bottom-right of the preview pane; matches the existing component convention (no separate CSS file).
- `suppressAfterThree` helper caps each `(code, source-range)` group at 3 entries per Plan 7 §"Autosave-context spam mitigation" so every-keystroke renders don't flood the surface.
- Catalog title + problem text are rendered verbatim; Phase 3's catalog entries are already imperative ("edit the invocation token in source instead").

Verification: `cargo xtask verify --skip-rust-tests` green (all 12 steps including hub-client build/test and q2-preview-spa build).

Plan 7 Phase 7 checkboxes flipped to done.
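The content-match echo-prevention described above lives in TypeScript (`fnv1aHex`, `lastEmittedRef`); the following is a minimal Rust rendering of the same idea, for illustration only. It assumes the 32-bit FNV-1a variant (the commit message does not state the width), and `EchoGuard`, `record_write`, and `is_echo` are hypothetical names.

```rust
/// 32-bit FNV-1a over the UTF-8 bytes of `text`.
/// The 32-bit width is an assumption; the source only says "FNV-1a".
fn fnv1a32(text: &str) -> u32 {
    let mut hash: u32 = 0x811c_9dc5; // FNV offset basis
    for byte in text.bytes() {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193); // FNV prime
    }
    hash
}

/// Hypothetical stand-in for the `lastEmittedRef` bookkeeping.
#[derive(Default)]
struct EchoGuard {
    last_emitted: Option<(String, u32)>,
}

impl EchoGuard {
    /// Remember the path and content hash of the qmd we just wrote.
    fn record_write(&mut self, path: &str, qmd: &str) {
        self.last_emitted = Some((path.to_string(), fnv1a32(qmd)));
    }

    /// Returns true when incoming file content is our own write coming back.
    /// A match consumes the record, so only one echo is swallowed.
    fn is_echo(&mut self, path: &str, content: &str) -> bool {
        let matched = matches!(
            &self.last_emitted,
            Some((p, h)) if p.as_str() == path && *h == fnv1a32(content)
        );
        if matched {
            self.last_emitted = None;
        }
        matched
    }
}
```

Consuming the record on a match is what keeps a later, genuinely external write with identical content from being dropped as well.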

Phase 8 subset (landed now):
- `hub-client/src/services/incrementalWrite.wasm.test.ts`: vitest test against the real WASM bridge, pinning the 3-arg signature (`original_qmd, baseline_ast_json, new_ast_json`), the `{ qmd, warnings? }` return shape, identity round-trip byte-equality, paragraph-edit text propagation with surrounding structure preserved, and structured-error reporting on malformed baseline JSON. 3/3 passing under `npm run test:wasm`.
- Plan 3's idempotence test re-run as part of `cargo xtask verify` (9535/9535 Rust tests).

Phase 9 (verification):
- WASM chain refreshed end-to-end: `npm run build:wasm` → `cargo xtask build-q2-preview-spa` → `cargo build --bin q2`
- `q2 preview /tmp/plan7-smoke` boot smoke: server came up, SPA rendered the fixture, confirmed in the user's browser (session 2026-05-24).
- Full `cargo xtask verify` already green from Phase 4-6 / 7 commits.

Deferred to `bd-3izo3`:
- The Playwright e2e scenario matrix (sectionized round-trip, single & multi-inline shortcode preservation, Q-3-42 byte-equal-no-op, Q-3-43 footnotes regeneration, SPA edit-paragraph in project + single-file modes, SPA DiagnosticStrip on shortcode edit, mixed atomic + non-atomic, content-match echo-prevention fixture).
- The Rust-side soft-drop matrix is already covered in `crates/pampa/src/writers/incremental.rs`; the deferred work is end-to-end *delivery* coverage, not new correctness coverage.

Plan 9's Phase 5 (deferred Plan-7 invariant tests) gets a status note recording that Plan 7 landed on `feature/provenance` 2026-05-24, so when Plan 9 picks up, the dependency is visibly unblocked. Plan 7 closes its corresponding Phase-9 checkbox — the test reference is already in Plan 9 Phase 5, both plans cross-link correctly.

Plan 7's session left four code-side test items deferred (three Rust unit tests where the original plan author hedged about fixture construction, one Playwright e2e scenario matrix scoped out for context budget). On reassessment all four are mechanical follow-throughs, not research.

Plan 7b consolidates them into one deliberate test pass with four phases:
1. Rust unit tests (writer-lossless baseline, soft-drop interaction, filter-construction UseAfter)
2. Hub-client Playwright specs (5 scenarios)
3. SPA Playwright specs (5 scenarios)
4. Cleanup (close bd-3izo3, flip Plan 7's deferred checkboxes)

Intended to run before Plan 7a so the writer's round-trip contract is fully pinned before runtime idempotence detection layers on top. Plan 7 gets forward pointers from each deferred checkbox to its new home in Plan 7b.

Adds claude-notes/designs/provenance-contract.md as the reference doc for transform authors emitting SourceInfo. Captures the Plan-6 conventions that survived implementation: four-branch decision tree for picking Original / Generated{from:[]} / Generated{from:[Invocation]} / leave-alone, the By:: constructor catalog with atomicity flags, the enrichment-via-post-walk pattern (kind promotion, by.data migration, anchor appending) with stamp_shortcode_anchors as the reference implementation, the AttrSourceInfo positional-alignment threading recipe with the relaxed debug_assert! footgun called out, atomic-kind consumer impact, required-anchor invariants, the call-site-threading outlier pattern (make_error_inline / shortcode_to_literal), and a do-not list. Mirrors the structure and tone of claude-notes/designs/document-profile-contract.md. Links out to Plans 4-8 and the Plan-6 audit report rather than re-explaining their content; names follow-up beads (bd-129m3 ValueSource, bd-36fr9 Dispatch, bd-12vrr callout, bd-1inj0 codeblock chrome, bd-3aolj / bd-1e6a5 parser alignment) without designing them.

Plan 7c closes four correctness/coverage items where the post-review Plan-7 doc and the actual landed implementation diverged because the implementation agent ran before the review-pass merge:
1. Q-3-41 "Edit dropped (render not ready yet)": catalog entry + TS-side emission from both ReactPreview and the SPA so the first-edit-before-render case no longer drops silently.
2. TS-side hasPreimageIn + isEditableInside: the predicate pair that closes Plan 2A's framework gate so edits that the writer would soft-drop are intercepted at the DOM. The atomicity-only gate at framework/dispatch.tsx:404-411 is updated to consult the unified editability predicate; preimage_in walks Invocation only, per the writer contract.
3. cfg(debug_assertions) #[should_panic] test for the shortcode-Generated-with-empty-from debug-assert at incremental.rs:448.
4. Per-kind soft-drop test symmetry: explicit Omit + inline-UseAfter tests for filter / title-block / tree-sitter-postprocess kinds; multi-inline dedupe filter case.

Six phases, ~530 LOC. No new design surface; the plan is disjoint from Plan 7b's test-o-rama scope and reuses existing diagnostic / context infrastructure throughout.

…ppers in RecurseIntoContainer

Editing inside a `q2 preview` document silently failed with `Incremental write failed: undefined` whenever the post-render pipeline wrapped the whole document in a top-level `Generated{by: sectionize}` Div. The reconciler aligned 1 Div : 1 Div as RecurseIntoContainer; `coarsen`'s Plan 7 soft-drop guard then fired on the wrapper (`is_editable_inside_block` is false for a Generated with no preimage) and emitted `Omit` for the entire document, producing an empty qmd + one Q-3-43 warning.

Mirror the existing Transparent path that `coarsen_keep_before_block` uses for unchanged wrappers: when the orig container is a non-atomic Generated with source-bearing children AND `block_container_plans` has a nested plan for this index, recurse `coarsen_blocks` on the children and wrap in `Transparent`. Children carry `orig_idx: None` (children-relative, not top-level) via a new normalizer.

Refactor: split `coarsen` into a thin Pandoc-aware entry that derives `target_file_id` and a slice-based `coarsen_blocks` that the new `coarsen_children` reuses. Existing soft-drop tests (no-recursable-children case) still hit the last-resort `Omit` + Q-3-43 path.

Tests:
- `crates/pampa/tests/incremental_writer_tests.rs`: `sectionize_wrapper_with_inner_para_edit_produces_nonempty_output`
- `hub-client/e2e/q2-preview-render-components-write.spec.ts` drives the real user flow (q2 preview + comment.tsx +react picker) and is the deterministic browser repro the diagnosis was built on.

Bonus: `ts-packages/preview-runtime/src/wasmRenderer.ts` throw site now distinguishes empty-qmd from real-error and surfaces the warning count instead of literal "undefined".

…_file_id (Plan 7c Phase 8)

Closes Plan 7c Phase 8. `coarsen` derived `target_file_id` from `original_ast.blocks.first().and_then(|b| b.source_info().root_file_id())`, falling back to `FileId(0)` when the first block was a synthesized container (title-block, sectionize wrapper, footnotes / appendix container), all of which carry `SourceInfo::Generated` with no `Invocation` anchor, so `root_file_id()` returns `None`. On single-file fixtures the qmd happens to live at `FileId(0)` and the fallback was coincidentally correct, masking the bug.

Replace the call with a `derive_target_file_id` helper that walks `blocks` depth-first, descending through `block_block_children` (Div / BlockQuote / Figure / NoteDefinitionFencedBlock), returning the first `Some(root_file_id())` it sees. Descent matters for the sole-top-level-sectionize-wrapper shape too: without it, `preimage_in(FileId(0))` would return `None` for every real block inside the wrapper, all editability checks would fail, and the RecurseIntoContainer path would soft-drop the user's edit with a Q-3-43 even when the qmd is genuinely a single file.

Tests:
- `target_file_id_skips_synthesized_first_block`: synthesized title-block at `blocks[0]`, real Para at `blocks[1]` carrying `FileId(7)`. Pre-fix: identity reconcile with an inline edit on the real Para fires Q-3-43 (Para is gated non-editable because `preimage_in(FileId(0))` on a `FileId(7)`-Original returns `None`). Post-fix: no warning.
- `target_file_id_defaults_to_zero_for_empty_document`: pins the `FileId(0)` fallback for the genuinely-empty AST.

…[0] is a synthesized wrapper `emit_metadata_prefix` read `original_ast.blocks[0].source_info().start_offset()` to locate the boundary between the YAML frontmatter region and the first user block. For the post-q2-preview-pipeline AST whose `blocks[0]` is a synthesized sectionize Div (Generated, no Invocation anchor), that offset is 0, so the function concluded "no metadata region" and silently deleted the entire frontmatter from the output. The fix in bdcfdc5 (Transparent recursion into the wrapper for `RecurseIntoContainer`) unmasked this second-order bug: edits now round-trip, but the frontmatter vanishes. Add a `first_target_anchored_start_in` helper that walks `blocks` depth-first, descending through `block_block_children` for blocks that have no preimage of their own, returning the first start offset that DOES have preimage in the target file. Sole-block sectionize wrappers (sectionize, footnotes, appendix) yield their children's first real start instead of `0`. Use it from `emit_metadata_prefix`, paired with the `derive_target_file_id` helper from b9f64b5 so a non-FileId(0) qmd is handled too. Regression test: `sectionize_wrapper_preserves_frontmatter_after_inner_edit` — wraps the user-reported repro shape (frontmatter + sectionize-wrapped Header + Para with an EditComment append) and asserts both the frontmatter and the spliced reaction land in the output.

…tern + cross-link plans

The three sectionize-wrapper bugs of 2026-05-25 (`bdcfdc53` / `b9f64b56` / `2bf92664`) were three rediscoveries of the same fact: code that asks `original_ast.blocks[0]` for source-position information assumes a flat AST, but the post-q2-preview-pipeline AST wraps everything in a top-level synthesized container. Each fix grew its own ad-hoc descent helper. This commit names the pattern (*transparent wrapper*) and lifts the descent into one reusable walker, so the next caller doesn't have to rediscover it.

Code changes:
- New `first_in_user_tree<T>(blocks, extract)`: walks blocks depth-first, descending through `block_block_children` when `extract` returns `None`. This is the descent primitive both earlier helpers were re-implementing.
- New `is_transparent_wrapper(block, target)` predicate: structurally checkable per-block (Generated, no Invocation anchor, block-container shape, has source-bearing descendants). No registration / opt-in: a Lua filter that wraps user content in a Div with the right shape is automatically transparent.
- `derive_target_file_id` and `first_target_anchored_start_in` reduce to one-liners on top of `first_in_user_tree`. Net code decrease.

Design doc `claude-notes/designs/transparent-wrappers.md` pins the contract: the three structural conditions, the known synthesizers, the "where to use which" table, the "where the code lives + when to promote it" rule, an anti-pattern catalog, and a history table for the three originating bugs.

Plan cross-references (annotation only, no scope change):
- Plan 9 (`title_source_info`): invariant note: extractor runs pre-sugar today; if moved past sectionize or extended with a "first H1" fallback, must use `first_in_user_tree`.
- Plan 8 (IncludeExpansion): note that the wrapper's `Original` source_info is what keeps it from being a transparent wrapper; debug-assert recommended if a future variant emits `Generated`.
- Plan 10 (Dispatch / Lua filters): new sub-section "Lua filters that wrap user content" describes the implicit editing contract (emit the right shape, the visual editor sees through it; no registration needed).
- Plan 7a (filter idempotence): Q-3-44 hint can detect walked-into-the-wrapper authoring errors via `is_transparent_wrapper(blocks[0])`.
- Plan 7b (test-o-rama): gap noted: existing writer-lossless fixtures assume a flat AST; add a sectionize-wrapper-at-top variant.
- Project-replay engine: annotation that the flat walk is only safe because the splice runs pre-sugar.
- Plan 7c: reference link to the new design doc.
- `provenance-contract.md` §8: sibling cross-link to the new doc; producer-side catalog of wrapper kinds is here, consumer-side descent rule is there.

Tests: all 1570 pampa tests still pass; the Playwright e2e (the deterministic browser repro for the first bug) still passes against the rebuilt WASM.
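The descent primitive and the two helpers built on it can be sketched as follows. This is a simplified illustration, not the pampa implementation: the `Block` struct is a hypothetical stand-in whose `file_id`, `start_offset`, and `children` fields approximate `source_info().root_file_id()`, a preimage start in the target file, and `block_block_children` respectively, and the helper signatures are reduced accordingly.

```rust
/// Hypothetical, flattened stand-in for a Pandoc block.
#[derive(Debug, Default)]
struct Block {
    file_id: Option<u32>,        // None for synthesized wrappers
    start_offset: Option<usize>, // None when the block has no preimage
    children: Vec<Block>,        // block-level children, if any
}

/// Depth-first walk: ask `extract` about each block, and only descend
/// into its children when `extract` returns `None` for the block itself.
fn first_in_user_tree<T, F>(blocks: &[Block], extract: &F) -> Option<T>
where
    F: Fn(&Block) -> Option<T>,
{
    for block in blocks {
        if let Some(found) = extract(block) {
            return Some(found);
        }
        if let Some(found) = first_in_user_tree(&block.children, extract) {
            return Some(found);
        }
    }
    None
}

/// First real file id in the tree, falling back to 0 for an empty document.
fn derive_target_file_id(blocks: &[Block]) -> u32 {
    first_in_user_tree(blocks, &|b: &Block| b.file_id).unwrap_or(0)
}

/// First start offset that belongs to user-authored content.
fn first_target_anchored_start_in(blocks: &[Block]) -> Option<usize> {
    first_in_user_tree(blocks, &|b: &Block| b.start_offset)
}
```

With a sole synthesized wrapper at the top level, both helpers see through it and report the first child's values, which is the behavior the three originating bug fixes each needed.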

…when the rewrite produces byte-identical output

The Plan 7 soft-drop diagnostic surface (Q-3-42 / Q-3-43 warnings from the incremental writer) relied on one delivery path: `handleSetAst` stored warnings in `pendingWriteWarningsRef`, then the next render's `doRenderWithStateManagement` drained the ref into its merged diagnostics push.

That path silently dropped warnings in the common soft-drop case. When the writer faithfully preserves the original bytes (the correct behaviour for an edit it had to reject, e.g. typing inside a `{{< lipsum 3 >}}` resolution), `incrementalWriteQmd` returns warnings AND byte-identical output. `handleContentRewrite` in `useAutomergeSync.ts` then computes `diffToMonacoEdits(old, new)`, gets an empty edit list, and skips `executeEdits`. Monaco's `onChange` never fires; automerge doesn't update; no re-render happens; `pendingWriteWarningsRef.current` stays full forever. The user sees no signal that their edit was declined.

Add an *immediate* push path alongside the ride-along:
1. `handleSetAst`, on warnings, now calls `onDiagnosticsChange` directly, merged with the most recent render-side diagnostics (tracked in a new `lastRenderDiagnosticsRef`, updated wherever `doRenderWithStateManagement` calls `onDiagnosticsChange`).
2. The existing `pendingWriteWarningsRef` ride-along stays as a safety net for the rare case where a re-render *does* fire after a write that produced warnings (typically when the writer chose Rewrite over soft-drop). It's a no-op for the byte-identical path.

After this fix, clicking +react on a paragraph inside `{{< lipsum 3 >}}` produces a visible Q-3-43 in the diagnostic panel: "Generated content edit dropped — This content has no editable source position in this file; edit its upstream definition (an include, a metadata key, or other source) instead."

Verified by manual repro; the existing e2e (the deterministic browser test for the earlier empty-qmd bug) is unaffected.

…ed + close atomic-Generated UseAfter soft-drop gap

The architectural change: every CoarsenedEntry variant now carries everything it needs to produce its bytes. Rewrite previously held `new_idx: usize`, an index into new_ast.blocks that was only correct at the top level. When coarsen_blocks ran inside the Transparent recursion added in bdcfdc5 for the changed-wrapper case, the index pointed at a child-relative position while emit_entries looked it up against new_ast.blocks top-level. The result was an index-out-of-bounds panic at incremental.rs:890 on +react edits inside shortcode-resolved content when the framework's atomic-aware NOOP gate was bypassed for UX testing ("the len is 1 but the index is N").

Rewrite now carries `block_text: String`, pre-computed at coarsen time via the same write_block_to_string call emit_entries used to make. The call is referentially transparent (verified: no global state in writers/qmd.rs; fresh QmdWriterContext per invocation; no I/O, no clock), so this is a shape change, not a semantics change: every existing test stays byte-identical. The shape matches InlineSplice, which has carried block_text since ab10f37. All four producer sites updated; coarsen_keep_before_block became fallible (`Result<CoarsenedEntry, Vec<DiagnosticMessage>>`) since it now calls the writer; both call sites use `?`.

A second bug, masked by the panic, surfaced during Phase 3 verification when the user tested the lipsum +react flow with the new fix in place. The BlockAlignment::UseAfter arm filtered atomic-CustomNode and no-preimage-Generated but had no branch for atomic-Generated-WITH-preimage. When the reconciler split the inline edit on an atomic-shortcode paragraph into KeepBefore (Header) + UseAfter (new lipsum), an implicit deletion of the original, the new block's source_info still carried the token's Invocation anchor, but UseAfter fell through to let-user-win Rewrite. With the architectural fix that no longer panics, it instead silently wrote the resolved bytes (the resolved Lorem ipsum + the user's reactji) back into the source qmd, poisoning the user's source.

The new UseAfter branch detects atomic-Generated with preimage and emits Verbatim of the token range + Q-3-43, mirroring the soft-drop cascade already in RecurseIntoContainer. The general pattern: when an entry's *new* block looks like an attempt to edit content the user can't actually edit, refuse the edit at the writer regardless of what the reconciler's alignment said.

Tests pin both shapes:
- sectionize_wrapper_with_shortcode_child_edit_does_not_panic: the architectural Rewrite shape, asserts no panic.
- sectionize_wrapper_shortcode_child_edit_soft_drops: the UseAfter soft-drop, asserts on output bytes (token preserved, reactji NOT emitted) and Q-3-43 fired.

Verification: cargo nextest -p pampa (3902/3902); cargo xtask verify --skip-hub-build --skip-hub-tests (9656/9656 Rust); hub-client npm run build:wasm + VITE_E2E=1 build; Playwright q2-preview-render-components-write (1 passed); ts-packages/preview-renderer integration tests (9 files, 165 tests). User confirmed lipsum-paragraph +react flow in browser: no panic, Q-3-43 surfaced, source qmd preserved.

Note: the diagnostic surfacing itself depends on a companion fix to quarto-error-reporting (separate commit), which makes render_ariadne_source_context gracefully degrade when file reads aren't supported (WASM).

See claude-notes/plans/2026-05-25-coarsened-entry-self-contained.md for the plan; claude-notes/designs/incremental-writer-internals.md for the contract this work pins ("every variant carries enough information to produce its emit bytes without further context").
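The "every variant carries enough information to produce its emit bytes" contract can be illustrated with a reduced model. This sketch makes several simplifying assumptions: `Entry` is a hypothetical cut-down version of `CoarsenedEntry`, `Verbatim` holds a plain byte range, and separator handling (`prev_entry`) is left out entirely.

```rust
/// Hypothetical, reduced model of the coarsened entries.
enum Entry {
    /// Copy original bytes in `start..end` through unchanged.
    Verbatim { start: usize, end: usize },
    /// Text rendered at coarsen time; no index into the new AST is kept.
    Rewrite { block_text: String },
    /// Wrapper contributes nothing; its children are emitted in place.
    Transparent { children: Vec<Entry> },
    /// Emit nothing.
    Omit,
}

/// Emission needs only the entries and the original text, so it behaves
/// the same at the top level and inside a Transparent recursion.
fn emit_entries(entries: &[Entry], original: &str, out: &mut String) {
    for entry in entries {
        match entry {
            Entry::Verbatim { start, end } => out.push_str(&original[*start..*end]),
            Entry::Rewrite { block_text } => out.push_str(block_text),
            Entry::Transparent { children } => emit_entries(children, original, out),
            Entry::Omit => {}
        }
    }
}
```

Because `Rewrite` owns its text, there is no child-relative index to misread against a top-level block list, which is the failure mode the commit describes.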

…t when file read fails (WASM) render_ariadne_source_context panicked with "Failed to read file '…': operation not supported on this platform" whenever it tried to fetch source bytes for a file-backed location in WASM. The existing code unwrapped std::fs::read_to_string with a panic message, which works on native (the read genuinely succeeds) but crashes in WASM (no real filesystem; the call returns an Err unconditionally). Until now the panic was unreachable in practice — soft-drop diagnostics carrying file-backed Generated locations either weren't reaching the renderer (an upstream panic short-circuited the path) or weren't being surfaced in WASM contexts. The Q-3-43 soft-drop emitted from the UseAfter arm of the incremental writer (companion commit on this branch) now reliably surfaces, so the disk-read path actually runs inside q2-preview's iframe. Change: when std::fs::read_to_string fails, return None instead of panicking. The diagnostic's code, message, and hints still surface — only the Ariadne visual snippet is dropped. Verification: cargo nextest -p quarto-error-reporting (70/70). User confirmed Q-3-43 surfaces in the browser without panic after the WASM rebuild.
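The degrade-instead-of-panic change can be sketched in a few lines. This is not the `render_ariadne_source_context` code: `read_source_for_snippet` and `render_diagnostic` are hypothetical helpers, and the plain-text output format is invented for the example. Only `std::fs::read_to_string` and `Result::ok` are real APIs here.

```rust
use std::fs;

/// Returns the file's contents, or `None` when the read fails for any
/// reason (missing file, or a platform without a real filesystem).
fn read_source_for_snippet(path: &str) -> Option<String> {
    fs::read_to_string(path).ok()
}

/// Hypothetical renderer: the code and message always appear; the source
/// snippet is appended only when the file could actually be read.
fn render_diagnostic(code: &str, message: &str, path: &str) -> String {
    let mut out = format!("[{code}] {message}\n");
    if let Some(source) = read_source_for_snippet(path) {
        if let Some(first_line) = source.lines().next() {
            out.push_str(&format!("  | {first_line}\n"));
        }
    }
    out
}
```

The design choice mirrors the commit: losing the visual snippet is acceptable, while losing the whole diagnostic (or crashing the host) is not.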

gordonwoodhull closed this May 25, 2026

gordonwoodhull reopened this May 25, 2026

gordonwoodhull added 26 commits May 24, 2026 23:05

gordonwoodhull added 16 commits May 24, 2026 23:05

docs(hub-client/changelog): plan-7 phases 4-6 entry

fceb862

docs(changelog): update plan-7 phases 4-6 commit hash after rebase

4ee51e4

gordonwoodhull force-pushed the feature/provenance branch from 318ab48 to 4ee51e4 Compare May 25, 2026 03:10

gordonwoodhull added 13 commits May 24, 2026 23:43

docs(changelog): note q2-preview sectionize-wrapper edit fix (bdcfdc5)

47c4c57

docs(changelog): note soft-drop diagnostic surfacing fix (5f2bbab)

3f96b39


feature: improve provenance and make q2-preview editable #231

gordonwoodhull wants to merge 67 commits into main from feature/provenance

gordonwoodhull commented May 22, 2026 (edited)
