Semantic Gap Implementation Plan (2026-05-16)
Semantic Gap Implementation Plan (2026-05-16)
Section titled “Semantic Gap Implementation Plan (2026-05-16)”For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax.
Goal: Remediate the 7 HIGH-confidence semantic gaps found in docs/src/architecture/semantic-gap-audit-2026.md plus the 1 LOW-severity plan-doc fix. Findings F2–F6 collapse into one engineering motion (silent-DB-error policy in the orchestrator); F1 is its own surgical fix; F7 is a one-line edit; F8 is documentation.
Architecture: Three independent batches (A, B, C) that can ship as separate PRs. Order is by leverage: Batch B first (most user-visible — broken route manifests silently passing codegen), then Batch A (correctness of audit/budget/reliability trail), then Batch C (cosmetic plugin consistency + plan-doc cleanup).
Tech Stack: Rust 1.83 workspace, tracing crate for structured logging, vox_telemetry for event emit, cargo test --workspace for verification.
Batch B — Wire the dead-code validator in vox-codegen (F1)
Section titled “Batch B — Wire the dead-code validator in vox-codegen (F1)”Why first: the gap is user-visible at build time. Anyone authoring routes { } with a typo in a component name today gets clean codegen and a runtime error; after this batch, they get a fail-fast build error pointing at the bad symbol.
Files:
- Modify:
crates/vox-codegen/src/codegen_ts/route_manifest.rs:25-68 - Create:
crates/vox-codegen/tests/route_manifest_validation.rs(or extend an existing test file)
Task B1: Write the failing test
Section titled “Task B1: Write the failing test”-
Step 1: Identify or create a test file location. Search for existing tests of
try_emit_route_manifest_from_web_irincrates/vox-codegen/tests/andcrates/vox-codegen/src/codegen_ts/. If a test file exists for route_manifest, extend it. Otherwise createcrates/vox-codegen/tests/route_manifest_validation.rs. -
Step 2: Write a failing test that constructs a
WebIrModulewith aRouteContractreferencing a component name"NonExistentComponent", aHirModulethat does NOT defineNonExistentComponent, then callsvalidate_manifest_symbols(&web, &hir)and assertsErr(...)with a message containing the component name.
Reference for the WebIrModule + HirModule construction: look at existing tests in crates/vox-codegen/src/codegen_ts/ for the canonical fixtures pattern (search WebIrModule::new, HirModule::default, or wherever route fixtures are built).
The test body in skeleton form:
#[test]fn validate_manifest_symbols_flags_missing_component() { // Build a minimal WebIrModule with one route referencing "NonExistentComponent". let web = /* ... construct using existing fixture helpers ... */; let hir = /* ... HirModule that defines no components ... */;
let result = vox_codegen::codegen_ts::route_manifest::validate_manifest_symbols(&web, &hir);
let err = result.expect_err("expected validation error for missing component"); assert!( err.contains("NonExistentComponent"), "error should name the missing component; got: {err}" );}- Step 3: Run the test and verify it fails.
Run: cargo test -p vox-codegen --test route_manifest_validation -- validate_manifest_symbols_flags_missing_component
Expected: test FAILS with “expected validation error for missing component” because the current body returns Ok(()) unconditionally.
Task B2: Wire the existing dead-code validator into the public entry
Section titled “Task B2: Wire the existing dead-code validator into the public entry”-
Step 1: In
crates/vox-codegen/src/codegen_ts/route_manifest.rs, replace the body ofvalidate_manifest_symbols(lines 25–27) so it:- Extracts
component_names: BTreeSet<String>fromhir— i.e., the names of every component/JSX function defined in the HIR. (Locate the relevant accessor by looking at howemit_route_manifest_from_web_irconsumeshir— there will be a similar walk available, or one can be added in a small helper.) - Extracts
query_names: BTreeSet<String>fromhir— names of every@query-annotated function. - Calls
route_tree_top_contracts(web)to get the top-level route contracts. - Recursively calls the existing
validate_contract_branchfor each top contract, accumulating errors in aVec<String>. - If the error vector is non-empty, returns
Err(errors.join("\n")). Otherwise returnsOk(()).
- Extracts
-
Step 2: Remove the
#[allow(dead_code)]fromvalidate_contract_branch(line 29). -
Step 3: Run the failing test from B1 and verify it PASSES.
Run: cargo test -p vox-codegen --test route_manifest_validation -- validate_manifest_symbols_flags_missing_component
Expected: PASS.
- Step 4: Run the rest of the vox-codegen test suite to ensure no existing route manifest fixtures break.
Run: cargo test -p vox-codegen
Expected: no new failures. If existing snapshot tests fail because they used route fixtures with unresolved components (i.e., they were silently passing because of the broken validator), update those fixtures to be valid OR explicitly assert the new error and adjust them with intent.
Task B3: Add two more validation tests for coverage
Section titled “Task B3: Add two more validation tests for coverage”- Step 1: Add a test for the loader case — a route with a
loadermeta entry pointing to a non-@querysymbol should fail. - Step 2: Add a test for the pending case — a route with a
pendingmeta entry pointing to a non-existent component should fail. - Step 3: Add a test for the recursive case — a child route with a bad component should fail (verifies the recursion at line 65–67 fires).
- Step 4: Add a positive test — a valid route tree returns
Ok(()).
Run: cargo test -p vox-codegen --test route_manifest_validation
Expected: 4 tests pass.
Task B4: Commit
Section titled “Task B4: Commit”-
Step 1:
git addonly the modifiedroute_manifest.rsand the new/modified test file. -
Step 2:
git commit -m "fix(codegen): wire route-manifest validation that was sitting dead-code
validate_manifest_symbols was a public no-op (Ok(()) only); the realvalidation logic existed adjacent as #[allow(dead_code)] validate_contract_branchbut was never called. Broken route manifests with typoed component namesor unresolved loaders were passing codegen and surfacing as runtime errors.
This connects the existing impl to the public entry, removes the dead_codeattribute, and adds four targeted tests (missing component, bad loader,bad pending, recursion into children).
Refs: docs/src/architecture/semantic-gap-audit-2026.md F1"Batch A — Stop silently dropping Results on critical DB writes (F2–F6)
Section titled “Batch A — Stop silently dropping Results on critical DB writes (F2–F6)”Why second: the gaps are not user-visible at build time but corrupt the audit trail, reliability scoring, and budget enforcement under DB stress. The fix is mechanical: replace let _ = …await; with structured tracing::error! + event-bus emission so failures are observable, and for budget enforcement add a circuit-breaker hook.
The right kind of fix differs per call site:
- Reliability observations (F2, F3): These are fire-and-forget event-handler writes; the handler must not block on DB. Right fix: log + emit a
ReliabilityWriteFailedevent so observability captures the loss. - Lineage event (F4): Audit trail. Right fix: log + emit + consider buffering for retry, but at minimum surface to operators.
- Campaign init (F5): Setup work that downstream tasks depend on. Right fix: log + return an error to the caller OR emit a
CampaignInitFailedevent that triggers a hold on the task. - Budget exec-time (F6): Cost accounting that the budget gate trusts. Right fix: log + emit + (optionally) trigger circuit-breaker if multiple writes fail in a row.
The common piece is a small helper that wraps the discarded-result pattern. Build it once, apply it everywhere.
Task A1: Build the log_and_emit_persistence_failure helper
Section titled “Task A1: Build the log_and_emit_persistence_failure helper”Files:
-
Create:
crates/vox-orchestrator/src/services/persistence_obs.rs(new module) -
Modify:
crates/vox-orchestrator/src/services/mod.rs(export new module) -
Test:
crates/vox-orchestrator/src/services/persistence_obs.rs(inline#[cfg(test)] mod tests) -
Step 1: Write the failing test:
#[cfg(test)]mod tests { use super::*; use tracing_test::traced_test;
#[traced_test] #[test] fn log_and_emit_persistence_failure_logs_error_with_context() { let err: Box<dyn std::error::Error + Send + Sync> = "simulated db failure".into(); log_and_emit_persistence_failure( "reliability.endpoint_observation", &*err, None, ); assert!(logs_contain("reliability.endpoint_observation")); assert!(logs_contain("simulated db failure")); }}-
Step 2: Run:
cargo test -p vox-orchestrator persistence_obs::tests→ expect FAIL (module not yet defined). -
Step 3: Add
tracing-test = "0.2"as a[dev-dependencies]entry incrates/vox-orchestrator/Cargo.tomlif not already present. -
Step 4: Implement the helper:
//! Shared helper for observability around persistence failures in the orchestrator.//!//! Use this in place of `let _ = db.something().await;` whenever the discarded//! Result represents a meaningful write (audit trail, reliability observation,//! budget accounting, lineage). The helper logs with structured fields and//! optionally emits a `PersistenceFailed` event on the orchestrator event bus.//!//! Refs: docs/src/architecture/semantic-gap-audit-2026.md F2–F6.
use std::error::Error;use crate::event_bus::AgentEventKind;
/// Log a persistence-layer failure with structured context. Caller passes a/// short stable id for the operation site (e.g. "reliability.endpoint_observation",/// "lineage.task_submitted", "budget.exec_time"). If `event_bus` is Some, emits a/// PersistenceFailed event.pub fn log_and_emit_persistence_failure( op_id: &'static str, err: &(dyn Error + 'static), event_bus: Option<&crate::event_bus::EventBus>,) { tracing::error!( target: "vox.orchestrator.persistence", op = op_id, error = %err, "persistence write failed; data not recorded" ); if let Some(bus) = event_bus { bus.emit(AgentEventKind::PersistenceFailed { op_id: op_id.to_string(), error_message: err.to_string(), }); }}-
Step 5: Add the
PersistenceFailedvariant toAgentEventKindincrates/vox-orchestrator/src/event_bus.rs(locate the enum and append the variant; if there’s a snapshot-based test of the enum’s serialization, run it and update the snapshot intentionally). -
Step 6: Run test → expect PASS.
-
Step 7: Commit:
git commit -m "feat(orchestrator): add log_and_emit_persistence_failure helper
Centralizes the discarded-Result pattern around critical DB writes(reliability observations, audit lineage, budget accounting). Subsequentcommits replace let-underscore patterns with this helper.
Refs: docs/src/architecture/semantic-gap-audit-2026.md F2-F6"Task A2: Replace silent drops in services/reliability.rs (F2, F3)
Section titled “Task A2: Replace silent drops in services/reliability.rs (F2, F3)”Files:
-
Modify:
crates/vox-orchestrator/src/services/reliability.rs:33-41, 43-58, 75-83 -
Step 1: Write a failing test in
reliability.rs(or its existing test file) that:- Constructs a
ReliabilityServicewith a mockstorewhoserecord_endpoint_observationalways returnsErr. - Emits a
EndpointReliabilityObservationevent. - Asserts that
tracing::error!was emitted with the expectedop_idfield (usetracing_test::traced_test).
- Constructs a
Run: cargo test -p vox-orchestrator services::reliability → FAIL.
- Step 2: In
reliability.rs, replace eachlet _ = self.store.<op>(...).await;site (lines 33, 45, 49, 53, 57, 75) with:
if let Err(e) = self.store.<op>(...).await { crate::services::persistence_obs::log_and_emit_persistence_failure( "reliability.<op_name>", &e, self.event_bus.as_ref(), );}…where <op_name> is endpoint_observation for line 33/75, task_completed_obs for line 45, task_failed_obs for line 49, handoff_accepted_obs for line 53, handoff_rejected_obs for line 57.
If self.event_bus isn’t currently a field of ReliabilityService, either add it (preferred — matches the helper’s signature) or pass None as a transitional step (deferred event-bus wiring is acceptable here; document with a TODO referencing this plan).
-
Step 3: Run the test → PASS.
-
Step 4: Commit:
git commit -m "fix(orchestrator/reliability): stop silently dropping DB write failures
Six sites in reliability.rs (record_endpoint_observation x2 +record_task_reliability_observation x4 across match arms) were using\`let _ = ...await;\` to discard DB write Results. Failures now logstructured errors via log_and_emit_persistence_failure.
Refs: docs/src/architecture/semantic-gap-audit-2026.md F2, F3"Task A3: Replace silent drops in task_dispatch/submit/task_submit.rs (F4, F5)
Section titled “Task A3: Replace silent drops in task_dispatch/submit/task_submit.rs (F4, F5)”Files:
-
Modify:
crates/vox-orchestrator/src/orchestrator/task_dispatch/submit/task_submit.rs:1002-1014and:97-105 -
Step 1: For F4 (lineage event at line 1002), write a failing test (in the same crate’s tests/ or as an inline
#[cfg(test)]) that mocksappend_orchestration_lineage_eventto return Err, runs a task submission, and asserts thetracing::error!emit. -
Step 2: Replace the F4 site (line 1002–1014) with the
log_and_emit_persistence_failurepattern. -
Step 3: For F5 (line 97 —
begin_reconstruction_campaign), the policy decision is harder: should a failed campaign init block the task submission, or proceed without a campaign?- Recommendation: Block. A task submitted with a campaign_id that has no backing campaign row is a downstream lookup bug waiting to happen.
- Implementation: change the
let _ = self.begin_reconstruction_campaign(...).await;to:self.begin_reconstruction_campaign(...).await.map_err(|e| { ...log + return an explicit submission error... })?; - But this changes the function’s public error type, so audit the callers. If callers can’t tolerate a new error variant, fall back to the log-and-emit pattern PLUS unset
task.campaign_id = Noneso downstream code doesn’t look up a non-existent campaign.
-
Step 4: Add a test for the new F5 behavior (whichever path was chosen).
-
Step 5: Run all tests in
vox-orchestrator: Run:cargo test -p vox-orchestratorExpected: pass. -
Step 6: Commit:
git commit -m "fix(orchestrator/submit): stop dropping lineage + campaign-init DB errors
task_submit.rs:1002 (lineage append) and :97 (campaign init) were using\`let _ = ...await;\` to discard Results from writes that feed the audittrail and downstream campaign lookups. Lineage failures now log + emitPersistenceFailed; campaign-init failures now block submission (or cleartask.campaign_id, see code comment) to prevent dangling references.
Refs: docs/src/architecture/semantic-gap-audit-2026.md F4, F5"Task A4: Replace silent drop in budget/persistence.rs (F6)
Section titled “Task A4: Replace silent drop in budget/persistence.rs (F6)”Files:
-
Modify:
crates/vox-orchestrator/src/budget/persistence.rs:103 -
Step 1: Write a failing test that mocks
db.record_exec_timeto return Err, runs the budget-persistence path, asserts the log emit AND asserts a circuit-breaker counter (if applicable) incremented. -
Step 2: Replace the F6 site with the
log_and_emit_persistence_failurepattern. Because exec_time feeds budget enforcement, also add a metric / counter increment so operators can detect ongoing accounting loss. Look for an existing budget-metrics surface incrates/vox-orchestrator/src/budget/and emit through that; if none exists, defer the metric to a follow-up and just log + emit for now. -
Step 3: Run tests: Run:
cargo test -p vox-orchestrator budget::persistenceExpected: pass. -
Step 4: Commit:
git commit -m "fix(orchestrator/budget): stop dropping exec-time persistence errors
budget/persistence.rs:103 was using \`let _ = db.record_exec_time(...).await;\`to discard the Result. Silent loss under-counts agent usage against budgets.Failures now log + emit PersistenceFailed via log_and_emit_persistence_failure.
Refs: docs/src/architecture/semantic-gap-audit-2026.md F6"Task A5: Add a clippy lint that bans let _ = .*await on calls to known DB write methods (optional follow-up)
Section titled “Task A5: Add a clippy lint that bans let _ = .*await on calls to known DB write methods (optional follow-up)”Why: prevent this regression. Several patterns can be banned at lint level (e.g., clippy::let_underscore_must_use already exists for #[must_use] types — annotating the DB store methods with #[must_use] would surface this automatically).
-
Step 1: Audit the public methods on the orchestrator’s persistence store (e.g.,
crates/vox-orchestrator/src/store/*.rsor whereverrecord_endpoint_observation,record_task_reliability_observation,append_orchestration_lineage_event,record_exec_timeare defined). Add#[must_use = "persistence write Result must be handled — see semantic-gap-audit-2026.md"]to each. -
Step 2: Run
cargo clippy --workspace --all-targets -- -D warnings -A clippy::all -W clippy::let_underscore_must_use(or whatever the project’s clippy invocation pattern is). -
Step 3: Audit any new warnings that surface — they’re additional silent-drop sites this audit didn’t reach.
-
Step 4: Commit:
git commit -m "chore(orchestrator): annotate persistence writes with #[must_use]
Adds compile-time enforcement that the Result from DB write methods onthe persistence store cannot be silently discarded. Pairs with thelog_and_emit_persistence_failure helper as the runtime side of the policy.
Refs: docs/src/architecture/semantic-gap-audit-2026.md"Batch C — Plugin trait consistency + plan-doc cleanup (F7, F8)
Section titled “Batch C — Plugin trait consistency + plan-doc cleanup (F7, F8)”Why last: smallest and most cosmetic. Two unrelated edits packaged together.
Task C1: Make CloudSync::list_remote_json explicit about its scaffold state (F7)
Section titled “Task C1: Make CloudSync::list_remote_json explicit about its scaffold state (F7)”File:
-
Modify:
crates/vox-plugin-cloud/src/sync.rs:46-48 -
Step 1: Edit the function body from:
fn list_remote_json(&self, _remote_prefix: RStr<'_>) -> RResult<RString, RBoxError> { RResult::ROk(RString::from("[]"))}…to match the sibling pattern:
fn list_remote_json(&self, _remote_prefix: RStr<'_>) -> RResult<RString, RBoxError> { RResult::RErr(RBoxError::new(std::io::Error::other( "not yet implemented; SP7 scaffold", )))}-
Step 2: Run
cargo test -p vox-plugin-cloudandcargo test -p vox-plugin-catalog. If any test relied onlist_remote_jsonreturningOk("[]"), update that test to expect the explicit error (with a comment referencing this fix). -
Step 3: Commit:
git commit -m "fix(plugin-cloud): list_remote_json returns explicit not-implemented error
Was silently returning Ok(\"[]\") instead of matching its sibling methods(upload, download) which return RErr with \"not yet implemented; SP7 scaffold\".Callers can now distinguish \"no remote artifacts\" from \"feature not built.\"
Refs: docs/src/architecture/semantic-gap-audit-2026.md F7"Task C2: Fix the telemetry Phase B plan internal contradiction (F8)
Section titled “Task C2: Fix the telemetry Phase B plan internal contradiction (F8)”File:
-
Modify:
docs/superpowers/plans/telemetry/2026-05-09-telemetry-phase-b.md -
Step 1: Locate Task 3 in the Phase B plan. Add a note (one or two lines) that the trace-field population described in this task is forward-looking and that the actual wiring lands in Phase C. Reference the Phase C plan’s prerequisite line: “Phase B is merged (
ModelCallEventexists;current_trace_ctx()is being read byinfer.rsbut currently returns a default empty context)”. -
Step 2: Commit:
git commit -m "docs(telemetry): note that Phase B Task 3 trace fields are wired by Phase C
Phase B Task 3 reads as if the trace_id/task_id/parent_task_id/caller_agent_idfields are populated during Phase B; the same document's Phase C previewcontradicts that. Code follows the preview. This adds an inline notepointing at Phase C as the actual wiring task to resolve the contradiction.
Refs: docs/src/architecture/semantic-gap-audit-2026.md F8"Verification gate (run before opening the PRs)
Section titled “Verification gate (run before opening the PRs)”-
cargo test --workspacepasses (or only pre-existing failures unrelated to these changes). -
cargo clippy --workspace --all-targets -- -D warnings— passes; in particular, no newlet_underscore_must_usewarnings. -
cargo run -p vox-arch-checkpasses. - No hand-edited auto-generated docs (per project memory: SUMMARY.md / *.generated.md / feed.xml / .cursorignore are tool-regenerated only).
Out of scope for this plan
Section titled “Out of scope for this plan”- vox-code-audit’s 27
todo!()/unimplemented!()panics (claim from the prior audit, not personally verified in this round). Worth a separate focused plan after a per-rule inventory. - arXiv submission automation (
vox-publisher/src/submission/arxiv.rs:26). The “operator-assist” mode is documented; whether to automate it is a v0.6 scope decision, not a defect fix. - MENS Batch 3 stubs in
vox-plugin-mens-candle-cuda/. These are explicitly tracked as SP3-deferred and belong to the existing MENS distributed plan (docs/src/architecture/mesh-mens-distributed-training-and-execution-plan-2026.md). - Telemetry Phase C wiring in
crates/vox-orchestrator-mcp/src/llm_bridge/infer.rs:504-507. Already covered by the existing 13-task Phase C plan.
Estimated effort
Section titled “Estimated effort”| Batch | Effort | Risk | Reviewer time |
|---|---|---|---|
| B (codegen validator wiring) | 2–4 hours | low (additive validation; only breaks malformed inputs) | 30 min |
| A (orchestrator silent-drop fixes) | 6–10 hours | medium (touches hot paths; need to verify event_bus is correctly threaded; budget circuit-breaker is the trickiest) | 60–90 min |
| C (cloud plugin + plan doc) | 30 min | trivial | 15 min |
Total: 1–2 engineering days for one familiar engineer, plus ~2 hours of review.
Companion artifacts
Section titled “Companion artifacts”- Findings:
docs/src/architecture/semantic-gap-audit-2026.md - Audit-of-audit (prior round’s negative-result doc):
docs/src/architecture/ai-laziness-remediation-plan-2026.md - Existing plans this work coordinates with:
docs/superpowers/plans/telemetry/2026-05-09-telemetry-phase-{b,c}.md - Auto-memory:
~/.claude/projects/C--Users-Owner-vox/memory/feedback_verify_audit_retirement_claims.md(verification rules used during this round).