Populi remote execution rollout checklist
Use this checklist before widening Populi remote execution beyond local-first defaults, whether you use today's experimental relay or a future lease-authoritative path (ADR 017).
Default-off validation
- Documented scope: confirm the deployment matches a column in the work-type placement matrix (local / LAN / overlay).
- No accidental public bind: Populi listeners and MCP HTTP gateways use loopback or controlled ingress unless TLS and auth are in place (deployment compose SSOT, MCP HTTP gateway contract).
- Secrets: mesh tokens and JWT secrets live in Secrets / secret stores; `vox secrets doctor` passes for required workflows (Secrets SSOT).
Kill switches (validate in staging)
Prove you can disable remote paths without redeploying code:
| Switch | Effect (current docs) |
|---|---|
| `VOX_ORCHESTRATOR_MESH_REMOTE_EXECUTE_EXPERIMENTAL=0` (unset/false) | Disables experimental `RemoteTaskEnvelope` relay; local execution unchanged (orchestration unified). |
| `VOX_ORCHESTRATOR_MESH_ROUTING_EXPERIMENTAL=0` | Disables hint-based routing score experiments (mens SSOT). |
| `VOX_ORCHESTRATOR_MESH_CONTROL_URL` unset | Stops federation node snapshot reads from Populi (orchestrator/MCP) (env vars). |
| `VOX_MESH_HTTP_JOIN=0` | MCP skips HTTP join/heartbeat while other mesh hooks may still run (mens SSOT). |
| `VOX_MESH_ENABLED=0` | Disables mens hooks in processes that respect this flag (mens SSOT). |
Staging drill: toggle each relevant switch, restart or reload the affected process per your platform, and confirm no remote fan-out and no unexpected control-plane traffic (packet capture or access logs).
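
As a pre-check for the staging drill, the switch states can be read back from a process environment. The following is a minimal sketch, not part of the Vox tooling: the function name `switches_not_off` is hypothetical, and the accepted spellings of "false" are an assumption, since the table above documents only `0`, unset, and false.

```python
import os
from typing import List, Mapping

# Switches the table documents as disabled by an explicit "=0".
EXPLICIT_ZERO = (
    "VOX_ORCHESTRATOR_MESH_ROUTING_EXPERIMENTAL",
    "VOX_MESH_HTTP_JOIN",
    "VOX_MESH_ENABLED",
)
# Documented as off when "0", unset, or false.
# Assumption: "false" is matched case-insensitively; the real parser may differ.
RELAY_FLAG = "VOX_ORCHESTRATOR_MESH_REMOTE_EXECUTE_EXPERIMENTAL"
# Documented as off only when unset, so any value (even empty) is reported.
CONTROL_URL = "VOX_ORCHESTRATOR_MESH_CONTROL_URL"


def switches_not_off(env: Mapping[str, str]) -> List[str]:
    """Return the switches that are not in their documented 'off' state."""
    not_off = [name for name in EXPLICIT_ZERO if env.get(name) != "0"]
    relay = env.get(RELAY_FLAG)
    if relay is not None and relay.strip().lower() not in ("0", "false"):
        not_off.append(RELAY_FLAG)
    if CONTROL_URL in env:
        not_off.append(CONTROL_URL)
    return not_off


if __name__ == "__main__":
    remaining = switches_not_off(os.environ)
    if remaining:
        print("still enabled: " + ", ".join(remaining))
    else:
        print("all kill switches are in their off state")
```

The check is deliberately strict: a switch documented only as `=0` is reported unless it is exactly `0`, because this checklist does not state what the default is when the variable is unset. It complements the packet capture or access-log check and does not replace it.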
Functional gates (pilot)
- Single owner: for lease-backed task classes (when implemented), reproduce lease acquisition, renewal, and expiry; confirm no concurrent execution on two nodes for the same correlation id.
- Fallback: on lease loss, verify local fallback or documented fail-closed behavior per operator policy (ADR 017).
- Cancellation: remote cancel paths propagate within agreed timeouts.
- Results: result or failure delivery is idempotent on redeliver (mesh idempotency_key where used).
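
The single-owner gate above can be checked offline against pilot records. The sketch below assumes each execution has been exported as a correlation id, a node name, and start/end timestamps; the `Run` record and `concurrent_runs` function are hypothetical and not part of Populi.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Run(NamedTuple):
    """One observed execution (hypothetical export format)."""
    correlation_id: str
    node: str
    start: float  # seconds on a clock shared by all nodes
    end: float


def concurrent_runs(runs: Iterable[Run]) -> List[Tuple[str, str, str]]:
    """Return (correlation_id, node_a, node_b) for runs that overlap in time
    on two different nodes."""
    by_id: Dict[str, List[Run]] = defaultdict(list)
    for run in runs:
        by_id[run.correlation_id].append(run)

    conflicts = []
    for correlation_id, group in by_id.items():
        # Pairwise comparison is quadratic per id, which is fine for pilot-sized logs.
        for a, b in combinations(group, 2):
            overlaps = a.start < b.end and b.start < a.end
            if overlaps and a.node != b.node:
                conflicts.append((correlation_id, a.node, b.node))
    return conflicts
```

A run that starts exactly when another ends is treated as a clean handover, not a conflict. Timestamps from different nodes are only comparable if their clocks are synchronized, so treat near-boundary results with caution.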
Observability gates
- Logs or traces include `task_id` (or equivalent) for routed work; when lease placement ships, include `lease_id` and placement reason per placement observability.
- Optional: `VOX_MESH_CODEX_TELEMETRY` emits `populi_control_event` rows without storing bearer material (mens SSOT).
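
To confirm the observability fields in pilot logs, a small scan can report records that lack them. This sketch assumes the logs for routed work are available as JSON Lines, which this checklist does not specify; `lines_missing_fields` is a hypothetical helper, and any field name beyond `task_id` and `lease_id` passed to it is the caller's assumption.

```python
import json
from typing import Iterable, List, Sequence, Tuple


def lines_missing_fields(
    lines: Iterable[str],
    required: Sequence[str] = ("task_id",),
) -> List[Tuple[int, List[str]]]:
    """Return (line_number, missing_fields) for each log line that lacks a
    required field. Unparseable or non-object lines count as missing all fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            record = None
        if not isinstance(record, dict):
            problems.append((number, list(required)))
            continue
        absent = [f for f in required if record.get(f) in (None, "")]
        if absent:
            problems.append((number, absent))
    return problems
```

Once lease placement ships, the same scan can be run with `required=("task_id", "lease_id")` plus whatever field name the placement observability document assigns to the placement reason.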
Regression and rollback
- CI / smoke: `vox ci check-links` and mdBook build succeed after doc changes; workspace tests for Populi/orchestrator crates pass for the PR that enables new behavior.
- Rollback plan: document which env toggles return the fleet to local-only execution and who is allowed to flip them.
Go / no-go
| Outcome | Condition |
|---|---|
| Go | Kill-switch drill passed; matrix row matches workload; observability fields confirmed in pilot logs. |
| No-go | Any unexplained duplicate execution, missing fallback on forced partition, or inability to disable relay via env within minutes. |
Related documentation
- Overlay personal cluster runbook
- Populi GPU mesh implementation plan 2026 — roadmap sequencing