Vox Speech Improvement Backlog 2026
Vox Speech Improvement Backlog 2026
Section titled “Vox Speech Improvement Backlog 2026”Backlog rows are ranked by expected user impact and ability to improve ASR-primary audit confidence. Rankings now incorporate the measured full runtime suite at .vox/audit/2026-05-11-oratio-full-runtime/.
| ID | Gap | Owner | Category | Expected lift | Effort | Dependencies |
|---|---|---|---|---|---|---|
| STB-001 | Add dashboard microphone capture that reaches vox_speech_to_code, or change Loquela copy to text-chat until implemented. | vox-dashboard, vox-orchestrator-mcp | orchestration | High UX lift, compile pass measurable from dashboard | M | MCP tool availability in dashboard transport |
| STB-002 | Restore or retire vox-audio-ingress references and reassign orphaned env vars. | contracts/config, vox-oratio | orchestration | Removes false transport surface | S | Contract owner decision |
| STB-003 | Add committed speech canary cell to CI instead of optional VOX_SPEECH_CANARY_KPI only. | vox-cli, vox-integration-tests | orchestration | High regression-detection lift | M | Corpus v1 and canary KPI |
| STB-004 | Add runtime CUDA decode parity test against CPU for one short fixture. | vox-oratio, vox-cli | acoustic | Medium WER confidence on GPU systems | M | CUDA runner availability |
| STB-005 | Add editor webview synthetic-audio harness for 48 kHz WAV path. | apps/editor/vox-vscode | acoustic | High capture-format confidence | M | Extension test harness |
| STB-006 | Wire mobile Android/iOS STT into shared corpus parity reporting. | apps/vox-mental-tracker | lexical | High cross-system accuracy visibility | L | Device/emulator CI lane |
| STB-007 | Replace prompt fallback classification in Vox app tests with explicit ASR=N/A. | apps/vox-mental-tracker | orchestration | Reduces false-positive ASR scoring | S | E2E update |
| STB-008 | Promote symbol_error_rate into CLI scorecards for identifier-heavy speech. | vox-oratio, vox-populi | lexical | Medium code-domain accuracy lift | S | Scorecard schema extension |
| STB-009 | Add a real silence/no-speech model canary for Candle. | vox-oratio | acoustic | High hallucination prevention | M | Model cache / offline fixture |
| STB-010 | Add streaming WS contract test for partial/final events and backpressure. | vox-oratio | orchestration | Medium reliability lift | M | serve feature lane |
| STB-011 | Replace synthetic WAV fixtures with real spoken audio for code dictation, commands, identifiers, mixed-natural, and noisy domains. | contracts/speech-to-code, tests/speech-to-code | acoustic | Done for the current Windows SAPI 16 kHz fixture set; keep open for human/device recordings. | M | Consent-cleared corpus recording or generated speech fixtures |
| STB-012 | Add a first-class vox ci speech-runtime-suite runner that emits per-cell JSON/KPI artifacts without ad hoc shell commands. | vox-cli, vox-integration-tests | orchestration | Done for matrix classification + CPU Candle runtime eval; extend as new harnesses land. | M | Stable runtime matrix and artifact schema |
| STB-013 | Fix Windows Candle CUDA linking for vox-plugin-oratio --features cuda. | vox-plugin-oratio, patches/candle-* | orchestration | Enables GPU SHOULD cell measurement on Windows. | M | Resolve moe_gemm_* link symbols or gate incompatible kernels |
| STB-014 | Expose a real Oratio streaming route or remove the advertised WS stream URL from MCP status. | vox-oratio, vox-orchestrator-mcp | orchestration | Eliminates streaming false positive and enables partial/final tests. | M | Transport contract decision |
Sequencing
Section titled “Sequencing”- Land STB-003 next so the generated passing CPU Candle KPI is enforced rather than optional.
- Land STB-002, STB-007, and STB-014 to remove false-positive runtime surfaces.
- Land STB-005 and STB-009 to close the highest-risk acoustic harness gaps.
- Land STB-013, STB-004, and STB-006 when hardware/device runners are available.
Definition Of Done
Section titled “Definition Of Done”Each backlog row is complete only when it has:
- A failing test or failing audit cell before implementation.
- A passing verification command after implementation.
- A scorecard delta or documented skip reason.
- An entry in the speech audit findings doc if it changes audit interpretation.