
Vox Speech Improvement Backlog 2026

Backlog rows are ranked by expected user impact and by how much each is expected to improve ASR-primary audit confidence. Rankings now incorporate the measured results of the full runtime suite at .vox/audit/2026-05-11-oratio-full-runtime/.

| ID | Gap | Owner | Category | Expected lift | Effort | Dependencies |
|---|---|---|---|---|---|---|
| STB-001 | Add dashboard microphone capture that reaches vox_speech_to_code, or change Loquela copy to text-chat until implemented. | vox-dashboard, vox-orchestrator-mcp | orchestration | High UX lift, compile pass measurable from dashboard | M | MCP tool availability in dashboard transport |
| STB-002 | Restore or retire vox-audio-ingress references and reassign orphaned env vars. | contracts/config, vox-oratio | orchestration | Removes false transport surface | S | Contract owner decision |
| STB-003 | Add committed speech canary cell to CI instead of optional VOX_SPEECH_CANARY_KPI only. | vox-cli, vox-integration-tests | orchestration | High regression-detection lift | M | Corpus v1 and canary KPI |
| STB-004 | Add runtime CUDA decode parity test against CPU for one short fixture. | vox-oratio, vox-cli | acoustic | Medium WER confidence on GPU systems | M | CUDA runner availability |
| STB-005 | Add editor webview synthetic-audio harness for 48 kHz WAV path. | apps/editor/vox-vscode | acoustic | High capture-format confidence | M | Extension test harness |
| STB-006 | Wire mobile Android/iOS STT into shared corpus parity reporting. | apps/vox-mental-tracker | lexical | High cross-system accuracy visibility | L | Device/emulator CI lane |
| STB-007 | Replace prompt fallback classification in Vox app tests with explicit ASR=N/A. | apps/vox-mental-tracker | orchestration | Reduces false-positive ASR scoring | S | E2E update |
| STB-008 | Promote symbol_error_rate into CLI scorecards for identifier-heavy speech. | vox-oratio, vox-populi | lexical | Medium code-domain accuracy lift | S | Scorecard schema extension |
| STB-009 | Add a real silence/no-speech model canary for Candle. | vox-oratio | acoustic | High hallucination prevention | M | Model cache / offline fixture |
| STB-010 | Add streaming WS contract test for partial/final events and backpressure. | vox-oratio | orchestration | Medium reliability lift | M | serve feature lane |
| STB-011 | Replace synthetic WAV fixtures with real spoken audio for code dictation, commands, identifiers, mixed-natural, and noisy domains. | contracts/speech-to-code, tests/speech-to-code | acoustic | Done for the current Windows SAPI 16 kHz fixture set; keep open for human/device recordings. | M | Consent-cleared corpus recording or generated speech fixtures |
| STB-012 | Add a first-class vox ci speech-runtime-suite runner that emits per-cell JSON/KPI artifacts without ad hoc shell commands. | vox-cli, vox-integration-tests | orchestration | Done for matrix classification + CPU Candle runtime eval; extend as new harnesses land. | M | Stable runtime matrix and artifact schema |
| STB-013 | Fix Windows Candle CUDA linking for vox-plugin-oratio --features cuda. | vox-plugin-oratio, patches/candle-* | orchestration | Enables GPU SHOULD cell measurement on Windows. | M | Resolve moe_gemm_* link symbols or gate incompatible kernels |
| STB-014 | Expose a real Oratio streaming route or remove the advertised WS stream URL from MCP status. | vox-oratio, vox-orchestrator-mcp | orchestration | Eliminates streaming false positive and enables partial/final tests. | M | Transport contract decision |

  1. Land STB-003 next so that the CPU Candle KPI, which is already generated and passing, is enforced in CI rather than optional.
  2. Land STB-002, STB-007, and STB-014 to remove false-positive runtime surfaces.
  3. Land STB-005 and STB-009 to close the highest-risk acoustic harness gaps.
  4. Land STB-013, STB-004, and STB-006 when hardware/device runners are available.
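
STB-008 names symbol_error_rate without defining it here. As an illustration only, the sketch below assumes a WER-style definition: the edit distance between identifier tokens in the reference and the hypothesis, divided by the number of reference identifiers. The tokenizer regex, the function names, and the handling of an empty reference are all assumptions for this example and may differ from what vox-oratio or vox-populi actually compute.

```python
import re

# Assumption: a "symbol" is any identifier-like run of characters.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def symbol_error_rate(reference, hypothesis):
    """Assumed definition: symbol edits divided by reference symbol count."""
    ref = IDENTIFIER.findall(reference)
    hyp = IDENTIFIER.findall(hypothesis)
    if not ref:
        # Assumed convention: any symbol against an empty reference is a full error.
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


# A split identifier costs one substitution plus one insertion over four symbols.
print(symbol_error_rate("let user_count = parse_input(raw)",
                        "let user count = parse_input(raw)"))  # 0.5
```

Under this assumed definition, a transcript can have a small word error rate and still score poorly on symbols, which is the kind of identifier-heavy failure the row targets.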

Each backlog row is complete only when it has:

  • A failing test or failing audit cell before implementation.
  • A passing verification command after implementation.
  • A scorecard delta or documented skip reason.
  • An entry in the speech audit findings doc if it changes audit interpretation.
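
The completion criteria above can be sketched as a small checker. This is a minimal illustration, not an existing Vox tool: the `RowEvidence` record, its field names, and the `missing_evidence` function are hypothetical, and the command string in the usage example is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowEvidence:
    """Hypothetical record of the evidence collected for one backlog row."""
    row_id: str
    failing_before: bool                  # a test or audit cell failed before the change
    passing_command: Optional[str]        # verification command that passes afterwards
    scorecard_delta: Optional[str]        # measured change, if any
    skip_reason: Optional[str]            # documented reason when no delta exists
    changes_audit_interpretation: bool
    findings_entry: Optional[str] = None  # reference to the audit findings entry


def missing_evidence(row: RowEvidence) -> List[str]:
    """Return the unmet completion criteria; an empty list means complete."""
    missing = []
    if not row.failing_before:
        missing.append("failing test or audit cell before implementation")
    if not row.passing_command:
        missing.append("passing verification command after implementation")
    if not (row.scorecard_delta or row.skip_reason):
        missing.append("scorecard delta or documented skip reason")
    # The findings entry is only required when audit interpretation changes.
    if row.changes_audit_interpretation and not row.findings_entry:
        missing.append("speech audit findings entry")
    return missing


row = RowEvidence(
    row_id="STB-003",
    failing_before=True,
    passing_command="<verification command>",  # placeholder, not a real command
    scorecard_delta=None,
    skip_reason=None,
    changes_audit_interpretation=False,
)
print(missing_evidence(row))  # ['scorecard delta or documented skip reason']
```

Returning the list of unmet criteria, instead of a single boolean, keeps the reason a row is still open visible in review.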