Compiler Architecture

The Vox compiler is a modular pipeline of conceptual stages: lexer, parser, AST, HIR, typecheck, and emitters.

Current implementation note: all of these stages are currently consolidated in one crate under crates/vox-compiler/src/, with each stage represented by explicit modules. This document keeps the conceptual stage boundaries even though the implementation modules live in a single crate.


Source Code (.vox)
       │
       ▼
┌────────────────┐
│ Lexer          │  Tokenization (logos)
└──────┬─────────┘
       │  Vec<Token>
       ▼
┌────────────────┐
│ Parser         │  Recursive descent parser → AST Module
└──────┬─────────┘
       │  Module (AST root)
       ▼
┌────────────────┐
│ AST            │  Strongly-typed AST wrappers
└──────┬─────────┘
       │  Module (Decl, Expr, Stmt, Pattern)
       ▼
┌────────────────┐
│ HIR            │  Desugaring + name resolution + dead code detection
└──────┬─────────┘
       │  HirModule
       ▼
┌────────────────┐
│ Typeck         │  Bidirectional type checking + HM inference
└──────┬─────────┘
       │  Typed HIR + Vec<Diagnostic>
       ▼
┌────────────────┐
│ Web IR         │  HIR→WebIR lower + validate
└──────┬─────────┘
       │  WebIrModule
       ▼
┌────────────────┐
│ App Contract   │  HIR→AppContract (HTTP/RPC/server config)
└──────┬─────────┘
       │  AppContractModule
       ▼
┌────────────────┐
│ Runtime Proj   │  HIR→RuntimeProjection (DB/task capability hints)
└──────┬─────────┘
       │  RuntimeProjectionModule
       ▼
┌──────────────────┬─────────────────────┐
│ vox-codegen-rust │ vox-codegen-ts      │
│ (quote! → .rs)   │ (string → .ts/tsx)  │
└──────────────────┴─────────────────────┘

Current path note:

  • codegen_ts is still the production TS emitter path.
  • VOX_WEBIR_VALIDATE defaults on (WebIR lower/validate gate); set =0 / false / no / off to skip.
  • app_contract::project_app_contract is the SSOT for route/RPC/server-config codegen inputs (via projection_bundle in emit paths).
  • runtime_projection::project_runtime_from_hir is the SSOT for orchestration-facing DB capability projection (also bundled).
  • Reactive view: uses the Web IR TSX bridge when validation is clean; VOX_WEBIR_EMIT_REACTIVE_VIEWS was removed — there is no legacy-only emit path (see reactive.rs).
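The VOX_WEBIR_VALIDATE rule in the list above (on by default, skipped for 0 / false / no / off) can be sketched as a small predicate. This is an illustration only, not the compiler's code: the function names are hypothetical, and ignoring case and surrounding whitespace is an assumption.

```rust
/// Hypothetical helper: decides whether the WebIR lower/validate gate runs,
/// given the raw value of VOX_WEBIR_VALIDATE (None when the variable is unset).
fn webir_validate_enabled(raw: Option<&str>) -> bool {
    match raw {
        // Unset: the gate defaults to on.
        None => true,
        Some(value) => {
            // Assumption: comparison ignores case and surrounding whitespace.
            let v = value.trim().to_ascii_lowercase();
            !matches!(v.as_str(), "0" | "false" | "no" | "off")
        }
    }
}

/// Reads the process environment and applies the rule above.
fn webir_validate_enabled_from_env() -> bool {
    let raw = std::env::var("VOX_WEBIR_VALIDATE").ok();
    webir_validate_enabled(raw.as_deref())
}
```

Keeping the decision in a pure function that takes the raw value makes the rule easy to unit-test without mutating the process environment.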

Vox has a native ML training loop powered by Burn (a pure-Rust deep learning framework):

docs/src/*.md + examples/*.vox
vox mens corpus extract # produces validated.jsonl
vox mens corpus pairs # produces train.jsonl (instruction-response pairs)
vox mens train # native Burn / HF path (default CLI features)
mens/runs/v1/model_final.bin

The training loop is defined in crates/vox-cli/src/training/native.rs.


Lexer

Purpose: Converts source text into a flat stream of tokens.

Implementation: Uses the logos crate for high-performance, zero-copy tokenization.

Output: Vec<Token> — each token carries its kind and span.
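To make the "kind plus span" token shape concrete, here is a minimal hand-rolled sketch. It is a stand-in for the logos-generated lexer, not the real one: the token kinds are a made-up subset, the names `TokenKind`, `Token`, and `lex` are assumptions for illustration, and it only handles ASCII input sensibly.

```rust
/// Token kinds for a tiny illustrative subset (not the real Vox token set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Ident,
    Number,
    LParen,
    RParen,
    Comma,
    Unknown,
}

/// A token stores only its kind and byte span; the text stays in the source,
/// which is what makes the token stream zero-copy.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Token {
    kind: TokenKind,
    span: std::ops::Range<usize>,
}

/// Hand-rolled stand-in for a generated lexer (ASCII-only sketch).
fn lex(src: &str) -> Vec<Token> {
    let bytes = src.as_bytes();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let start = i;
        let b = bytes[i];
        if b.is_ascii_whitespace() {
            i += 1;
            continue;
        }
        let kind = if b.is_ascii_alphabetic() || b == b'_' {
            while i < bytes.len() && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_') {
                i += 1;
            }
            TokenKind::Ident
        } else if b.is_ascii_digit() {
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            TokenKind::Number
        } else {
            i += 1;
            match b {
                b'(' => TokenKind::LParen,
                b')' => TokenKind::RParen,
                b',' => TokenKind::Comma,
                _ => TokenKind::Unknown,
            }
        };
        tokens.push(Token { kind, span: start..i });
    }
    tokens
}
```

A consumer recovers a token's text by slicing the source with its span, for example `&src[tok.span.clone()]`.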


Parser

Purpose: Transforms a token stream into an AST module.

Implementation: A hand-written recursive descent parser producing ast::decl::Module. The parser is resilient to errors, meaning it continues parsing after encountering invalid syntax — this is critical for LSP support, where the user is actively typing.

Key features:

  • Error recovery with synchronization points
  • Trailing comma support in parameter lists
  • Duplicate parameter name detection
  • Indentation-aware formatting (indent.rs)

See crates/vox-compiler/src/parser/descent/mod.rs for the implementation entrypoint.

Output: Module (AST root) with source spans on declarations and expressions.
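Three of the features listed above (error recovery at synchronization points, trailing commas, and duplicate parameter detection) can be sketched together for a parameter list. This is a simplified illustration under stated assumptions: the `Tok` type, the function names, and the choice of `,` and `)` as synchronization points are hypothetical and do not mirror the real parser in parser/descent/.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified token type for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    Comma,
    RParen,
    Bad(char),
}

/// Parses `name, name, ... )` after the opening paren has been consumed.
/// Returns the accepted parameter names plus every diagnostic produced,
/// so one bad token does not abort the whole list.
fn parse_params(tokens: &[Tok]) -> (Vec<String>, Vec<String>) {
    let mut params = Vec::new();
    let mut errors = Vec::new();
    let mut seen = HashSet::new();
    let mut pos = 0;

    while pos < tokens.len() {
        match &tokens[pos] {
            // End of list; reached right after a comma, this is the trailing-comma case.
            Tok::RParen => return (params, errors),
            Tok::Ident(name) => {
                if seen.insert(name.clone()) {
                    params.push(name.clone());
                } else {
                    errors.push(format!("duplicate parameter `{name}`"));
                }
                pos += 1;
                // After a parameter, expect `,` or `)`.
                match tokens.get(pos) {
                    Some(Tok::Comma) => pos += 1,
                    Some(Tok::RParen) => return (params, errors),
                    _ => {
                        errors.push("expected `,` or `)`".to_string());
                        pos = synchronize(tokens, pos);
                    }
                }
            }
            other => {
                errors.push(format!("unexpected token {other:?}"));
                pos = synchronize(tokens, pos);
            }
        }
    }
    errors.push("unterminated parameter list".to_string());
    (params, errors)
}

/// Skips to the next synchronization point: just past a `,`, or at a `)`.
fn synchronize(tokens: &[Tok], mut pos: usize) -> usize {
    while pos < tokens.len() {
        match tokens[pos] {
            Tok::Comma => return pos + 1,
            Tok::RParen => return pos,
            _ => pos += 1,
        }
    }
    pos
}
```

Because every error path either returns or advances `pos`, the loop always terminates, and a caller such as an LSP server still receives the parameters that did parse.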


AST

Purpose: Strongly-typed wrappers around the untyped CST nodes.

See crates/vox-compiler/src/ast/ for the node hierarchy.
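The general idea of a typed wrapper over an untyped node can be sketched as follows. All names here (`SyntaxNode`, `SyntaxKind`, `FnDecl`, `node`) are hypothetical and chosen for illustration; the actual node hierarchy is the one in crates/vox-compiler/src/ast/.

```rust
/// Hypothetical untyped syntax node: a kind tag, its text, and children.
#[derive(Debug, Clone, PartialEq)]
struct SyntaxNode {
    kind: SyntaxKind,
    text: String,
    children: Vec<SyntaxNode>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    FnDecl,
    Name,
    Param,
}

/// Small constructor to keep examples short.
fn node(kind: SyntaxKind, text: &str, children: Vec<SyntaxNode>) -> SyntaxNode {
    SyntaxNode { kind, text: text.to_string(), children }
}

/// Typed wrapper: it can only be obtained from a node of the matching kind,
/// so downstream code never has to re-check the kind tag.
struct FnDecl<'a>(&'a SyntaxNode);

impl<'a> FnDecl<'a> {
    /// Returns a wrapper only when the node really is a function declaration.
    fn cast(node: &'a SyntaxNode) -> Option<Self> {
        (node.kind == SyntaxKind::FnDecl).then(|| FnDecl(node))
    }

    /// Text of the first `Name` child, if present.
    fn name(&self) -> Option<&'a str> {
        self.0
            .children
            .iter()
            .find(|c| c.kind == SyntaxKind::Name)
            .map(|c| c.text.as_str())
    }

    /// Texts of all `Param` children, in source order.
    fn params(&self) -> Vec<&'a str> {
        self.0
            .children
            .iter()
            .filter(|c| c.kind == SyntaxKind::Param)
            .map(|c| c.text.as_str())
            .collect()
    }
}
```

The wrapper borrows the underlying node rather than copying it, so typed access adds no allocation beyond the collected parameter list.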


Rust Codegen (vox-compiler::codegen_rust)

Emits Rust source using the quote! macro. Each decorator maps to specific Rust constructs:

| Vox | Generated Rust |
|---|---|
| @endpoint fn | Axum handler + route registration |
| @table type | Struct + SQLite schema |
| @test fn | #[test] function |
| @deprecated | #[deprecated] attribute |
| actor | Tokio task + mpsc mailbox |
| workflow | Plain async function today; interpreted runtime provides partial durable step recording |

TypeScript Codegen (vox-compiler::codegen_ts)

Emits TypeScript/TSX in modular files:

| Module | Output |
|---|---|
| jsx.rs | React JSX components |
| component.rs | Component declarations and hooks |
| activity.rs | Activity/workflow client wrappers |
| emitter.rs | TanStack Router trees, optional server fns, islands metadata |
| adt.rs | TypeScript discriminated union types |
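As an illustration of string-based TypeScript emission for discriminated unions, here is a minimal sketch. It is not the code in adt.rs: the `Variant` struct, the `emit_union` function, and the discriminant key `tag` are all assumptions made for this example.

```rust
/// Hypothetical, simplified view of an ADT variant: a name plus typed fields.
struct Variant<'a> {
    name: &'a str,
    /// (field name, TypeScript type) pairs.
    fields: &'a [(&'a str, &'a str)],
}

/// Emits a TypeScript discriminated union as a string.
/// Assumption: the discriminant property is called `tag`.
fn emit_union(type_name: &str, variants: &[Variant<'_>]) -> String {
    // An ADT with no variants has no inhabitants; `never` is the TS equivalent.
    if variants.is_empty() {
        return format!("export type {type_name} = never;\n");
    }
    let mut out = format!("export type {type_name} =\n");
    for v in variants {
        out.push_str(&format!("  | {{ tag: \"{}\"", v.name));
        for (field, ty) in v.fields {
            out.push_str(&format!("; {field}: {ty}"));
        }
        out.push_str(" }\n");
    }
    // Replace the final newline with the statement terminator.
    out.pop();
    out.push_str(";\n");
    out
}
```

Calling `emit_union("Shape", ...)` with a `Circle` variant carrying `radius: number` and a field-less `Empty` variant yields a union whose members TypeScript can narrow on the `tag` property.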

Related Web IR documents:

  • Normative strategy for reducing frontend emitter complexity while preserving React interop: ADR 012 — Internal web IR strategy.
  • Detailed implementation sequencing and weighted task quotas: Internal Web IR implementation blueprint.
  • Ordered file-by-file execution map: WebIR operations catalog.
  • Canonical current-vs-target representation mapping: Internal Web IR side-by-side schema.
  • Quantified K-complexity delta for the canonical worked app: WebIR K-complexity quantification.
  • Reproducible per-token-class computation: WebIR K-metric appendix.


| Crate | Purpose |
|---|---|
| vox-cli | vox command-line entry point; see ref-cli.md for the implemented subcommand set |
| vox-lsp | Language Server Protocol implementation |
| vox-actor-runtime | Tokio/Axum runtime: actors, scheduler, subscriptions, storage |
| vox-package | Package manager: CAS store, dependency resolution, caching |
| vox-db | Database abstraction layer |
| vox-gamify | Gamification system |
| vox-orchestrator | Multi-agent orchestration |
| vox-code-audit | AI anti-pattern detector |
| vox-tensor | Native ML tensors via Burn 0.19 (Wgpu/NdArray backends) |
| vox-eval | Automated evaluation of training data quality |
| vox-doc-pipeline | Rust-native doc extraction + SUMMARY.md generation |
| vox-integration-tests | End-to-end pipeline tests |

The full checklist for adding a new language construct:

  1. Lexer — Add tokens to crates/vox-compiler/src/lexer/token.rs
  2. Parser — Add grammar rules in crates/vox-compiler/src/parser/descent/
  3. AST — Add node types in crates/vox-compiler/src/ast/
  4. HIR — Map AST → HIR in crates/vox-compiler/src/hir/lower/
  5. Type Check — Add inference rules in crates/vox-compiler/src/typeck/
  6. WebIR — Add/update lowering + validation semantics in crates/vox-compiler/src/web_ir/ when the feature affects web-facing behavior
  7. Codegen — Emit code in both crates/vox-compiler/src/codegen_rust/ and crates/vox-compiler/src/codegen_ts/
  8. Test — Add integration coverage in vox-integration-tests/tests/ and WebIR/parity coverage where applicable
  9. Docs — Add frontmatter + code example in docs/src/
  10. Training — Run vox mens corpus extract to include the new construct in ML data