Repository layout sprawl audit (2026)

This document complements repo-cleanup-ledger-2026.md (which targeted tracked orphan artifacts and surface moves). Here the focus is taxonomy and navigation: too many folders that read like separate products but hold almost nothing, overlapping meanings (infra/ vs docker/ vs root compose files), and who produces vs ingests files that are not Rust sources.

Metrics (tracked files only, snapshot 2026-05-11)

| Metric | Approximate value | Interpretation |
| --- | --- | --- |
| Immediate parent dirs with exactly one tracked file | ~203 | Many are intentional (single crate roots like crates/vox-build-meta/, VS Code feature folders, leaf contract domains). Treat as a triage list, not automatic deletes. |
| contracts/ breadth | 342 tracked files across ~37 second-level domains | SSOT is deliberate; shrinking folder count requires contracts/index.yaml + consumer path updates, not ad hoc moves. |
| Root-level tracked files | ~40 config / policy stubs | High cognitive load for newcomers; most must stay at repo root for Cargo / CI / IDE tooling. |

Re-run a sparse-dir report anytime:

```powershell
git ls-files | ForEach-Object { Split-Path $_ -Parent } |
  Group-Object | Where-Object Count -eq 1 | Sort-Object Name
```
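For contributors without PowerShell, the same report can be approximated in Python. This is a minimal sketch rather than repo tooling: the function names `single_file_parents` and `tracked_files` are hypothetical, and the sketch assumes it runs inside the repository so that `git ls-files` is available. Root-level files are reported under `.` instead of an empty parent.

```python
import subprocess
from collections import Counter
from pathlib import PurePosixPath


def single_file_parents(tracked_paths):
    """Return sorted parent dirs that directly contain exactly one tracked file."""
    # git prints forward-slash paths on every platform, so PurePosixPath is safe.
    counts = Counter(str(PurePosixPath(p).parent) for p in tracked_paths if p)
    return sorted(d for d, n in counts.items() if n == 1)


def tracked_files():
    """List tracked files via `git ls-files -z` (NUL-separated, no path quoting)."""
    out = subprocess.run(
        ["git", "ls-files", "-z"], capture_output=True, text=True, check=True
    ).stdout
    return [p for p in out.split("\0") if p]
```

Calling `single_file_parents(tracked_files())` from the repository root should yield the same triage list as the pipeline above, sorted by directory name.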
  1. Cargo / Rust module granularity: one file per directory under crates/*/src/** is normal; the VS Code extension mirrors domain folders (apps/editor/vox-vscode/src/chat/ …).
  2. Contract federation: each contracts/<domain>/ is often a bounded SSOT (contracts/naming, contracts/workflow, …) referenced by hard-coded paths in crates and indexed in contracts/index.yaml.
  3. Documentation explosion: docs/ is intentionally deep; the Astro sidebar is driven by frontmatter, not folder count.
  4. Operational duplication: Compose and Docker material intentionally appears in infra/ (Coolify / Populi), docker/ (compose-relative paths for eval/SearXNG), and the repo root (docker-compose.yml, vox-eval.compose.yml) so operators can run docker compose -f … from different working directories.
| Group | Paths | Purpose |
| --- | --- | --- |
| Rust workspace | Cargo.toml, crates/, rust-toolchain.toml | Primary implementation; arch enforcement via layers.toml + vox-arch-check. |
| Contracts SSOT | contracts/, especially contracts/index.yaml | Machine-readable policies and schemas; vox ci contracts-index and domain-specific guards. |
| Human docs | docs/src/ | Authoritative prose + architecture; doctests via doc pipeline. |
| Docs site build | docs-astro/ | Astro app; generated sidebar; do not hand-edit SUMMARY.md. |
| GUI / apps | apps/ | Mental tracker, editor extension, interop marquee, experimental visualizer; ownership in contracts/frontend/surface-ownership.v1.yaml. |
| Examples & fixtures | examples/, tests/fixtures/ | Sandboxes and test-only bundles; not shipped product trees. |
| Automation | `scripts/*.vox` | VoxScript-first glue (vox run …); thin bootstrap PS/sh only. |
| CI / policy entrypoints | .github/workflows/, lefthook.yml, deny.toml, biome.json | External runners and repo-wide lint gates. |
| Deploy / ops | infra/, docker/, root `Dockerfile*`, docker-compose.yml | Overlap is documented (eval sandbox: root compose + mirror under docker/). Prefer documenting canonical path over silent merges. |
| Build-time helpers | apps/build-tools/render-durable-animation/ | Small Node helper for doc assets; driven by scripts/render-durable-animation.vox. |
| Training / Mens | mens/ | Corpus config + local training runs (mens/runs/ gitignored). |

Non-Rust artifacts — producer / consumer matrix


Only high-traffic families are listed; extend this table when consolidating a directory.

| Artifact(s) | Produced by | Ingested by |
| --- | --- | --- |
| docs/agents/doc-inventory.json | cargo run -p vox-cli -- ci doc-inventory generate (vox-doc-inventory walks crates/, docs/, apps/editor/vox-vscode, scripts/, .github/workflows/; see walk.rs) | vox ci doc-inventory verify, agents / IDE context policies |
| contracts/index.yaml + contracts/index.schema.json | Human editors + CI generators (vox ci operations-sync, capability sync, …) | vox ci contracts-index, multiple crates via stable path literals |
| Root vox.tokens.json | Human / design | contracts/tokens/tokens.v1.json schema; TS/CSS/codegen in vox-codegen |
| docs/src/SUMMARY.md, docs/src/feed.xml | Astro build (gitignored committed stubs per AGENTS.md) | Docs site |
| .cursorignore, .aiignore, … | vox ci sync-ignore-files from .voxignore | IDE exclusion surfaces |
| contracts/toestub/suppressions.v1.json | Humans + audit tooling | vox-code-audit, CI |
| `docker/**`, `infra/**` compose files | Humans / ops | vox research infra helper (infra.rs), Coolify, deployment docs |

Tier S — safe wins (documentation + pointers)

  • Keep overlapping compose paths but link one canonical row in where-things-live.md (done in same PR as this audit).
  • When adding new ops material, prefer infra/ for long-form deployment docs + compose unless you need docker compose -f docker/... path stability (eval/SearXNG).

Tier M — structural (requires reference sweep)

  • Merge thin contracts/<x>/ domains only when a domain has ≤2 files and shares an owner with an adjacent domain. Each merge must update contracts/index.yaml, sweep for references to the old path (rg contracts/old-path), and fix any crate literals.
  • Fold tools/ into apps/build-tools/ — done for render-durable-animation; keep new adjunct CLIs under apps/build-tools/.
  • Not recommended: flattening crates/*/src module folders, which fights Rust idioms and review ergonomics.
  • Not recommended: moving root Cargo.toml, rustfmt.toml, deny.toml, etc., which breaks ecosystem defaults.
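The ≤2 files criterion for thin contracts/<x>/ domains can be checked mechanically before a merge is proposed. The sketch below is illustrative only: `thin_contract_domains` is a hypothetical helper, it takes a list of tracked paths (for example the output of `git ls-files`), and it does not check the shared-owner condition, which stays a human judgment.

```python
from collections import Counter


def thin_contract_domains(tracked_paths, max_files=2):
    """Map each contracts/<domain> to its tracked-file count when count <= max_files."""
    counts = Counter()
    for path in tracked_paths:
        parts = path.split("/")
        # Count only files inside a domain folder (contracts/<domain>/...);
        # files directly under contracts/, such as index.yaml, are skipped.
        if len(parts) >= 3 and parts[0] == "contracts":
            counts[parts[1]] += 1
    return {domain: n for domain, n in sorted(counts.items()) if n <= max_files}
```

Files in nested subfolders count toward their second-level domain, so a domain with one top-level file and several nested ones is not reported as thin.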
  1. Triage the ~203 single-file parent dirs: tag each as idiomatic | candidate merge | generated — baseline list in repo-layout-single-file-parent-dirs-triage-2026.md.
  2. For each Tier M move: attach an rg evidence block (reference count) in the PR description.
  3. Extend this matrix when introducing new generated JSON/YAML under contracts/ or docs/agents/.
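The rg evidence block asked for in step 2 amounts to a per-file reference count for the path being moved. A minimal sketch of that summary follows, assuming file contents have already been read into a dict; `reference_counts` and `evidence_block` are hypothetical names, and in practice the counts would more likely come from ripgrep itself (for instance `rg -c -F` with the old path).

```python
def reference_counts(files, literal):
    """Count occurrences of a path literal per file; omit files with none."""
    hits = {}
    for path, text in files.items():
        n = text.count(literal)
        if n:
            hits[path] = n
    return hits


def evidence_block(files, literal):
    """Render a short plain-text summary suitable for a PR description."""
    hits = reference_counts(files, literal)
    total = sum(hits.values())
    lines = [f"references to {literal!r}: {total} in {len(hits)} file(s)"]
    lines += [f"  {path}: {n}" for path, n in sorted(hits.items())]
    return "\n".join(lines)
```

A count of zero after the move, for the old path, is the signal that the reference sweep is complete.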