Deep Research Prior Art and Vox Integration Roadmap (2026)

Deep research (working definition for Vox) is an orchestrated pipeline that:

  1. Plans — decomposes a user topic into sub-queries (and optionally iterative refinements).
  2. Retrieves — searches the open web and/or local corpora with policy-gated backends (SearXNG → DuckDuckGo → optional Tavily; optional HTML extraction via web-scrape).
  3. Iterates — when evidence is weak, expands queries (CRAG-style) up to a bounded hop count.
  4. Grounds — extracts claims, optionally verifies them against sources (when wired).
  5. Synthesizes — produces a cited answer and structured metadata (routing tier, diagnostics, judge score).
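
The five stages above can be sketched as a single bounded loop. This is an illustrative outline only: the trait `Stages`, the types `Hit`, `Claim`, and `Report`, and the function `run_pipeline` are hypothetical names invented for this example and are not the Vox orchestrator API.

```rust
// Sketch only: every type and method name here is hypothetical, not the Vox API.

#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub url: String,
    pub score: f32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub text: String,
    pub source_url: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Report {
    pub answer: String,
    pub citations: Vec<String>,
    pub hops_used: u32,
}

/// One method per stage of the working definition.
pub trait Stages {
    fn plan(&self, topic: &str) -> Vec<String>;
    fn retrieve(&self, query: &str) -> Vec<Hit>;
    fn evidence_is_weak(&self, hits: &[Hit]) -> bool;
    fn expand(&self, topic: &str, hits: &[Hit]) -> Vec<String>;
    fn ground(&self, hits: &[Hit]) -> Vec<Claim>;
    fn synthesize(&self, topic: &str, claims: &[Claim]) -> String;
}

pub fn run_pipeline<S: Stages>(stages: &S, topic: &str, max_hops: u32) -> Report {
    // 1. Plan: decompose the topic into sub-queries.
    let mut queries = stages.plan(topic);
    let mut hits: Vec<Hit> = Vec::new();
    let mut hops_used = 0;
    loop {
        // 2. Retrieve: run every pending query.
        for query in &queries {
            hits.extend(stages.retrieve(query));
        }
        // 3. Iterate: expand only while evidence is weak and hops remain.
        if hops_used >= max_hops || !stages.evidence_is_weak(&hits) {
            break;
        }
        queries = stages.expand(topic, &hits);
        if queries.is_empty() {
            break;
        }
        hops_used += 1;
    }
    // 4. Ground: turn hits into claims tied to their sources.
    let claims = stages.ground(&hits);
    // 5. Synthesize: produce the answer plus a deduplicated citation list.
    let answer = stages.synthesize(topic, &claims);
    let mut citations: Vec<String> = claims.iter().map(|c| c.source_url.clone()).collect();
    citations.sort();
    citations.dedup();
    Report { answer, citations, hops_used }
}
```

Keeping the hop counter outside the stage implementations means the bound holds regardless of how an individual stage behaves, which is the property the "bounded hop count" wording asks for.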

Optional dimensions aligned with commercial products: human checkpoints, async/long-running jobs, and mesh-durable execution. For Vox, mesh-durable execution is a forward hook only: @durable / workflow / activity are parsed and lowered per AGENTS.md §Grammar Unification, but durable replay/cron semantics are not production-complete — see durability-runtime-audit-2026.md and ADR-028 proposal.

Strategic anchor: The SCIENTIA self-publication program targets longitudinal provider observability and publication-quality outputs (scientia-self-publication-finalization-plan-2026.md). The deep-research pipeline is the substrate that can feed evidence bundles into that loop when paired with scientia-mesh-integration-research-2026.md signal families (DiscoverySignalFamily, FindingCandidateClass).

Non-duplication: Tavily endpoint shapes, secrets lifecycle, pricing, fail-open rules, and Firecrawl comparison live in docs/src/reference/tavily-integration-ssot.md. This document links there instead of copying tables.


2. Disambiguation: what people mean by “Claw”

Voice transcription often yields “Claw” without specifying the product. Three distinct references appear in industry and this repo:

| Row | What it is | Typical UX | Vox relevance |
| --- | --- | --- | --- |
| A. OpenClaw Deep Research Agent | Skill/agent pattern in the OpenClaw ecosystem (multi-round web search, structured report, configurable iterations). | Async-ish batch runs (minutes), markdown/HTML output. | Closest analog: vox-search::crag::CragRouter + WebSearchDispatcher, shipped in vox-search; must be driven from orchestrator research (§6). |
| B. SearchClaw / ScienceClaw / ClawHub “Academic Deep Research” | Research harnesses with explicit quality gates (citation counts, source diversity), many literature APIs, checkpointed workflows. | Long runs, academic citation styles. | Partial: diagnostics exist in run_research (RetrievalDiagnostics); not built: minimum citation diversity enforcement, APA tooling, 77-database integrations. |
| C. Anthropic Claude Research mode | Hosted Claude capability for web research with inline citations (consumer + API surfaces). | Sync/async report with citations. | Not orchestrated by Vox: we may call Anthropic as an LLM backend for synthesis/judge (stages.rs) but do not invoke Claude’s hosted “Research” product as a black box. |

An OpenClaw docs refresh for operators already exists on the CLI path (db_research/refresh.rs, OPENCLAW_REFRESH_URLS).


Columns: triggering UX, planning, retrieval tools, memory/session, citations, cost/latency, access, limitations, Vox analog (file or verdict).

3.1 Google Gemini Deep Research / Deep Research Max

| Dimension | Notes |
| --- | --- |
| UX | Consumer app + API (Interactions / agent surfaces); “Max” variant emphasizes higher search/token budgets and async completion. |
| Planning | Iterative plan → search/read → gap fill → report. |
| Tools | Web search/browse; MCP connectors; Workspace connectors in consumer SKU. |
| Memory | Session-bound; export report artifacts. |
| Citations | Report-style citations (implementation details are vendor-side). |
| Cost/latency | High token + many search steps on “Max”; vendor-metered. |
| Access | Google AI / Gemini API; Google account / Cloud billing. |
| Limits | Vendor lock-in; enterprise data residency policies; eval claims require primary citations. |
| Vox analog | planner.rs::decompose_query_with_config is a STUB (passthrough). Retrieval: web_gather.rs now delegates to the vox-search web tier (Phase 1 shipped in this workstream). |

Primary sources (appendix §10).

3.2 OpenClaw Deep Research Agent (skill ecosystem)

| Dimension | Notes |
| --- | --- |
| UX | Skill/config driven; multi-round search (often ~5), cross-source validation narrative. |
| Planning | Prompt/scaffolding defines rounds and output shape. |
| Tools | Gateway-discovered tools + HTTP skills; web search depends on deployment. |
| Memory | Skill/session dependent. |
| Citations | Markdown reports with links. |
| Cost/latency | Token-heavy; operator-hosted gateway. |
| Access | OpenClaw gateway + skills marketplace/docs. |
| Limits | Ecosystem fragmentation; skill quality varies by publisher. |
| Vox analog | CragRouter + policy web_search_max_hops, reused from the orchestrator after the initial web gather (Phase 2 in this workstream). |

3.3 SearchClaw / ScienceClaw / ClawHub academic flows

| Dimension | Notes |
| --- | --- |
| UX | Benchmark-oriented harnesses (e.g. BrowseComp claims for SearchClaw); academic checkpoints. |
| Planning | Decomposition + structured evidence trails. |
| Tools | Many APIs (Semantic Scholar, arXiv, news, …). |
| Memory | Persistent harness state across sessions (paper narrative). |
| Citations | Minimum counts / diversity constraints (SearchClaw “harness engineering”). |
| Cost/latency | API-rate-limit sensitive. |
| Access | GitHub / skill hubs. |
| Limits | Ops burden to keep API keys and rate limits healthy. |
| Vox analog | Not built as a dedicated harness; closest telemetry is RetrievalDiagnostics + a future citation-diversity gate (Phase 2 backlog). |

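
Since the citation-diversity gate is listed as future work, the following is only a sketch of what a minimum-count and source-diversity check could look like. The names `CitationGate`, `GateVerdict`, `host_of`, and `check` are hypothetical, the thresholds are placeholders, and the host parsing is deliberately simplistic (http/https only, no IPv6 literals).

```rust
use std::collections::BTreeSet;

/// Hypothetical thresholds; the document only says such a gate is future work.
#[derive(Debug, Clone, Copy)]
pub struct CitationGate {
    pub min_citations: usize,
    pub min_distinct_hosts: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateVerdict {
    Pass,
    TooFewCitations { found: usize, required: usize },
    TooFewHosts { found: usize, required: usize },
}

/// Extract the lowercased host from an http(s) URL using only std.
/// Assumption: no IPv6 literal hosts; anything else returns None.
pub fn host_of(url: &str) -> Option<String> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    // Drop optional userinfo ("user@") and port (":443").
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// Check citation count first, then the number of distinct hosts.
pub fn check(gate: CitationGate, cited_urls: &[&str]) -> GateVerdict {
    if cited_urls.len() < gate.min_citations {
        return GateVerdict::TooFewCitations {
            found: cited_urls.len(),
            required: gate.min_citations,
        };
    }
    let hosts: BTreeSet<String> = cited_urls.iter().filter_map(|u| host_of(u)).collect();
    if hosts.len() < gate.min_distinct_hosts {
        return GateVerdict::TooFewHosts {
            found: hosts.len(),
            required: gate.min_distinct_hosts,
        };
    }
    GateVerdict::Pass
}
```

Returning a structured verdict rather than a boolean lets the caller surface the shortfall in diagnostics, which fits the existing telemetry-first approach.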

3.4 Anthropic Claude Research mode

| Dimension | Notes |
| --- | --- |
| UX | Hosted research reports from Claude apps/API. |
| Planning | Closed-source agent loop. |
| Tools | Web search / browsing (vendor-side). |
| Citations | Inline citations in output. |
| Limits | Not portable across providers; policy constraints. |
| Vox analog | Not integrated: Vox keeps retrieval in vox-search and uses LLM endpoints for synthesis/judge only. |

All endpoint and pricing tables: tavily-integration-ssot.md.

| Endpoint | In Vox today |
| --- | --- |
| /search | Yes: TavilySearchClient::search via WebSearchDispatcher Tier 4 when policy enables it and prior tiers return nothing. |
| /extract | Not wired in orchestrator research (future: weak-snippet uplift). |
| /research | Not wired (would collapse multi-hop into one vendor call; evaluate cost/benefit vs native CRAG). |
| /crawl | Not wired into research pipeline (doc ingestion uses other paths). |

| Product | Role | Vox stance |
| --- | --- | --- |
| Perplexity Pro / ChatGPT Deep Research / You.com | Closed UX + vendor search stacks | Benchmark UX only; no dependency for core pipeline. |
| Exa / Bright Data SERP | Alternative search/extract vendors | Policy comparison only; Tavily SSOT already notes SERP patterns. |

Canonical retrieval policy and corpus matrix: search-retrieval-ssot-2026.md.

| Source | In repo today | Slot | Secrets / env |
| --- | --- | --- | --- |
| SearXNG | Tier 2 in web_dispatcher.rs; sidecar via vox research up (research/infra.rs) | Primary self-hosted web tier | VOX_SEARCH_SEARXNG_URL etc. via SearchPolicy::from_env |
| DuckDuckGo | Tier 3 fallback | Free fallback when SearXNG is empty or fails | Policy toggle duckduckgo_fallback_enabled |
| Tavily | Tier 4 when configured | Low-friction ranked snippets | TavilyApiKey + VOX_SEARCH_TAVILY_* (see Tavily SSOT) |
| Wikipedia / Wikidata | Not wired | Tier 1.5 high-trust factual blurbs | Future: register read-only HTTP (likely no secret); add env registry row in contracts/config/env-vars.v1.yaml if introducing VOX_SEARCH_WIKI_* toggles |
| arXiv API | Not wired | STEM literature slice | Future SecretId only if using authenticated tier |
| Crossref REST | Not wired | DOI metadata | Polite pool + optional mailto; register env var if adding |
| Semantic Scholar Graph API | Not wired | Citation expansion | Plan mentions SecretId::SemanticScholarApiKey; not implemented |
| Internet Archive Wayback | Not wired | Dead-link recovery | Respect IA terms; throttle |
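
The tier ordering in the table (SearXNG, then DuckDuckGo, then Tavily, each gated by policy) can be sketched as a first-non-empty fallback chain. This is a simplified model under stated assumptions: `Tier`, `Policy`, `Backend`, and `dispatch` are hypothetical stand-ins, not the actual `WebSearchDispatcher` or `SearchPolicy` types, and only `duckduckgo_fallback_enabled` is a field name taken from this document.

```rust
// Sketch only: a simplified model of policy-gated tier fallback.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    SearXng,
    DuckDuckGo,
    Tavily,
}

/// Hypothetical stand-in for the real search policy.
#[derive(Debug, Clone, Copy)]
pub struct Policy {
    pub searxng_enabled: bool,
    pub duckduckgo_fallback_enabled: bool,
    pub tavily_enabled: bool,
}

pub trait Backend {
    fn tier(&self) -> Tier;
    /// Returns result URLs, or an error message if the backend failed.
    fn search(&self, query: &str) -> Result<Vec<String>, String>;
}

/// Try backends in the order given; return the first non-empty result set.
/// A disabled, failing, or empty tier falls through to the next one.
pub fn dispatch(
    policy: Policy,
    backends: &[&dyn Backend],
    query: &str,
) -> Option<(Tier, Vec<String>)> {
    for backend in backends {
        let enabled = match backend.tier() {
            Tier::SearXng => policy.searxng_enabled,
            Tier::DuckDuckGo => policy.duckduckgo_fallback_enabled,
            Tier::Tavily => policy.tavily_enabled,
        };
        if !enabled {
            continue;
        }
        match backend.search(query) {
            Ok(hits) if !hits.is_empty() => return Some((backend.tier(), hits)),
            // Empty results and errors are treated alike: move on.
            _ => {}
        }
    }
    None
}
```

Because a paid tier is only reached after every earlier tier has returned nothing, the metered backend is consulted as a last resort, which matches the "when policy enables and prior tiers empty" condition for Tavily.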

5. Gap analysis — stubs vs shipped surfaces

5.1 PHASE_0a_STUB modules in crates/vox-orchestrator/src/dei_shim/research/

| Module | Role today | Replacement / delegation target |
| --- | --- | --- |
| planner.rs | Passthrough single subquery | Future: LLM/Mens decomposition; not replaced in this workstream |
| provider.rs | Empty search/map_site | Future: unify with vox-search providers / mesh ProviderObservation; not replaced here |
| web_gather.rs | Was empty | WebSearchDispatcher::search + CRAG refinements (web_dispatcher.rs, crag.rs); implemented |
| claims.rs | Empty claims | Future vox-claim-extractor per module header; stub |
| verifier.rs | Empty verdicts | Future verifier wiring; stub |
| model_select.rs | Static fallbacks | Future registry merge; stub |
| pipeline_cache.rs | Always miss | Future list_memories_by_type; stub |
| pipeline.rs | Session id 0, persistence comments | Future vox-db methods; stub |

| Capability | Status |
| --- | --- |
| MCP vox_memory_search | Shipped (handlers_memory.rs) |
| MCP vox_research_run | Shipped (this workstream) |
| CLI vox research run | Shipped (this workstream) |
| CLI vox research eval | Shipped (eval.rs); golden queries extended |

run_search_with_verification performs a full corpus retrieval pass (memory, chunks, repo, web…) and requires a SearchRuntimeContext. The orchestrator web_gather path intentionally uses WebSearchDispatcher for bounded web retrieval without requiring DB/memory paths on every research caller. A future bridge can attach SearchRuntimeContext from MCP ServerState when research is invoked server-side.


Phase 1 — Unblock web retrieval (done here)

  • Implement gather_web_hits_for_plan using SearchPolicy::from_env() + WebSearchDispatcher::search.
  • Respect ResearchScope::Local (skip web) and ResearchQuery::site_scope (post-filter host).
  • Map HybridSearchHit → ResearchHit.
  • After the initial subqueries, while hops remain, call CragRouter::expand_queries_from_partial_evidence and should_continue, which compares the average score against a target of 0.75; the loop is capped by policy.web_search_max_hops.
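
The stop rule in the last bullet can be illustrated with a small standalone function. This is a sketch under assumptions: it is not the actual `CragRouter::should_continue` signature, and it assumes "continue" means the average score is still below the target while hop budget remains. The free functions `average_score` and `should_continue` are hypothetical.

```rust
// Sketch only: a standalone model of the hop-continuation rule.

/// Mean of the hit scores; an empty slice counts as zero evidence.
pub fn average_score(scores: &[f32]) -> f32 {
    if scores.is_empty() {
        return 0.0;
    }
    scores.iter().sum::<f32>() / scores.len() as f32
}

/// Continue expanding only if evidence is below target AND hops remain.
pub fn should_continue(scores: &[f32], target: f32, hops_used: u32, max_hops: u32) -> bool {
    hops_used < max_hops && average_score(scores) < target
}
```

With a target of 0.75, scores of 0.9 and 0.8 stop the loop immediately, while scores of 0.5 and 0.6 trigger another hop as long as the cap has not been reached.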

Phase 3 — CLI + MCP + contracts (done here)

  • vox research run <query> [--json] [--scope ...]
  • vox_research_run MCP tool returning JSON ResearchResult.
  • Operations catalog + MCP registry rows regenerated via vox ci operations-sync --target cli --write.

Phase 4 — Scientia + mesh (forward hooks)

| Risk | Mitigation |
| --- | --- |
| Web ToS / robots | Honor scraper_robots_txt_respect; prefer APIs for Wikipedia/arXiv when added |
| Tavily spend | Session budget + fail-open behavior (see Tavily SSOT) |
| Secret leakage | Never call std::env::var("TAVILY_API_KEY") in consumers; see the secrets policy (.cursor/rules/secrets-policy.mdc) |
| Prompt injection from pages | Treat snippets as untrusted; truncate per policy |
| Non-deterministic CI | Smoke tests allow empty web hits offline; live test is #[ignore] |
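
For the prompt-injection row, one way to make "untrusted and truncated" hard to forget is a wrapper type that can only be built through sanitization. The sketch below is illustrative: `UntrustedSnippet` and its methods are hypothetical names, the character limit is a caller-supplied placeholder rather than a real policy field, and truncation alone does not neutralize injected instructions.

```rust
// Sketch only: a newtype marking page-derived text as untrusted.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UntrustedSnippet(String);

impl UntrustedSnippet {
    /// Normalize whitespace to spaces, drop remaining control characters,
    /// and keep at most `max_chars` characters (counted as Unicode scalars,
    /// so multi-byte characters are never split).
    pub fn from_page(raw: &str, max_chars: usize) -> Self {
        let cleaned: String = raw
            .chars()
            .map(|c| if c.is_whitespace() { ' ' } else { c })
            .filter(|c| !c.is_control())
            .take(max_chars)
            .collect();
        UntrustedSnippet(cleaned)
    }

    /// Read-only access; callers must still treat the text as data, not instructions.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Counting characters instead of bytes avoids panics from slicing in the middle of a UTF-8 sequence, at the cost of a slightly looser bound on byte length.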

| Command | Purpose |
| --- | --- |
| cargo test -p vox-orchestrator | Unit + integration smoke |
| cargo test -p vox-search | Retrieval regression |
| vox research run "..." --json | Manual end-to-end (needs network / keys) |
| vox research eval | Harness writes metrics rows; extend golden queries in eval.rs |
| vox ci command-compliance / vox ci operations-verify | Contract hygiene after catalog edits |

  • Architecture doc at docs/src/architecture/deep-research-prior-art-and-vox-roadmap-2026.md
  • Tavily SSOT cited, not duplicated
  • Scientia finalization plan referenced
  • Stub inventory + vox-search mapping (§5)
  • research-index.md, where-things-live.md, search-retrieval-ssot-2026.md cross-links
  • CLI + MCP surfaces shipped
  • No new shell/Python automation

Captured 2026-05-11 (verify periodically; vendor URLs drift).

| Topic | URL |
| --- | --- |
| Gemini Deep Research (developers blog) | https://blog.google/innovation-and-ai/technology/developers-tools/deep-research-agent-gemini-api/ |
| Gemini Deep Research Max announcement | https://blog.google/innovation-and-ai/models-and-research/gemini-models/next-generation-gemini-deep-research/ |
| Gemini consumer overview | https://gemini.google/overview/deep-research/ |
| OpenClaw docs (gateway; refresh list in CLI) | https://openclawlab.com/en/docs/gateway/protocol/ |
| OpenClaw Deep Research skill (tutorial mirror) | https://openclaw.com/en/skills/deepresearchagent.html |
| SearchClaw repository | https://github.com/RUC-NLPIR/SearchClaw |
| ScienceClaw repository | https://github.com/Zaoqu-Liu/ScienceClaw |
| Anthropic news index (search “Research”) | https://www.anthropic.com/news |
| Tavily docs | https://docs.tavily.com/ |
| SearXNG project | https://github.com/searxng/searxng |
| DuckDuckGo | https://duckduckgo.com |
| Wikipedia API | https://www.mediawiki.org/wiki/API:Main_page |
| Wikidata API | https://www.wikidata.org/wiki/Wikidata:Data_access |
| arXiv API | https://info.arxiv.org/help/api/index.html |
| Crossref REST API | https://github.com/CrossRef/rest-api-doc |
| Semantic Scholar API | https://api.semanticscholar.org/ |
| Internet Archive Wayback | https://archive.org/help/wayback_api.php |