Embed Crawl4AI at all tiers, restore qwen3:4b medium, update docs

- Pre-routing URL fetch: any message with URLs gets content fetched async (httpx.AsyncClient) before routing via _fetch_urls_from_message()
- URL context and memories gathered concurrently with asyncio.gather
- Light tier upgraded to medium when URL content is present
- url_context injected into system prompt for medium and complex agents
- Complex agent retains web_search/fetch_url tools + receives pre-fetched content
- Medium model restored to qwen3:4b (was temporarily qwen2.5:1.5b)
- Unit tests added for _extract_urls
- ARCHITECTURE.md: added Tool Handling, Crawl4AI Integration, Memory Pipeline sections
- CLAUDE.md: updated request flow and Crawl4AI integration docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
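
For readers without the full diff, the pre-routing flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this commit: `extract_urls`, `fetch_url_context`, `prepare`, the regex, and the injected `fetch` / `load_memories` callables are all hypothetical stand-ins (the real change uses `httpx.AsyncClient` inside `_fetch_urls_from_message()`; the fetcher is injected here so the sketch runs offline).

```python
import asyncio
import re

# Assumed pattern for illustration; the project's _extract_urls may differ.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text: str) -> list[str]:
    """Return URLs in order of first appearance, without duplicates."""
    seen: list[str] = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


async def fetch_url_context(urls, fetch) -> str:
    """Fetch all URLs concurrently; failed or empty fetches are skipped."""
    results = await asyncio.gather(
        *(fetch(u) for u in urls), return_exceptions=True
    )
    parts = [
        f"[{u}]\n{r}" for u, r in zip(urls, results) if isinstance(r, str) and r
    ]
    return "\n\n".join(parts)


async def prepare(message: str, tier: str, fetch, load_memories):
    """Gather URL context and memories together, then adjust the tier."""
    urls = extract_urls(message)
    url_context, memories = await asyncio.gather(
        fetch_url_context(urls, fetch), load_memories(message)
    )
    # A light-tier request is upgraded once there is page content to read.
    if url_context and tier == "light":
        tier = "medium"
    return tier, url_context, memories


if __name__ == "__main__":
    async def fake_fetch(url: str) -> str:  # hypothetical stand-in fetcher
        return f"content of {url}"

    async def fake_memories(message: str) -> list[str]:  # hypothetical
        return ["user prefers short answers"]

    print(asyncio.run(
        prepare("summarize https://example.com", "light", fake_fetch, fake_memories)
    ))
```

Passing `return_exceptions=True` to `asyncio.gather` keeps one failed fetch from discarding the other pages; whether the actual implementation handles failures this way is not stated in the commit message.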
2026-03-12 15:49:34 +00:00
parent f9618a9bbf
commit 50097d6092
8 changed files with 183 additions and 31 deletions
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -25,7 +25,7 @@ services:
      - BIFROST_URL=http://bifrost:8080/v1
      # Direct Ollama GPU URL — used only by VRAMManager for flush/prewarm
      - OLLAMA_BASE_URL=http://host.docker.internal:11436
-      - DEEPAGENTS_MODEL=qwen2.5:1.5b
+      - DEEPAGENTS_MODEL=qwen3:4b
      - DEEPAGENTS_COMPLEX_MODEL=qwen3:8b
      - DEEPAGENTS_ROUTER_MODEL=qwen2.5:1.5b
      - SEARXNG_URL=http://host.docker.internal:11437