# Agent Pipeline Rules

## Tiers

- Routing is fully automatic: router classifies into light/medium/complex via 3-way embedding similarity.
- Complex tier is reached automatically for deep research queries; no prefix required.
- Medium is the default tier. Light is only for trivial static-knowledge queries matched by regex or embedding.
- Light tier upgrade to medium is automatic when URL content is pre-fetched or a fast tool matches.
- `tier_override` API parameter still allows callers to force a specific tier (e.g. `adolf-deep` model → complex).

## Medium agent

- `_DirectModel` makes a single `ainvoke()` call with no tool schema. Do not add tools to the medium agent.
- `qwen3:4b` behaves unreliably when a tool array is present in the request; inject context via system prompt instead.

## Memory

- `add_memory` and `search_memory` are called directly in `run_agent_task()`, outside the agent loop.
- Never add memory tools to any agent's tool list.
- Memory storage (`_store_memory`) runs as an asyncio background task after the semaphore is released.

## Fast tools

- `FastToolRunner.run_matching()` runs in the pre-flight `asyncio.gather` alongside URL fetch and memory retrieval.
- Fast tool results are injected as a system prompt block, not returned to the user directly.
- When `any_matches()` is true, the router forces medium tier before LLM classification.
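
The rules above can be sketched as a small, self-contained example. This is an illustrative sketch, not the project's actual code: `resolve_tier`, `preflight`, and the three stub coroutines are hypothetical names, the stub return values are made up, and the precedence of `tier_override` over the fast-tool rule is an assumption. It shows the pre-flight `asyncio.gather`, the forced and upgraded medium tier, and context being injected through the system prompt rather than through a tool schema.

```python
import asyncio
from typing import Optional

TIERS = ("light", "medium", "complex")


def resolve_tier(
    classified: str,
    *,
    tier_override: Optional[str] = None,
    fast_tool_matched: bool = False,
    url_content: Optional[str] = None,
) -> str:
    """Combine the tier rules from this document (ordering is an assumption)."""
    if tier_override is not None:
        # Caller forced a tier through the API parameter.
        if tier_override not in TIERS:
            raise ValueError(f"unknown tier: {tier_override!r}")
        return tier_override
    if fast_tool_matched:
        # A fast tool match forces medium regardless of the classified tier.
        return "medium"
    if classified == "light" and url_content:
        # Light is upgraded to medium when URL content was pre-fetched.
        return "medium"
    return classified


# Hypothetical stand-ins for the real URL fetch, memory search, and fast tools.
async def fetch_url_content(message: str) -> Optional[str]:
    return "page text" if "http" in message else None


async def retrieve_memories(message: str) -> list[str]:
    return ["user prefers metric units"]


async def run_fast_tools(message: str) -> list[str]:
    return ["weather: 12 C, cloudy"] if "weather" in message else []


async def preflight(message: str, classified: str) -> tuple[str, str]:
    # The three lookups run concurrently in one gather call.
    url_content, memories, fast_results = await asyncio.gather(
        fetch_url_content(message),
        retrieve_memories(message),
        run_fast_tools(message),
    )
    tier = resolve_tier(
        classified,
        fast_tool_matched=bool(fast_results),
        url_content=url_content,
    )
    # All context is placed in the system prompt; no tool array is built.
    blocks = ["You are a helpful assistant."]
    if memories:
        blocks.append("Known facts about the user:\n" + "\n".join(memories))
    if fast_results:
        blocks.append("Fast tool results:\n" + "\n".join(fast_results))
    if url_content:
        blocks.append("Fetched page content:\n" + url_content)
    return tier, "\n\n".join(blocks)


if __name__ == "__main__":
    tier, system_prompt = asyncio.run(preflight("weather in Oslo?", "light"))
    print(tier)
    print(system_prompt)
```

Keeping the tier decision in a pure function like `resolve_tier` makes the precedence rules easy to unit test without touching the model or the network.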