# Agent Pipeline Rules

## Tiers
- Routing is fully automatic: router classifies into light/medium/complex via 3-way embedding similarity.
- Complex tier is reached automatically for deep research queries — no prefix required.
- Medium is the default tier. Light is only for trivial static-knowledge queries matched by regex or embedding.
- Light tier upgrade to medium is automatic when URL content is pre-fetched or a fast tool matches.
- `tier_override` API parameter still allows callers to force a specific tier (e.g. `adolf-deep` model → complex).

## Medium agent
- `_DirectModel` makes a single `ainvoke()` call with no tool schema. Do not add tools to the medium agent.
- `qwen3:4b` behaves unreliably when a tool array is present in the request — inject context via system prompt instead.

## Memory
- `add_memory` and `search_memory` are called directly in `run_agent_task()`, outside the agent loop.
- Never add memory tools to any agent's tool list.
- Memory storage (`_store_memory`) runs as an asyncio background task after the semaphore is released.

## Fast tools
- `FastToolRunner.run_matching()` runs in the pre-flight `asyncio.gather` alongside URL fetch and memory retrieval.
- Fast tool results are injected as a system prompt block, not returned to the user directly.
- When `any_matches()` is true, the router forces medium tier before LLM classification.