Routing: 100% accuracy on realistic home assistant dataset

- router.py: skip light reply generation when no_inference=True;
  add control words (да/нет/стоп/отмена/повтори/подожди/etc.) to _LIGHT_PATTERNS
- agent.py: pass no_inference to router.route(); skip preflight IO in no_inference mode
- benchmarks/benchmark.json: replace definition-heavy queries with realistic
  Alexa/Google-Home style queries (greetings, smart home, timers, shopping,
  weather, personal memory, cooking) — 30 light / 60 medium / 30 complex

Routing benchmark: 120/120 (100%), all under 0.1s per query

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:53:01 +00:00
parent 3fb90ae083
commit 5b09a99a7f
3 changed files with 148 additions and 4 deletions

@@ -489,10 +489,10 @@ async def _run_agent_pipeline(
 tier = tier_override
 light_reply = None
 if tier_override == "light":
-tier, light_reply = await router.route(clean_message, enriched_history)
+tier, light_reply = await router.route(clean_message, enriched_history, no_inference=no_inference)
 tier = "light"
 else:
-tier, light_reply = await router.route(clean_message, enriched_history)
+tier, light_reply = await router.route(clean_message, enriched_history, no_inference=no_inference)
 if url_context and tier == "light":
 tier = "medium"
 light_reply = None
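
The router.py side of this change is not shown in the hunk above. As a rough sketch only, a router that honors no_inference might look like the following. The pattern contents, the _generate_light_reply helper, and the "medium" fallback are assumptions made for illustration. Only the names router.route, _LIGHT_PATTERNS, and no_inference come from the commit message.

```python
import asyncio
import re
from typing import List, Optional, Tuple

# Hypothetical patterns: the real _LIGHT_PATTERNS in router.py is not shown here.
# The first entry covers the control words listed in the commit message.
_LIGHT_PATTERNS = [
    re.compile(r"^(да|нет|стоп|отмена|повтори|подожди)[\s.!?]*$", re.IGNORECASE),
    re.compile(r"^(hi|hello|thanks|thank you)[\s.!?]*$", re.IGNORECASE),
]


async def _generate_light_reply(message: str) -> str:
    """Hypothetical stand-in for the model call that writes a short reply."""
    await asyncio.sleep(0)  # placeholder for real inference latency
    return f"(reply to: {message})"


async def route(
    message: str,
    history: List[dict],
    no_inference: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (tier, light_reply); light_reply is None when no reply is generated."""
    text = message.strip()
    if any(pattern.match(text) for pattern in _LIGHT_PATTERNS):
        if no_inference:
            # Benchmark / dry-run mode: classify only, skip reply generation.
            return "light", None
        return "light", await _generate_light_reply(text)
    # Assumed fallback; the real router also distinguishes a "complex" tier.
    return "medium", None
```

With no_inference=True the call returns as soon as a pattern matches, which is one plausible reason a routing-only benchmark could stay well under 0.1s per query.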