Routing: 100% accuracy on realistic home assistant dataset

- router.py: skip light reply generation when no_inference=True; add control words (да/нет/стоп/отмена/повтори/подожди/etc.) to _LIGHT_PATTERNS
- agent.py: pass no_inference to router.route(); skip preflight IO in no_inference mode
- benchmarks/benchmark.json: replace definition-heavy queries with realistic Alexa/Google-Home style queries (greetings, smart home, timers, shopping, weather, personal memory, cooking): 30 light / 60 medium / 30 complex

Routing benchmark: 120/120 (100%), all under 0.1s per query

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:53:01 +00:00
parent 3fb90ae083
commit 5b09a99a7f
3 changed files with 148 additions and 4 deletions
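
For readers without router.py at hand, the behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the regex contents, the `_generate_light_reply` helper, and the "medium" fallback are assumptions made for the example. Only the `route(message, history, no_inference=...)` call shape and the `(tier, light_reply)` return pair are taken from the diff.

```python
import asyncio
import re

# Hypothetical pattern: the real _LIGHT_PATTERNS in router.py is not shown in
# this diff. Only a few of the control words named in the commit message are
# listed here (yes / no / stop / cancel / repeat / wait).
_LIGHT_PATTERNS = re.compile(
    r"^\s*(да|нет|стоп|отмена|повтори|подожди)[\s.!?]*$",
    re.IGNORECASE,
)


async def _generate_light_reply(message: str) -> str:
    # Hypothetical stand-in for the model call that writes a short reply.
    return f"ok: {message.strip()}"


async def route(message, history, no_inference=False):
    """Return (tier, light_reply) for a message.

    With no_inference=True the tier is still decided, but no reply is
    generated, so the call stays cheap enough for benchmarking.
    """
    if _LIGHT_PATTERNS.match(message):
        if no_inference:
            return "light", None
        return "light", await _generate_light_reply(message)
    # Assumption: everything else falls through to "medium" in this sketch;
    # the real router presumably also distinguishes "complex".
    return "medium", None


if __name__ == "__main__":
    print(asyncio.run(route("стоп", [], no_inference=True)))
    print(asyncio.run(route("what's the weather tomorrow?", [])))
```

In this sketch, a benchmark run would call `route(..., no_inference=True)` for every query and compare only the returned tier, which is consistent with the agent.py call sites below forwarding `no_inference` unchanged.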
--- a/agent.py
+++ b/agent.py
@@ -489,10 +489,10 @@ async def _run_agent_pipeline(
                    tier = tier_override
                    light_reply = None
                    if tier_override == "light":
-                        tier, light_reply = await router.route(clean_message, enriched_history)
+                        tier, light_reply = await router.route(clean_message, enriched_history, no_inference=no_inference)
                        tier = "light"
                else:
-                    tier, light_reply = await router.route(clean_message, enriched_history)
+                    tier, light_reply = await router.route(clean_message, enriched_history, no_inference=no_inference)
                    if url_context and tier == "light":
                        tier = "medium"
                        light_reply = None