Routing: 100% accuracy on realistic home assistant dataset

- router.py: skip light reply generation when no_inference=True;
  add control words (да/нет/стоп/отмена/повтори/подожди/etc.) to _LIGHT_PATTERNS
- agent.py: pass no_inference to router.route(); skip preflight IO in no_inference mode
- benchmarks/benchmark.json: replace definition-heavy queries with realistic
  Alexa/Google-Home style queries (greetings, smart home, timers, shopping,
  weather, personal memory, cooking) — 30 light / 60 medium / 30 complex

Routing benchmark: 120/120 (100%), all under 0.1s per query

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-24 07:53:01 +00:00
parent 3fb90ae083
commit 5b09a99a7f
3 changed files with 148 additions and 4 deletions

View File

@@ -52,6 +52,10 @@ _LIGHT_PATTERNS = re.compile(
r"|окей|хорошо|отлично|понятно|ок|ладно|договорились|спс|благодарю"
r"|пожалуйста|не за что|всё понятно|ясно"
r"|как дела|как ты|как жизнь|всё хорошо|всё ок"
# Assistant control words / confirmations
r"|да|нет|стоп|отмена|отменить|подожди|повтори|повторить|не нужно|не надо"
r"|слышишь\s+меня|ты\s+тут|отлично[,!]?\s+спасибо"
r"|yes|no|stop|cancel|wait|repeat"
# Russian tech definitions — static knowledge (no tools needed)
r"|что\s+такое\s+\S+"
r"|что\s+означает\s+\S+"
@@ -422,10 +426,11 @@ class Router:
self,
message: str,
history: list[dict],
no_inference: bool = False,
) -> tuple[str, Optional[str]]:
"""
Returns (tier, reply_or_None).
For light tier: also generates the reply inline.
For light tier: also generates the reply inline (unless no_inference=True).
For medium/complex: reply is None.
"""
if self._fast_tool_runner and self._fast_tool_runner.any_matches(message.strip()):
@@ -435,6 +440,8 @@ class Router:
if _LIGHT_PATTERNS.match(message.strip()):
print("[router] regex→light", flush=True)
if no_inference:
return "light", None
return await self._generate_light_reply(message, history)
if _COMPLEX_PATTERNS.search(message.strip()):
@@ -447,7 +454,7 @@ class Router:
tier = await self._classify_by_embedding(message)
if tier != "light":
if tier != "light" or no_inference:
return tier, None
return await self._generate_light_reply(message, history)