adolf/benchmarks/benchmark.json at main

alvis 5b09a99a7f Routing: 100% accuracy on realistic home assistant dataset

- router.py: skip light reply generation when no_inference=True;
  add control words (да/нет/стоп/отмена/повтори/подожди/etc.; Russian for
  yes/no/stop/cancel/repeat/wait) to _LIGHT_PATTERNS
- agent.py: pass no_inference to router.route(); skip preflight IO in no_inference mode
- benchmarks/benchmark.json: replace definition-heavy queries with realistic
  Alexa/Google-Home style queries (greetings, smart home, timers, shopping,
  weather, personal memory, cooking) — 30 light / 60 medium / 30 complex
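
As a rough illustration of the routing change described above, the sketch below shows one way a pattern-based light tier with a no_inference flag could look. Only the names `_LIGHT_PATTERNS`, `no_inference`, and `route` come from the commit message; the regex form, the `RouteResult` type, the fallback tier, and the reply stub are assumptions, not the project's actual code.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern list: the commit names _LIGHT_PATTERNS and the control
# words, but the exact regex form here is an assumption.
_LIGHT_PATTERNS = [
    re.compile(r"^(да|нет|стоп|отмена|повтори|подожди)[.!?]*$", re.IGNORECASE),
]


@dataclass
class RouteResult:
    tier: str                    # "light" or "medium" in this sketch
    reply: Optional[str] = None  # stays None when no reply was generated


def _generate_light_reply(text: str) -> str:
    # Stand-in for a model call; the real reply generation is not shown in the source.
    return f"ok: {text}"


def route(text: str, no_inference: bool = False) -> RouteResult:
    normalized = text.strip()
    if any(p.match(normalized) for p in _LIGHT_PATTERNS):
        # In no_inference mode only the tier is decided; no reply is generated.
        reply = None if no_inference else _generate_light_reply(normalized)
        return RouteResult("light", reply)
    # Assumed fallback; the real router also distinguishes a complex tier.
    return RouteResult("medium")
```

Calling `route("Стоп", no_inference=True)` in this sketch classifies the query as light without producing a reply, which is the behavior a routing-only benchmark would need.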

Routing benchmark: 120/120 (100%), all under 0.1s per query
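
The 120/120 figure is a plain accuracy count over labeled queries. A minimal sketch of that calculation follows, assuming each benchmark entry is a dict with hypothetical `query` and `tier` fields (the actual schema of benchmark.json is not shown here) and that the classifier is passed in as a function.

```python
from typing import Callable, Iterable, Tuple


def routing_accuracy(
    cases: Iterable[dict], classify: Callable[[str], str]
) -> Tuple[int, int]:
    """Return (correct, total) for a tier classifier over labeled cases."""
    correct = 0
    total = 0
    for case in cases:
        total += 1
        # "query" and "tier" are assumed field names, not the real schema.
        if classify(case["query"]) == case["tier"]:
            correct += 1
    return correct, total


# Toy data for illustration only; not taken from the benchmark file.
sample = [
    {"query": "привет", "tier": "light"},
    {"query": "set a timer for 10 minutes", "tier": "medium"},
]
correct, total = routing_accuracy(
    sample, lambda q: "light" if q == "привет" else "medium"
)
print(f"{correct}/{total} ({100 * correct // total}%)")
```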

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-24 07:53:01 +00:00

18 KiB
