• Joined on 2025-12-21
alvis pushed to main at alvis/adolf 2026-03-24 02:43:26 +00:00
0b428e4ada Merge pull request 'Fix benchmark log extraction: first tier match, increase log tail to 300' (#12) from fix/benchmark-log-extraction into main
98095679be Fix benchmark log extraction: first tier match, increase log tail to 300
alvis merged pull request alvis/adolf#12 2026-03-24 02:43:26 +00:00
Fix benchmark log extraction: first tier match, increase log tail to 300
alvis closed issue alvis/adolf#7 2026-03-24 02:43:26 +00:00
Benchmark: ~50 queries return "?" due to tier= log extraction timeout
alvis created branch fix/benchmark-log-extraction in alvis/adolf 2026-03-24 02:42:29 +00:00
alvis pushed to fix/benchmark-log-extraction at alvis/adolf 2026-03-24 02:42:29 +00:00
98095679be Fix benchmark log extraction: first tier match, increase log tail to 300
alvis created pull request alvis/adolf#12 2026-03-24 02:42:27 +00:00
Fix benchmark log extraction: first tier match, increase log tail to 300
alvis pushed to fix/tier-logging at alvis/adolf 2026-03-24 02:42:02 +00:00
8ef4897869 Fix tier logging: capture actual_tier, fix parse_run_block regex, remove reply_text truncation
alvis created branch fix/tier-logging in alvis/adolf 2026-03-24 02:42:02 +00:00
alvis created pull request alvis/adolf#11 2026-03-24 02:42:00 +00:00
Fix tier logging: capture actual_tier, fix parse_run_block regex, remove reply_text truncation
alvis pushed to main at alvis/adolf 2026-03-24 02:14:14 +00:00
1f5e272600 Switch from Bifrost to LiteLLM; add Matrix channel; update rules
alvis pushed to main at alvis/adolf 2026-03-24 02:13:15 +00:00
54cb940279 Update docs: add benchmarks/ section, fix complex tier description
alvis pushed to main at alvis/adolf 2026-03-24 02:02:47 +00:00
bd951f943f Move benchmark scripts into benchmarks/ subdir
alvis pushed to main at alvis/adolf 2026-03-24 02:00:22 +00:00
ab68bba935 Add routing benchmark scripts; gitignore dataset and results
3ae1cefbd4 WeatherTool: fetch open-meteo directly, skip LLM for fast tool replies
alvis opened issue alvis/adolf#10 2026-03-24 01:58:46 +00:00
Benchmark: complex tier never triggered — 0% accuracy (40 queries)
alvis opened issue alvis/adolf#9 2026-03-24 01:58:18 +00:00
Benchmark: smart home commands (medium) mis-routed to light
alvis opened issue alvis/adolf#8 2026-03-24 01:58:05 +00:00
Benchmark: light tier over-classified as medium (tech definition queries)
alvis opened issue alvis/adolf#7 2026-03-24 01:57:56 +00:00
Benchmark: ~50 queries return "?" due to tier= log extraction timeout
alvis opened issue alvis/adolf#3 2026-03-24 01:56:50 +00:00
Fix reply_text[:200] truncation breaking bench keyword matching
alvis opened issue alvis/adolf#2 2026-03-24 01:56:50 +00:00
Verify [agent] running: log anchor still emitted in new agent.py
alvis opened issue alvis/adolf#1 2026-03-24 01:56:50 +00:00
Fix parse_run_block regex to match new log format