Commit Graph

3 Commits

Author SHA1 Message Date
77db739819 Rename --dry-run to --no-inference, apply to all tiers in run_benchmark.py
No-inference mode now skips the LLM for all tiers (not just complex),
the GPU check is skipped automatically, and the metadata key matches agent.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 03:49:09 +00:00
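
The flag rename in this commit can be illustrated with a minimal sketch. Only the `--no-inference` flag name and the fact that the GPU check is skipped come from the commit message; `build_parser` and `should_check_gpu` are hypothetical helpers, and the real argument handling in run_benchmark.py may look different.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the --no-inference flag name is from the commit."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    parser.add_argument(
        "--no-inference",
        action="store_true",
        help="Skip the LLM call for every tier, not just the complex tier",
    )
    return parser


def should_check_gpu(args: argparse.Namespace) -> bool:
    """Assumed rule: the GPU check only matters when inference will actually run."""
    return not args.no_inference
```

With this shape, `build_parser().parse_args(["--no-inference"])` yields `no_inference=True`, and the GPU check is bypassed without a separate flag.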
98095679be Fix benchmark log extraction: first tier match, increase log tail to 300
- Remove reversed() from extract_tier_from_logs: the first match is the routing decision
  (a dry-run complex request logs tier=complex early, then overwrites it with tier=medium when done)
- Increase log tail from 80→300 to handle concurrent log activity

Fixes #7, #10

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 02:42:27 +00:00
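
The first-match behaviour described in this commit can be sketched as follows. This is not the actual extract_tier_from_logs implementation: the `tier=<name>` log format is inferred from the commit message, and `first_tier_in_logs` and `last_lines` are hypothetical names used for illustration.

```python
import re
from typing import List, Optional

# Assumed log format: a "tier=<name>" token somewhere in the line.
TIER_PATTERN = re.compile(r"tier=(\w+)")


def last_lines(lines: List[str], n: int = 300) -> List[str]:
    """Keep only the last n log lines (the commit raises this from 80 to 300)."""
    return lines[-n:]


def first_tier_in_logs(lines: List[str]) -> Optional[str]:
    """Return the tier from the first matching line, scanning oldest to newest.

    Scanning forward (no reversed()) picks up the routing decision rather
    than a later line that overwrites the tier.
    """
    for line in lines:
        match = TIER_PATTERN.search(line)
        if match:
            return match.group(1)
    return None
```

For a log that contains `tier=complex` followed later by `tier=medium`, the forward scan returns `complex`, whereas iterating over `reversed(lines)` would return `medium`.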
Alvis
bd951f943f Move benchmark scripts into benchmarks/ subdir
- benchmarks/run_benchmark.py (was run_benchmark.py)
- benchmarks/run_voice_benchmark.py (was run_voice_benchmark.py)
- Scripts use Path(__file__).parent so paths resolve correctly in subdir
- .gitignore updated: ignore benchmarks/benchmark.json,
  results_latest.json, voice_results*.json, voice_audio/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 02:02:46 +00:00
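
A minimal sketch of the path handling this commit relies on: resolving data files relative to the script's own directory so the scripts work from any working directory. The helper name `data_path` is hypothetical; the file name in the usage note is taken from the .gitignore entries above.

```python
from pathlib import Path


def data_path(script_file: str, name: str) -> Path:
    """Resolve a data file next to the given script, independent of the cwd.

    script_file is expected to be the calling module's __file__.
    """
    return Path(script_file).resolve().parent / name
```

Inside benchmarks/run_benchmark.py, a call such as `data_path(__file__, "benchmark.json")` would point at benchmarks/benchmark.json whether the script is launched from the repository root or from the benchmarks/ directory.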