adolf/agent.py
alvis 3fb90ae083 Skip _reply_semaphore in no_inference mode
No GPU inference happens in this mode, so serialization is not needed.
Without this, timed-out routing benchmark queries hold the semaphore
and cascade-block all subsequent queries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:40:07 +00:00
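
The change described in the commit message can be sketched as follows. This is a minimal illustration, not the actual code in adolf/agent.py: only the names `_reply_semaphore` and `no_inference` come from the commit message, while `handle_query`, `_route`, `demo`, and the `reply_semaphore` parameter are hypothetical stand-ins. It assumes the semaphore is an `asyncio.Semaphore` that serializes replies needing GPU inference.

```python
import asyncio


async def _route(query: str, delay: float) -> str:
    # Hypothetical placeholder for the routing work; the sleep simulates latency.
    await asyncio.sleep(delay)
    return f"routed:{query}"


async def handle_query(
    query: str,
    *,
    no_inference: bool,
    reply_semaphore: asyncio.Semaphore,
    delay: float = 0.0,
) -> str:
    if no_inference:
        # No GPU inference happens here, so the query is not serialized
        # and cannot be blocked by another query holding the semaphore.
        return await _route(query, delay)
    # Normal mode: serialize replies behind the semaphore.
    async with reply_semaphore:
        return await _route(query, delay)


async def demo() -> str:
    # Assumed stand-in for _reply_semaphore, created inside the running loop.
    reply_semaphore = asyncio.Semaphore(1)
    # Simulate an earlier query that still holds the semaphore.
    await reply_semaphore.acquire()
    try:
        # In no_inference mode the new query completes anyway.
        return await asyncio.wait_for(
            handle_query("q", no_inference=True, reply_semaphore=reply_semaphore),
            timeout=1.0,
        )
    finally:
        reply_semaphore.release()
```

Running `asyncio.run(demo())` completes even though the semaphore is held. The same call with `no_inference=False` would wait until the holder releases it, which is the cascade-blocking behavior the commit message describes.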

31 KiB