
Alvis 957360f6ce Restructure CLAUDE.md per official Claude Code recommendations

CLAUDE.md: 178→25 lines — commands + @ARCHITECTURE.md import only

Rules split into .claude/rules/ (load at startup, topic-scoped):
  llm-inference.md  — Bifrost-only, semaphore, model name format, timeouts
  agent-pipeline.md — tier rules, no tools in medium, memory outside loop
  fast-tools.md     — extension guide (path-scoped: fast_tools.py + agent.py)
  secrets.md        — .env keys, Vaultwarden, no hardcoding

Path-scoped rule: fast-tools.md only loads when editing fast_tools.py or agent.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-13 07:19:09 +00:00

1.1 KiB


paths:
- fast_tools.py
- agent.py

Fast Tools — Extension Guide

To add a new fast tool:

In fast_tools.py, subclass FastTool and implement:
- name (str property) — unique identifier, used in logs
- matches(message: str) -> bool — regex or logic; keep it cheap, runs on every message
- run(message: str) -> str — async coroutine that fetches data; return a short context block, or "" (or a [tool error: ...] string) on failure; never raise
In agent.py, add an instance to the _fast_tool_runner list (module level, after env vars are defined).
The router will automatically force medium tier when matches() returns true — no router changes needed.

Constraints

run() must return in under 15s — it runs in the pre-flight gather that blocks routing.
Return "" or a [tool error: ...] string on failure — never raise exceptions.
Keep returned context under ~1000 chars — larger contexts slow down qwen3:4b streaming significantly.
The deepagents container has no direct external internet. Use SearXNG (host.docker.internal:11437) or internal services.
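
One way to enforce these constraints in a single place is a small wrapper around the fetch coroutine. The sketch below is an assumption about how that could look, not code from the repository: guarded_run, searxng_url, and the constant names are hypothetical, the 12-second budget is an arbitrary value chosen to leave headroom under the 15s limit, and the http scheme for the SearXNG host is assumed. The /search endpoint with format=json is SearXNG's standard search API, but it only answers if the json format is enabled on that instance.

```python
import asyncio
from typing import Awaitable, Callable
from urllib.parse import urlencode

RUN_TIMEOUT_S = 12.0       # assumed budget, leaves headroom under the 15s limit
MAX_CONTEXT_CHARS = 1000   # from the "~1000 chars" guidance
SEARXNG_BASE = "http://host.docker.internal:11437"  # scheme is an assumption


def searxng_url(query: str) -> str:
    """Build a SearXNG JSON search URL for the given query."""
    return f"{SEARXNG_BASE}/search?{urlencode({'q': query, 'format': 'json'})}"


async def guarded_run(
    fetch: Callable[[str], Awaitable[str]],
    message: str,
    timeout: float = RUN_TIMEOUT_S,
) -> str:
    """Run fetch(message) with a time budget; always return a bounded string."""
    try:
        result = await asyncio.wait_for(fetch(message), timeout=timeout)
    except asyncio.TimeoutError:
        result = f"[tool error: timed out after {timeout:g}s]"
    except Exception as exc:  # convert any failure into a string
        result = f"[tool error: {exc}]"
    # Truncate both normal output and error text to the context budget.
    return result[:MAX_CONTEXT_CHARS]
```

A tool's run() could then delegate to guarded_run(self._fetch, message), keeping the timeout, error handling, and truncation out of each individual tool.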
