# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Commands

```bash
# Start all services
docker compose up --build

# Interactive CLI (requires services running)
docker compose --profile tools run --rm -it cli

# Integration tests (run from tests/integration/, requires all services)
python3 test_health.py
python3 test_memory.py [--name-only|--bench-only|--dedup-only]
python3 test_routing.py [--easy-only|--medium-only|--hard-only]

# Use case tests — read the .md file and follow its steps as Claude Code
# e.g.: read tests/use_cases/weather_now.md and execute it
```
## Key Conventions

- **Models via Bifrost only** — all LLM calls use `base_url=BIFROST_URL` with the `ollama/<model>` prefix. Never call Ollama directly for inference.
- **One inference at a time** — `_reply_semaphore` serializes GPU use. Do not bypass it.
- **No tools in medium agent** — `_DirectModel` is a plain `ainvoke()` call. Context is injected via system prompt. `qwen3:4b` is unreliable with tool schemas.
- **Fast tools are pre-flight** — `FastToolRunner` runs before routing and before any LLM call. Results are injected as context, not returned to the user directly.
- **Memory outside agent loop** — `add_memory`/`search_memory` are called directly, never passed to agent tool lists.
- **Complex tier is opt-in** — `/think` prefix only. LLM classification of "complex" is always downgraded to medium.
- **`.env` is required** — `TELEGRAM_BOT_TOKEN`, `ROUTECHECK_TOKEN`, `YANDEX_ROUTING_KEY`. Never commit it.
## Adding a Fast Tool

- Subclass `FastTool` in `fast_tools.py` — implement `name`, `matches(message) → bool`, `run(message) → str`
- Add an instance to the `_fast_tool_runner` list in `agent.py`
- The router will automatically force the medium tier when `matches()` returns true
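The steps above can be sketched as follows. This is a minimal illustration only: the `FastTool` base class shown here is inferred from the interface described in the bullets, and `TimeTool` is a hypothetical example tool, not one that exists in `fast_tools.py`.

```python
import re
from datetime import datetime, timezone

class FastTool:
    # Minimal stand-in for the base class in fast_tools.py, inferred
    # from the documented interface; the real class may differ.
    name: str

    def matches(self, message: str) -> bool:
        raise NotImplementedError

    def run(self, message: str) -> str:
        raise NotImplementedError

class TimeTool(FastTool):
    # Hypothetical example: fires on messages asking for the time.
    name = "time"

    def matches(self, message: str) -> bool:
        return bool(re.search(r"\bwhat time\b", message.lower()))

    def run(self, message: str) -> str:
        return f"Current UTC time: {datetime.now(timezone.utc).isoformat()}"

tool = TimeTool()
print(tool.matches("What time is it?"))  # True — router would force medium tier
print(tool.matches("Weather today?"))    # False — routing proceeds normally
```

Because `matches()` returning true forces the medium tier, keep match predicates narrow: a greedy regex would route messages to an LLM tier they don't need.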
## Architecture

@ARCHITECTURE.md