# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Start all services
docker compose up --build

# Interactive CLI (requires services running)
docker compose --profile tools run --rm -it cli

# Integration tests (run from tests/integration/, requires all services)
python3 test_health.py
python3 test_memory.py [--name-only|--bench-only|--dedup-only]
python3 test_routing.py [--easy-only|--medium-only|--hard-only]

# Use case tests: read the .md file and follow its steps as Claude Code
# e.g.: read tests/use_cases/weather_now.md and execute it
```

## Key Conventions

- **Models via Bifrost only**: all LLM calls use `base_url=BIFROST_URL` with `ollama/<model>` prefix. Never call Ollama directly for inference.
- **One inference at a time**: `_reply_semaphore` serializes GPU use. Do not bypass it.
- **No tools in medium agent**: `_DirectModel` is a plain `ainvoke()` call. Context is injected via system prompt. `qwen3:4b` is unreliable with tool schemas.
- **Fast tools are pre-flight**: `FastToolRunner` runs before routing and before any LLM call. Results are injected as context, not returned to the user directly.
- **Memory outside agent loop**: `add_memory`/`search_memory` are called directly, never passed to agent tool lists.
- **Complex tier is opt-in**: `/think ` prefix only. LLM classification of "complex" is always downgraded to medium.
- **`.env` is required**: `TELEGRAM_BOT_TOKEN`, `ROUTECHECK_TOKEN`, `YANDEX_ROUTING_KEY`. Never commit it.
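
The serialization and opt-in tier conventions above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `choose_tier`, `fake_inference`, `handle_message`, and `demo` are hypothetical names, and the local `reply_semaphore` only plays the role that `_reply_semaphore` has in the real agent.

```python
import asyncio

THINK_PREFIX = "/think "  # opt-in prefix from the conventions above


def choose_tier(message, classified):
    """Return the tier to use; 'complex' is reachable only via the prefix."""
    if message.startswith(THINK_PREFIX):
        return "complex"
    # A classifier verdict of "complex" is downgraded to medium.
    return "medium" if classified == "complex" else classified


async def fake_inference(prompt, log):
    """Hypothetical stand-in for a model call; records start/end order."""
    log.append("start:" + prompt)
    await asyncio.sleep(0.01)
    log.append("end:" + prompt)
    return prompt.upper()


async def handle_message(message, classified, semaphore, log):
    """Choose a tier, then run inference while holding the semaphore."""
    tier = choose_tier(message, classified)
    async with semaphore:
        reply = await fake_inference(message, log)
    return tier, reply


async def demo():
    # Plays the role of _reply_semaphore: at most one inference at a time.
    reply_semaphore = asyncio.Semaphore(1)
    log = []
    results = await asyncio.gather(
        handle_message("/think plan a trip", "medium", reply_semaphore, log),
        handle_message("hello", "complex", reply_semaphore, log),
    )
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(demo())
    print(results)
    print(log)
```

Even though both messages are handled concurrently, the log shows the second inference starting only after the first has finished, which is the behavior the semaphore is meant to guarantee.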

## Adding a Fast Tool

1. Subclass `FastTool` in `fast_tools.py` and implement `name`, `matches(message) → bool`, and `run(message) → str`.
2. Add an instance to the `_fast_tool_runner` list in `agent.py`.
3. The router automatically forces the medium tier when `matches()` returns `True`.
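
A fast tool following the steps above might look like the sketch below. The `FastTool` base class here is a hypothetical stand-in written so the example runs on its own; the real class in `fast_tools.py` may differ (for instance, `run` could be async or `name` a plain attribute). `ClockTool` and `preflight` are illustrative names that do not come from the repository.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timezone


class FastTool(ABC):
    """Hypothetical stand-in for the base class in fast_tools.py."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def matches(self, message: str) -> bool: ...

    @abstractmethod
    def run(self, message: str) -> str: ...


class ClockTool(FastTool):
    """Illustrative tool: supplies the current time as context."""

    @property
    def name(self) -> str:
        return "clock"

    def matches(self, message: str) -> bool:
        # Cheap keyword check; no LLM call is involved.
        return "time" in message.lower()

    def run(self, message: str) -> str:
        now = datetime.now(timezone.utc)
        return f"Current UTC time: {now:%Y-%m-%d %H:%M}"


def preflight(tools, message):
    """Collect context strings from every tool whose matches() is true."""
    return [f"[{t.name}] {t.run(message)}" for t in tools if t.matches(message)]
```

In this sketch the strings returned by `preflight` would be injected into the system prompt as context rather than sent to the user, mirroring the pre-flight convention described earlier.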

## Architecture

@ARCHITECTURE.md