Update docs: streaming, CLI container, use_cases tests

- /stream/{session_id} SSE endpoint replaces /reply/ for CLI
- Medium tier streams per-token via astream() with in_think filtering
- CLI now runs as Docker container (Dockerfile.cli, profile:tools)
- Correct medium model to qwen3:4b with real-time think block filtering
- Add use_cases/ test category to commands section
- Update files tree and services table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
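
Note on the "in_think filtering" mentioned in the commit message: the diff shown here touches only documentation, so the actual filter is not visible. The following is a minimal sketch of how such a filter could work, assuming the model wraps its reasoning in literal `<think>` and `</think>` tags and that a tag may be split across streamed chunks. The `ThinkFilter` class and its method names are hypothetical, not taken from the repository.

```python
class ThinkFilter:
    """Hypothetical helper: drop <think>...</think> spans from streamed chunks."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self) -> None:
        self.in_think = False  # True while inside a think block
        self.buf = ""          # held-back text that may be a partial tag

    def feed(self, chunk: str) -> str:
        """Consume one chunk and return the text that is safe to show."""
        self.buf += chunk
        out = []
        while self.buf:
            tag = self.CLOSE if self.in_think else self.OPEN
            idx = self.buf.find(tag)
            if idx != -1:
                # Full tag found: emit text before it (only outside a block),
                # then flip the state and keep scanning the remainder.
                if not self.in_think:
                    out.append(self.buf[:idx])
                self.buf = self.buf[idx + len(tag):]
                self.in_think = not self.in_think
                continue
            # No full tag: hold back a tail that could be the start of one.
            keep = 0
            for n in range(min(len(tag) - 1, len(self.buf)), 0, -1):
                if self.buf.endswith(tag[:n]):
                    keep = n
                    break
            cut = len(self.buf) - keep
            if not self.in_think:
                out.append(self.buf[:cut])
            self.buf = self.buf[cut:]
            break
        return "".join(out)

    def flush(self) -> str:
        """Call at end of stream: release held-back text unless inside a block."""
        tail = "" if self.in_think else self.buf
        self.buf = ""
        return tail


f = ThinkFilter()
chunks = ["Hel", "lo<th", "ink>secret</th", "ink> world"]
visible = "".join(f.feed(c) for c in chunks) + f.flush()
print(visible)
```

Holding back a possible partial tag costs at most a few characters of latency and avoids leaking a fragment such as `<th` to the client when a tag straddles two chunks.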
@@ -18,7 +18,8 @@ Autonomous personal assistant with a multi-channel gateway. Three-tier model rou
 │ │ │ │
 │ │ POST /message │ ← all inbound │
 │ │ POST /chat (legacy) │ │
-│ │ GET /reply/{id} SSE │ ← CLI polling │
+│ │ GET /stream/{id} SSE │ ← token stream│
+│ │ GET /reply/{id} SSE │ ← legacy poll │
 │ │ GET /health │ │
 │ │ │ │
 │ │ channels.py registry │ │
@@ -42,7 +43,7 @@ Autonomous personal assistant with a multi-channel gateway. Three-tier model rou
 | Channel | session_id | Inbound | Outbound |
 |---------|-----------|---------|---------|
 | Telegram | `tg-<chat_id>` | Grammy long-poll → POST /message | channels.py → POST grammy:3001/send |
-| CLI | `cli-<user>` | POST /message directly | GET /reply/{id} SSE stream |
+| CLI | `cli-<user>` | POST /message directly | GET /stream/{id} SSE — Rich Live streaming |
 | Voice | `voice-<device>` | (future) | (future) |
 
 ## Unified Message Flow
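
Note on the CLI row in the channel table: the CLI now reads its reply from `GET /stream/{id}` as Server-Sent Events. The wire format is not shown in this diff, so the sketch below assumes each chunk arrives as a standard `data:` event and that the end of a reply is marked by a `data: [DONE]` event. The function name `iter_sse_data` is hypothetical; it takes any iterable of text lines, so it can be fed from whichever HTTP client the CLI uses.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str], done: str = "[DONE]") -> Iterator[str]:
    """Yield the data payload of each SSE event until the sentinel is seen.

    Assumption: the server terminates a reply with a `data: [DONE]` event.
    """
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                payload = "\n".join(data)
                data = []
                if payload == done:
                    return
                yield payload
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # Per the SSE format, one leading space after the colon is dropped.
            data.append(value[1:] if value.startswith(" ") else value)


sample = [
    "data: Hel\n", "\n",
    ": keep-alive\n",
    "data: lo\n", "\n",
    "data: [DONE]\n", "\n",
]
print("".join(iter_sse_data(sample)))
```

Keeping the parser separate from the HTTP layer makes it easy to unit-test the stream handling without a running gateway.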
@@ -58,11 +59,13 @@ Autonomous personal assistant with a multi-channel gateway. Three-tier model rou
 6. router.route() with enriched history (url_context + memories as system msgs)
    - if URL content fetched and tier=light → upgrade to medium
 7. Invoke agent for tier with url_context + memories in system prompt
-8. channels.deliver(session_id, channel, reply_text)
-   - always puts reply in pending_replies[session_id] queue (for SSE)
-   - calls channel-specific send callback
-9. _store_memory() background task — stores turn in openmemory
-10. GET /reply/{session_id} SSE clients receive the reply
+8. Token streaming:
+   - medium: astream() pushes per-token chunks to _stream_queues[session_id]; <think> blocks filtered in real time
+   - light/complex: full reply pushed as single chunk after completion
+   - _end_stream() sends [DONE] sentinel
+9. channels.deliver(session_id, channel, reply_text) — Telegram callback
+10. _store_memory() background task — stores turn in openmemory
+11. GET /stream/{session_id} SSE clients receive chunks; CLI renders with Rich Live + final Markdown
 ```
 
 ## Tool Handling
@@ -132,15 +135,19 @@ Conversation history is keyed by session_id (5-turn buffer).
 
 ```
 adolf/
-├── docker-compose.yml Services: bifrost, deepagents, openmemory, grammy, crawl4ai
+├── docker-compose.yml Services: bifrost, deepagents, openmemory, grammy, crawl4ai, cli (profile:tools)
 ├── Dockerfile deepagents container (Python 3.12)
-├── agent.py FastAPI gateway, run_agent_task, Crawl4AI pre-fetch, memory pipeline
+├── Dockerfile.cli CLI container (python:3.12-slim + rich)
+├── agent.py FastAPI gateway, run_agent_task, Crawl4AI pre-fetch, memory pipeline, /stream/ SSE
 ├── channels.py Channel registry + deliver() + pending_replies
 ├── router.py Router class — regex + LLM tier classification
 ├── vram_manager.py VRAMManager — flush/prewarm/poll Ollama VRAM
 ├── agent_factory.py _DirectModel (medium) / create_deep_agent (complex)
-├── cli.py Interactive CLI REPL client
+├── cli.py Interactive CLI REPL — Rich Live streaming + Markdown render
 ├── wiki_research.py Batch wiki research pipeline (uses /message + SSE)
+├── tests/
+│ ├── integration/ Standalone integration test scripts (common.py + test_*.py)
+│ └── use_cases/ Claude Code skill markdown files — Claude acts as user + evaluator
 ├── .env TELEGRAM_BOT_TOKEN (not committed)
 ├── openmemory/
 │ ├── server.py FastMCP + mem0: add_memory, search_memory, get_all_memories