adolf

9 Commits 3 Branches 0 Tags

Author	SHA1	Message	Date
Alvis	a35ba83db7	Add use_cases test category with CLI startup test tests/use_cases/ holds scenario-driven tests run by the Claude Code agent, which acts as both the test runner and mock user. Each test prints a structured transcript; Claude evaluates correctness. First test: test_cli_startup.py — spawns cli.py with a subprocess, reads the welcome banner, sends EOF, and verifies exit code 0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:10:04 +00:00
Alvis	021104f510	Split monolithic test_pipeline.py into focused integration test scripts - common.py: shared config, URL constants, benchmark questions, all helpers (get, post_json, check_sse, qdrant_count, fetch_logs, parse_run_block, wait_for, etc.) - test_health.py: service health checks (deepagents, bifrost, GPU/CPU Ollama, Qdrant, SearXNG) - test_memory.py: name store/recall pipeline, memory benchmark (5 facts + 10 recalls), dedup test - test_routing.py: easy/medium/hard tier routing benchmarks with --easy/medium/hard-only flags - Removed test_pipeline.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:02:57 +00:00
Alvis	50097d6092	Embed Crawl4AI at all tiers, restore qwen3:4b medium, update docs - Pre-routing URL fetch: any message with URLs gets content fetched async (httpx.AsyncClient) before routing via _fetch_urls_from_message() - URL context and memories gathered concurrently with asyncio.gather - Light tier upgraded to medium when URL content is present - url_context injected into system prompt for medium and complex agents - Complex agent retains web_search/fetch_url tools + receives pre-fetched content - Medium model restored to qwen3:4b (was temporarily qwen2.5:1.5b) - Unit tests added for _extract_urls - ARCHITECTURE.md: added Tool Handling, Crawl4AI Integration, Memory Pipeline sections - CLAUDE.md: updated request flow and Crawl4AI integration docs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 15:49:34 +00:00
Alvis	f9618a9bbf	Integrate Bifrost LLM gateway, add test suite, implement memory pipeline - Add Bifrost (maximhq/bifrost) as LLM gateway: all inference routes through bifrost:8080/v1 with retry logic and observability; VRAMManager keeps direct Ollama access for VRAM flush/prewarm operations - Switch medium model from qwen3:4b to qwen2.5:1.5b (direct call, no tools) via _DirectModel wrapper; complex keeps create_deep_agent with qwen3:8b - Implement out-of-agent memory pipeline: _retrieve_memories pre-fetches relevant context (injected into all tiers), _store_memory runs as background task after each reply writing to openmemory/Qdrant - Add tests/unit/ with 133 tests covering router, channels, vram_manager, agent helpers; move integration test to tests/integration/ - Add bifrost-config.json with GPU Ollama (qwen2.5:0.5b/1.5b, qwen3:4b/8b, gemma3:4b) and CPU Ollama providers - Integration test 28/29 pass (only grammy fails — no TELEGRAM_BOT_TOKEN) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 13:50:12 +00:00
Alvis	ec45d255f0	wiki search people tested pipeline	2026-03-05 11:22:34 +00:00
Alvis	ea77b2308b	Add three-tier model routing with VRAM management and benchmark suite - Three-tier routing: light (router answers directly ~3s), medium (qwen3:4b + tools ~60s), complex (/think prefix → qwen3:8b + subagents ~140s) - Router: qwen2.5:1.5b, temp=0, regex pre-classifier + raw-text LLM classify - VRAMManager: explicit flush/poll/prewarm to prevent Ollama CPU-spill bug - agent_factory: build_medium_agent and build_complex_agent using deepagents (TodoListMiddleware + SubAgentMiddleware with research/memory subagents) - Fix: split Telegram replies >4000 chars into multiple messages - Benchmark: 30 questions (easy/medium/hard) — 10/10/10 verified passing easy→light, medium→medium, hard→complex with VRAM flush confirmed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-28 17:54:51 +00:00
Alvis	1718d70203	Fix system prompt: agent now correctly handles memory requests - Tell agent that memory is saved automatically after every reply - Instruct agent to never say it cannot store information - Instruct agent to acknowledge and confirm when user asks to remember something - Fix misleading startup log (gemma3:1b → qwen2.5:1.5b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:22:08 +00:00
Alvis	19e2c27976	Switch extraction model to qwen2.5:1.5b, fix mem0migrations dims, update tests - openmemory: use qwen2.5:1.5b instead of gemma3:1b for fact extraction - test_pipeline.py: check qwen2.5:1.5b, fix SSE checks, fix Qdrant payload parsing, relax SearXNG threshold to 5s, improve marker word test - potential-directions.md: ranked CPU extraction model candidates - Root cause: mem0migrations collection had stale 1536-dim vectors causing silent dedup failures; recreate both collections at 768 dims All 18 pipeline tests now pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:11:29 +00:00
Alvis	66ab93aa37	Add Adolf architecture doc and integration test script - ARCHITECTURE.md: comprehensive pipeline description (copied from Gitea wiki) - test_pipeline.py: tests all services, memory, async timing, and recall Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 04:52:40 +00:00