docs: ADR-0014 — unified Profile model + agent registry
Propose a shared substrate for per-user prefs, contexts, per-key consents, and per-agent state so adding an agent stays a manifest change.

Updates CLAUDE.md, README, and architecture docs to reflect the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
32
CLAUDE.md
@@ -78,7 +78,7 @@ docs/ architecture notes, ADRs, API specs
## AI stack
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM assembles them into one tip. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
| Alias | Model | Used by |
|-------|-------|---------|
@@ -90,33 +90,37 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
**LLM tip generation pipeline:**
1. `ml/features/context.py` assembles user signals → structured prompt context
2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]`
3. Bandit policy in `ml/serving` scores + ranks candidates
4. Best candidate returned as tip; reaction closes the online reward loop
**Multi-agent tip generation pipeline (ADR-0013):**
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
3. `POST /recommend` in `ml/serving` assembles the orchestrator prompt (`v4-orchestrator`) and calls LiteLLM via the `tip-generator` alias
4. Returned tip is logged in `tip_scores` with the contributing agent set; reaction is logged for observability (no bandit reward loop)
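Step 2 of the multi-agent pipeline (pick the freshest non-expired snippet per eligible agent) can be sketched as below. This is a minimal illustration, not the project's code: the `AgentOutput` row shape, the helper names, and the prompt layout are all assumptions, and the real `agent_outputs` schema and `v4-orchestrator` prompt may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentOutput:
    """Hypothetical row shape for `agent_outputs`; the real schema may differ."""
    agent_id: str
    snippet: str
    created_at: datetime
    ttl: timedelta


def freshest_snippets(rows, eligible, now):
    """Return the newest non-expired row per eligible agent id."""
    best = {}
    for row in rows:
        if row.agent_id not in eligible:
            continue  # agent is not in the registry-provided eligible set
        if row.created_at + row.ttl <= now:
            continue  # expired under this agent's TTL
        current = best.get(row.agent_id)
        if current is None or row.created_at > current.created_at:
            best[row.agent_id] = row
    return best


def assemble_prompt(snippets):
    """Join snippets in a stable order; this layout is illustrative only."""
    return "\n".join(
        f"[{agent_id}] {snippets[agent_id].snippet}" for agent_id in sorted(snippets)
    )
```

An agent whose only snippet has expired simply drops out of the prompt, so a stalled pre-compute job narrows the orchestrator's input without failing the request.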
## Current phase
**M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
Recent completions (M1 add-on):
- ADR-0012 — ε-greedy v2 promotion (profile features, D=12) — 2026-04-26
- Offline sim framework + MLflow integration — shipped in M1 add-on
- Token-based admin auth for Playwright/CI — secured auth boundary
Recent completions:
- ADR-0013 — multi-agent recommendation: pre-computed agent snippets + orchestrator LLM (replaces ε-greedy bandit) — 2026-05-01
- LLM context assembler + tip generation scaffold (#79, #88)
- Model benchmarking for tip generation (#93, #95)
- Admin UX refinements: feedback consolidation, settings placement (#100–102)
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
Active work (M2):
- ADR-0014 (proposed) — unified Profile model + agent registry + inference framework
- Unified Profile model: prefs, contexts, consents, registry plumbing, orchestrator cutover (#30)
- Shared context-inference framework for agents (#111)
- Per-agent auto-inference: time-of-day (#112), focus-area (#113), momentum (#114), overdue-task (#115), recent-patterns (#116)
- Signal abstraction for multi-source support (#78)
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- LLM context assembler + tip generation scaffold (#79, #88)
- Model benchmarking for tip generation (#93)
- Admin UX refinements: feedback consolidation, settings placement (#100–102)
## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract.
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (multi-agent orchestrator today, future LLM/hybrid variants), keep contract.
- Don't hardcode the agent list. The orchestrator is registry-driven (ADR-0014); adding/removing an agent is a manifest change in `ml/agents/<id>/`, never a recommender edit.
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
- Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.
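To illustrate the "don't hardcode the agent list" rule, here is a rough sketch of registry discovery driven purely by the filesystem. It assumes each `ml/agents/<id>/` directory carries a `manifest.json` with an optional `enabled` flag. The file name, the flag, and `load_registry` itself are hypothetical, since ADR-0014 is still proposed and may define the manifest differently.

```python
import json
from pathlib import Path


def load_registry(agents_dir):
    """Build {agent_id: manifest} from every <agents_dir>/<id>/manifest.json."""
    registry = {}
    for manifest_path in sorted(Path(agents_dir).glob("*/manifest.json")):
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        # Assumed convention: the directory name is the agent id.
        agent_id = manifest_path.parent.name
        # Assumed field: agents are enabled unless the manifest says otherwise.
        if manifest.get("enabled", True):
            registry[agent_id] = manifest
    return registry
```

Under this shape, adding an agent means dropping a new directory with a manifest into `ml/agents/`, and the caller (for example `load_registry("ml/agents")`) picks it up without any change to the recommender.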