docs: ADR-0014 — unified Profile model + agent registry

Propose a shared substrate for per-user prefs, contexts, per-key consents, and per-agent state so adding an agent stays a manifest change. Updates CLAUDE.md, README, and architecture docs to reflect the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:19:07 +00:00
parent 41302d9f36
commit d454a0a8bf
7 changed files with 343 additions and 52 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -78,7 +78,7 @@ docs/              architecture notes, ADRs, API specs

 ## AI stack

-oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
+oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM assembles them into one tip. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.

 | Alias | Model | Used by |
 |-------|-------|---------|
@@ -90,33 +90,37 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos

 Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.

-**LLM tip generation pipeline:**
-1. `ml/features/context.py` assembles user signals → structured prompt context
-2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]`
-3. Bandit policy in `ml/serving` scores + ranks candidates
-4. Best candidate returned as tip; reaction closes the online reward loop
+**Multi-agent tip generation pipeline (ADR-0013):**
+1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
+2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
+3. `POST /recommend` in `ml/serving` assembles the orchestrator prompt (`v4-orchestrator`) and calls LiteLLM via the `tip-generator` alias
+4. Returned tip is logged in `tip_scores` with the contributing agent set; reaction is logged for observability (no bandit reward loop)

 ## Current phase

 **M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.

-Recent completions (M1 add-on):
-- ADR-0012 — ε-greedy v2 promotion (profile features, D=12) — 2026-04-26
-- Offline sim framework + MLflow integration — shipped in M1 add-on
-- Token-based admin auth for Playwright/CI — secured auth boundary
+Recent completions:
+- ADR-0013 — multi-agent recommendation: pre-computed agent snippets + orchestrator LLM (replaces ε-greedy bandit) — 2026-05-01
+- LLM context assembler + tip generation scaffold (#79, #88)
+- Model benchmarking for tip generation (#93, #95)
+- Admin UX refinements: feedback consolidation, settings placement (#100–102)
+- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)

 Active work (M2):
+- ADR-0014 (proposed) — unified Profile model + agent registry + inference framework
+- Unified Profile model: prefs, contexts, consents, registry plumbing, orchestrator cutover (#30)
+- Shared context-inference framework for agents (#111)
+- Per-agent auto-inference: time-of-day (#112), focus-area (#113), momentum (#114), overdue-task (#115), recent-patterns (#116)
 - Signal abstraction for multi-source support (#78)
 - Per-user feature freshness SLAs (#61, ADR-0011 phase B)
-- LLM context assembler + tip generation scaffold (#79, #88)
-- Model benchmarking for tip generation (#93)
-- Admin UX refinements: feedback consolidation, settings placement (#100–102)

 ## What NOT to do

 - Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
 - Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
-- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract.
+- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (multi-agent orchestrator today, future LLM/hybrid variants), keep contract.
+- Don't hardcode the agent list. The orchestrator is registry-driven (ADR-0014); adding/removing an agent is a manifest change in `ml/agents/<id>/`, never a recommender edit.
 - Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
 - Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
 - Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.
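
Reviewer note on step 2 of the new pipeline ("loads the eligible agent set ... and pulls the freshest non-expired snippets"): the selection rule might look roughly like the sketch below. It is illustrative only. The `AgentOutput` fields, the `freshest_snippets` helper, and the TTL values are assumptions, not the actual `agent_outputs` schema or the recommender's code (which the diff places in TS; Python is used here purely for illustration). The agent ids are borrowed from the issue list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentOutput:
    """Hypothetical row shape for `agent_outputs`; the real schema may differ."""
    agent_id: str
    snippet: str
    created_at: datetime
    ttl: timedelta  # per-agent TTL, as described in step 1

    def is_expired(self, now: datetime) -> bool:
        return now >= self.created_at + self.ttl


def freshest_snippets(rows, eligible_agents, now):
    """Return {agent_id: snippet}, keeping the newest non-expired row per eligible agent."""
    best = {}
    for row in rows:
        # Skip agents outside the registry-provided set, and rows past their TTL.
        if row.agent_id not in eligible_agents or row.is_expired(now):
            continue
        current = best.get(row.agent_id)
        if current is None or row.created_at > current.created_at:
            best[row.agent_id] = row
    return {agent_id: row.snippet for agent_id, row in best.items()}


now = datetime(2026, 5, 5, 10, 0, tzinfo=timezone.utc)
rows = [
    AgentOutput("time-of-day", "morning focus window", now - timedelta(minutes=10), timedelta(hours=1)),
    AgentOutput("time-of-day", "older morning note", now - timedelta(minutes=50), timedelta(hours=1)),
    AgentOutput("momentum", "3-day streak", now - timedelta(hours=7), timedelta(hours=6)),  # past its TTL
    AgentOutput("unregistered", "ignored", now - timedelta(minutes=1), timedelta(hours=1)),
]
eligible = {"time-of-day", "momentum"}  # would come from the registry, never a hardcoded list
selected = freshest_snippets(rows, eligible, now)
print(selected)
```

Passing the eligible set in as an argument keeps the selection logic unaware of which agents exist, which is the property the "Don't hardcode the agent list" rule asks for: an agent whose snippet has expired simply drops out of the orchestrator prompt.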