docs: ADR-0014 — unified Profile model + agent registry

Propose a shared substrate for per-user prefs, contexts, per-key
consents, and per-agent state so adding an agent stays a manifest
change. Updates CLAUDE.md, README, and architecture docs to reflect
the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-05 10:19:07 +00:00
parent 41302d9f36
commit d454a0a8bf
7 changed files with 343 additions and 52 deletions

View File

@@ -78,7 +78,7 @@ docs/ architecture notes, ADRs, API specs
## AI stack
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM assembles them into one tip. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
| Alias | Model | Used by |
|-------|-------|---------|
@@ -90,33 +90,37 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
**LLM tip generation pipeline:**
1. `ml/features/context.py` assembles user signals → structured prompt context
2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]`
3. Bandit policy in `ml/serving` scores + ranks candidates
4. Best candidate returned as tip; reaction closes the online reward loop
**Multi-agent tip generation pipeline (ADR-0013):**
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
3. `POST /recommend` in `ml/serving` assembles the orchestrator prompt (`v4-orchestrator`) and calls LiteLLM via the `tip-generator` alias
4. Returned tip is logged in `tip_scores` with the contributing agent set; reaction is logged for observability (no bandit reward loop)
## Current phase
**M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
Recent completions (M1 add-on):
- ADR-0012ε-greedy v2 promotion (profile features, D=12) — 2026-04-26
- Offline sim framework + MLflow integration — shipped in M1 add-on
- Token-based admin auth for Playwright/CI — secured auth boundary
Recent completions:
- ADR-0013multi-agent recommendation: pre-computed agent snippets + orchestrator LLM (replaces ε-greedy bandit) — 2026-05-01
- LLM context assembler + tip generation scaffold (#79, #88)
- Model benchmarking for tip generation (#93, #95)
- Admin UX refinements: feedback consolidation, settings placement (#100102)
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
Active work (M2):
- ADR-0014 (proposed) — unified Profile model + agent registry + inference framework
- Unified Profile model: prefs, contexts, consents, registry plumbing, orchestrator cutover (#30)
- Shared context-inference framework for agents (#111)
- Per-agent auto-inference: time-of-day (#112), focus-area (#113), momentum (#114), overdue-task (#115), recent-patterns (#116)
- Signal abstraction for multi-source support (#78)
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
- LLM context assembler + tip generation scaffold (#79, #88)
- Model benchmarking for tip generation (#93)
- Admin UX refinements: feedback consolidation, settings placement (#100102)
## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract.
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (multi-agent orchestrator today, future LLM/hybrid variants), keep contract.
- Don't hardcode the agent list. The orchestrator is registry-driven (ADR-0014); adding/removing an agent is a manifest change in `ml/agents/<id>/`, never a recommender edit.
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
- Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.