docs: ADR-0014 — unified Profile model + agent registry

Propose a shared substrate for per-user prefs, contexts, per-key consents, and per-agent state so adding an agent stays a manifest change.

Updates CLAUDE.md, README, and architecture docs to reflect the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:19:07 +00:00
parent 41302d9f36
commit d454a0a8bf
7 changed files with 343 additions and 52 deletions
--- a/README.md
+++ b/README.md
@@ -69,7 +69,7 @@ docs/        architecture, adr, api

 ## AI stack

-oO is AI-native: the recommender's job is to **rank**, not to write. An LLM generates candidate tips from the user's context; the bandit picks the best one.
+oO is AI-native. Domain-specialized agents pre-compute snippets describing the user's state from one angle each; an orchestrator LLM reasons over the assembled snippets and produces one tip (ADR-0013). The orchestrator iterates a registry, not a hardcoded list (ADR-0014) — adding an agent is a manifest change, nothing else.

 ### Three-tier layout

@@ -79,25 +79,28 @@ oO is AI-native: the recommender's job is to **rank**, not to write. An LLM gene
 | Routing | **LiteLLM** | Unified OpenAI-compatible API; model aliases; cloud fallback | `llm.alogins.net` (Agap shared) |
 | Testing | **OpenWebUI** | Prompt iteration, model comparison, manual evals | `ai.alogins.net` (Agap shared) |

-### Tip generation pipeline (Phase 2 target)
+### Tip generation pipeline (ADR-0013, M2)

 ```
-User signals  ──▶  Context assembler  ──▶  LiteLLM  ──▶  Ollama (local)
-(tasks, calendar,    (ml/features/)         (routing)     or cloud fallback
- patterns, time)
+User signals          Pre-compute agents (every 15 min)
+(tasks, calendar,  ──▶ ml/agents/{overdue-task, momentum,        ──▶  agent_outputs
+ patterns, time)        time-of-day, recent-patterns,                 (per-agent TTL)
+                        focus-area, ...}
+                                                                            │
+                              Eligibility filter: required consents +       │
+                              active context + per-user prefs (ADR-0014) ◀──┘
                                                ▼
-                                     N typed TipCandidates
-                                     {content, kind, model,
-                                      prompt_version, confidence}
+                                  Orchestrator prompt (`v4-orchestrator`)
+                                  = global prefs + active context + snippets
                                                ▼
-                                    Bandit policy (ml/serving)
-                                    scores + ranks candidates
+                                    LiteLLM ──▶ Ollama (local) / cloud fallback
                                                ▼
-                                         Best tip shown
+                                         Tip shown to user
                                                ▼
                              User reaction (done / snooze / dismiss + dwell)
                                                ▼
-                              Online bandit update + prompt_version tracking
+                              Logged to tip_feedback for observability
+                              (no online ML reward loop — see ADR-0013)
 ```

 **Why LiteLLM as gateway:**  All LLM calls use a single `LITELLM_URL` env var. Swapping from qwen2.5 to llama3.2, or routing a fraction to Claude for A/B, is a config change in LiteLLM — zero code change in oO. The model name in `tip_scores` tells you exactly which model produced each tip.
@@ -194,6 +197,20 @@ oO is ML-heavy. Without a cockpit, every model change ships blind. This console
 ### Phase 2 — AI tips + multi-source signals  *(M2)* in progress
 Goal: tips are AI-generated from user context, not just raw Todoist tasks. Multiple signal sources feed a generalized pipeline. Research-intensive milestone.

+**Architectural shift (mid-M2):** the bandit-ranks-LLM-candidates design from earlier in M2 was replaced with a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip directly. ADR-0014 layers a unified Profile + agent registry + auto-inference framework on top so the system generalizes cleanly to N agents.
+
+**Multi-agent recommendation (ADR-0013, shipped):**
+- [x] `agent_outputs` table + per-agent TTL caching
+- [x] Five initial agents: `overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`
+- [x] Agent pre-compute scheduler
+- [x] Orchestrator cutover — recommender calls `ml/serving` with snippet list, no bandit scoring
+- [x] Bandit endpoints + shadow policy machinery removed
+
+**Unified Profile + agent registry (ADR-0014, in progress):**
+- [ ] Unified Profile model: prefs, contexts, consents + manifest plumbing + orchestrator cutover (#30)
+- [ ] Shared context-inference framework (#111)
+- [ ] Per-agent auto-inference: `time-of-day` (#112), `focus-area` (#113), `momentum` (#114), `overdue-task` (#115), `recent-patterns` (#116)
+
 **AI infrastructure (unblock everything else):**
 - [ ] `ai` compose profile — Ollama + LiteLLM for local dev; env vars `OLLAMA_URL` / `LITELLM_URL` (#86)
 - [ ] AI gateway — wire `ml/serving` to LiteLLM; model aliases `tip-generator` + `embedder` (#87)
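
The README hunks above describe an orchestrator that iterates a registry and an eligibility filter built from required consents, active context, and per-user prefs, with per-agent TTLs on `agent_outputs`. The following is a minimal Python sketch of how those pieces could fit together, not the code in this repository. The agent names come from the diff. Everything else is an assumption made for illustration: the `AgentManifest`, `Profile`, and `AgentOutput` types, their field names, the consent keys (`"tasks"`, `"calendar"`), and the TTL values.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentManifest:
    """Hypothetical manifest entry; field names are assumptions."""
    name: str
    ttl: timedelta                           # how long a cached snippet stays fresh
    required_consents: frozenset = frozenset()
    contexts: frozenset = frozenset()        # empty means "any context"


@dataclass
class Profile:
    """Hypothetical slice of the unified Profile (prefs, contexts, consents)."""
    consents: set
    active_context: str
    disabled_agents: set = field(default_factory=set)  # stands in for per-user prefs


@dataclass
class AgentOutput:
    """Hypothetical row shape for the agent_outputs table."""
    agent: str
    snippet: str
    computed_at: datetime


# Adding an agent means adding one entry here; the loop below does not change.
REGISTRY = [
    AgentManifest("overdue-task", timedelta(minutes=15), frozenset({"tasks"})),
    AgentManifest("momentum", timedelta(minutes=30), frozenset({"tasks"})),
    AgentManifest("time-of-day", timedelta(minutes=15)),
    AgentManifest("focus-area", timedelta(hours=1),
                  frozenset({"calendar"}), frozenset({"work"})),
]


def is_eligible(manifest: AgentManifest, profile: Profile) -> bool:
    """Eligibility filter: per-user prefs, required consents, active context."""
    if manifest.name in profile.disabled_agents:
        return False
    if not manifest.required_consents <= profile.consents:
        return False
    return not manifest.contexts or profile.active_context in manifest.contexts


def assemble_snippets(registry, profile, outputs, now):
    """Collect fresh snippets from eligible agents, in registry order."""
    snippets = []
    for manifest in registry:
        out = outputs.get(manifest.name)
        if out is None or not is_eligible(manifest, profile):
            continue
        if now - out.computed_at > manifest.ttl:
            continue  # cached output is older than the agent's TTL
        snippets.append(out.snippet)
    return snippets


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    profile = Profile(consents={"tasks"}, active_context="work")
    outputs = {
        "overdue-task": AgentOutput("overdue-task", "3 tasks overdue", now),
        "time-of-day": AgentOutput("time-of-day", "morning", now),
    }
    print(assemble_snippets(REGISTRY, profile, outputs, now))
```

In this sketch the orchestrator never names an agent directly, which is the property the "manifest change, nothing else" claim depends on. A real implementation would presumably load manifests from files and read outputs from the database rather than from in-memory literals.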