docs: ADR-0014 — unified Profile model + agent registry

Propose a shared substrate for per-user prefs, contexts, per-key
consents, and per-agent state so adding an agent stays a manifest
change. Updates CLAUDE.md, README, and architecture docs to reflect
the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-05 10:19:07 +00:00
parent 41302d9f36
commit d454a0a8bf
7 changed files with 343 additions and 52 deletions

View File

@@ -69,7 +69,7 @@ docs/ architecture, adr, api
## AI stack
oO is AI-native: the recommender's job is to **rank**, not to write. An LLM generates candidate tips from the user's context; the bandit picks the best one.
oO is AI-native. Domain-specialized agents pre-compute snippets describing the user's state from one angle each; an orchestrator LLM reasons over the assembled snippets and produces one tip (ADR-0013). The orchestrator iterates a registry, not a hardcoded list (ADR-0014) — adding an agent is a manifest change, nothing else.
### Three-tier layout
@@ -79,25 +79,28 @@ oO is AI-native: the recommender's job is to **rank**, not to write. An LLM gene
| Routing | **LiteLLM** | Unified OpenAI-compatible API; model aliases; cloud fallback | `llm.alogins.net` (Agap shared) |
| Testing | **OpenWebUI** | Prompt iteration, model comparison, manual evals | `ai.alogins.net` (Agap shared) |
### Tip generation pipeline (Phase 2 target)
### Tip generation pipeline (ADR-0013, M2)
```
User signals ──▶ Context assembler ──▶ LiteLLM ──▶ Ollama (local)
(tasks, calendar, (ml/features/) (routing) or cloud fallback
patterns, time)
User signals Pre-compute agents (every 15 min)
(tasks, calendar, ──▶ ml/agents/{overdue-task, momentum, ──▶ agent_outputs
patterns, time) time-of-day, recent-patterns, (per-agent TTL)
focus-area, ...}
Eligibility filter: required consents + │
active context + per-user prefs (ADR-0014) ◀──┘
N typed TipCandidates
{content, kind, model,
prompt_version, confidence}
Orchestrator prompt (`v4-orchestrator`)
= global prefs + active context + snippets
Bandit policy (ml/serving)
scores + ranks candidates
LiteLLM ──▶ Ollama (local) / cloud fallback
Best tip shown
Tip shown to user
User reaction (done / snooze / dismiss + dwell)
Online bandit update + prompt_version tracking
Logged to tip_feedback for observability
(no online ML reward loop — see ADR-0013)
```
**Why LiteLLM as gateway:** All LLM calls use a single `LITELLM_URL` env var. Swapping from qwen2.5 to llama3.2, or routing a fraction to Claude for A/B, is a config change in LiteLLM — zero code change in oO. The model name in `tip_scores` tells you exactly which model produced each tip.
@@ -194,6 +197,20 @@ oO is ML-heavy. Without a cockpit, every model change ships blind. This console
### Phase 2 — AI tips + multi-source signals *(M2)* in progress
Goal: tips are AI-generated from user context, not just raw Todoist tasks. Multiple signal sources feed a generalized pipeline. Research-intensive milestone.
**Architectural shift (mid-M2):** the bandit-ranks-LLM-candidates design from earlier in M2 was replaced with a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip directly. ADR-0014 layers a unified Profile + agent registry + auto-inference framework on top so the system generalizes cleanly to N agents.
**Multi-agent recommendation (ADR-0013, shipped):**
- [x] `agent_outputs` table + per-agent TTL caching
- [x] Five initial agents: `overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`
- [x] Agent pre-compute scheduler
- [x] Orchestrator cutover — recommender calls `ml/serving` with snippet list, no bandit scoring
- [x] Bandit endpoints + shadow policy machinery removed
**Unified Profile + agent registry (ADR-0014, in progress):**
- [ ] Unified Profile model: prefs, contexts, consents + manifest plumbing + orchestrator cutover (#30)
- [ ] Shared context-inference framework (#111)
- [ ] Per-agent auto-inference: `time-of-day` (#112), `focus-area` (#113), `momentum` (#114), `overdue-task` (#115), `recent-patterns` (#116)
**AI infrastructure (unblock everything else):**
- [ ] `ai` compose profile — Ollama + LiteLLM for local dev; env vars `OLLAMA_URL` / `LITELLM_URL` (#86)
- [ ] AI gateway — wire `ml/serving` to LiteLLM; model aliases `tip-generator` + `embedder` (#87)