# ADR-0014: Unified Profile model + agent registry

**Status:** Proposed
**Date:** 2026-05-05
**Issues:** #30, #111, #112, #113, #114, #115, #116
**Supersedes (data model):** ADR-0013 (the agent set stands; this ADR replaces the implicit assumption that prefs/contexts/consents are hardcoded on `users`).

## Context

ADR-0013 introduced the multi-agent pipeline: N pre-compute agents emit prompt snippets, and an orchestrator LLM assembles them into a tip. That ADR specified the `agent_outputs` table and the orchestrator contract, but left several questions open:

1. **Where do user preferences live?** `users.consentGiven` is a single boolean. There is no place for quiet hours, tone, allowed tip kinds, or per-integration consent. Each new preference would mean another typed column on `users`. Worse, every new agent needs its own tunable parameters (focus areas, momentum baseline, lateness tolerance) that are clearly per-agent state, not global user state.
2. **How are agents discovered?** The orchestrator currently iterates a hardcoded list. Adding an agent means touching three places: the recommender, the admin UI, and the prefs schema.
3. **How does context (work / home / vacation) interact with agents?** Some agents should be silenced in some contexts. There is no model for this.
4. **How is per-user agent configuration learned?** Issues #112–#116 each want to auto-infer parameters (quiet hours, focus areas, etc.) from history. Without a shared substrate they each reinvent storage, recompute cadence, and cold-start fallback.

The current ADR-0013 design works for five agents. It will not work for twenty without becoming a tangle.

## Decision

Three changes, designed to compose:

### 1. Agents are plugins with declared schemas

Every agent ships a manifest (Python, lives next to its code in `ml/agents//manifest.py`):

```python
class AgentManifest:
    id: str                                  # 'time-of-day'
    version: str                             # bump invalidates cached outputs + inferences
    pref_schema: dict                        # JSON Schema for user-tunable knobs
    context_schema: list[str]                # signals it reads, e.g. ['todoist.tasks']
    required_consents: list[str]             # ['data:todoist', 'agent:time-of-day']
    output_contract: dict                    # snippet shape (free text + optional tags)
    ttl_sec: int                             # snippet freshness for agent_outputs
    inferred_params: list["InferredParam"]   # see §3
```

The manifest is the **single point of registration**. The orchestrator, admin UI, and inference framework all read from it. Adding an agent is adding one directory in `ml/agents/`; no edits elsewhere.

A `GET /api/agents/registry` endpoint (TS recommender → Python proxy) exposes manifests so the admin app can auto-render configuration UI from each `pref_schema`.

### 2. Unified Profile data model

Three new tables replace the implicit "fields-on-users" pattern. `users.consentGiven` collapses into `user_consents` (one row, `consent_key='data:core'`); existing data migrates in a single backfill.

```sql
-- Hybrid: typed columns where stable, KV where open-ended.

-- Stable globals stay on users (added in this ADR):
ALTER TABLE users ADD COLUMN tone TEXT;            -- 'direct'|'gentle'|'motivational'
ALTER TABLE users ADD COLUMN tip_kinds_json TEXT;  -- JSON: allowed tip kinds

-- Open-ended per-agent prefs land here:
CREATE TABLE user_preferences (
  user_id    TEXT NOT NULL REFERENCES users(id),
  scope      TEXT NOT NULL,                 -- 'orchestrator' | 'agent:'
  key        TEXT NOT NULL,                 -- e.g. 'quietStart', 'focusAreas'
  value_json TEXT NOT NULL,                 -- agent validates against its pref_schema on read
  updated_at TEXT NOT NULL,
  source     TEXT NOT NULL DEFAULT 'user',  -- 'user' | 'inferred'
  PRIMARY KEY (user_id, scope, key)
);

CREATE TABLE user_consents (
  user_id     TEXT NOT NULL REFERENCES users(id),
  consent_key TEXT NOT NULL,                -- 'data:todoist' | 'data:calendar' | 'agent:focus-area'
  granted_at  TEXT NOT NULL,
  revoked_at  TEXT,                         -- null = currently active
  PRIMARY KEY (user_id, consent_key)
);

CREATE TABLE user_contexts (
  user_id       TEXT NOT NULL REFERENCES users(id),
  name          TEXT NOT NULL,              -- 'work' | 'home' | 'vacation' | user-named
  active        INTEGER NOT NULL DEFAULT 0, -- boolean
  schedule_json TEXT,                       -- optional: when this context is active
  created_at    TEXT NOT NULL,
  PRIMARY KEY (user_id, name)
);
```

Why hybrid (typed for stable globals, KV for per-agent):

- `tone` and allowed tip kinds are referenced by every recommendation; storing them as KV rows would add a lookup and a parse to every read.
- Per-agent prefs are open-ended (each agent declares its own keys) and validated on read against the agent's `pref_schema`, so KV is correct.

`user_preferences.source = 'user' | 'inferred'` keeps explicit user overrides distinguishable from inferred values (the inference framework never overwrites a `source='user'` row).

`user_contexts` ships in this ADR with **manual toggle only**. Auto-inference per agent type is tracked in #112–#116; cross-agent calendar/geo inference is out of scope.

### 3. Shared context-inference framework

Each `InferredParam` in a manifest declares:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class InferredParam:
    key: str                               # 'quietStart'
    ttl_sec: int                           # how often to recompute
    cold_start_default: Any                # value used until enough history exists
    min_history: int                       # event count threshold
    infer: Callable[["UserHistory"], Any]  # pure function
```

The framework (`ml/agents/inference/`) owns:

- Scheduling (recomputes per-param via the existing pre-compute scheduler).
- Reading history from `tip_views` / `tip_feedback` / `agent_outputs`.
- Writing results to `user_preferences` with `source='inferred'`.
- Cold-start: returns `cold_start_default` until `min_history` is met.
- Versioning: bumping `agent.version` invalidates inferred rows for that agent.
- Observability: structured log per recompute (window size, output diff, latency).

Each per-agent issue (#112–#116) implements only its `infer()` functions; everything else is the framework.

## Read-through API

The API stays small as N grows because every endpoint is registry-driven:

```
GET   /api/profile               → { user, prefs (grouped by scope), contexts, consents, agents[] }
PATCH /api/profile/prefs/:scope  → upserts user_preferences rows (source='user')
PATCH /api/profile/consents      → grant / revoke
PATCH /api/profile/contexts      → activate / deactivate / create
GET   /api/agents/registry       → manifests; admin UI auto-renders forms from pref_schema
```

`GET /api/profile` is the read-through used by `ml/serving` and the web client; it's the single endpoint each consumer calls instead of reading the DB directly.

## Orchestrator flow under this ADR

```
1. Load Profile = { user, prefs, active context, consents } via /api/profile.
2. From agent registry, filter eligible agents:
   - required consents granted
   - not silenced by active context (declared per-agent)
   - enabled in user_preferences (default: enabled)
3. Pull latest non-expired agent_outputs for the eligible set.
4. Build orchestrator prompt:
   - global prefs (tone, allowed tip kinds)
   - active context name as hint
   - agent snippets in eligibility order
5. LLM → tip.
```

No hardcoded agent list anywhere in the recommender. The orchestrator prompt template (`v4-orchestrator`) iterates whatever it was handed.

## Migration plan

One PR per step; each independently deployable.

1. **Schema**: add the three tables; add `tone` and `tip_kinds_json` to `users`.
2. **Backfill**: write `users.consentGiven` rows into `user_consents` as `data:core`.
   Keep the column for one release, then drop (step 8).
3. **Manifest plumbing**: `ml/agents//manifest.py` for the existing five; `GET /api/agents/registry` proxy.
4. **Read-through API**: `/api/profile` + sub-endpoints.
5. **Orchestrator cutover**: registry-driven eligibility filter.
6. **Inference framework** (#111): land it; migrate `time-of-day` (#112) as the proof.
7. **Per-agent inference**: #113–#116 land independently against the framework.
8. **Drop `users.consentGiven`** after one release.

## Consequences

### Positive

- Adding an agent = one directory. Admin UI, prefs storage, consent storage, and inference all pick it up automatically.
- Per-agent state lives next to the agent code; nothing global to edit.
- User-controlled prefs and inferred prefs use the same storage but stay distinguishable (`source` column).
- Consent revocation is row-level and time-stamped; aligns with the privacy stance in CLAUDE.md ("privacy is a feature, not a phase").
- Sets up cleanly for #27 (Calendar) and #28 (Health); they register their own consent keys without schema changes.

### Negative / risks

- **JSON validation on read** for per-agent prefs catches bad values later than column typing would. Mitigated by validating in the manifest's load function and failing closed (use cold-start default if invalid).
- **Multi-source reads** for the orchestrator (registry + profile + outputs) add latency. A cached profile read should keep the overhead sub-millisecond in practice.
- **Migration window** during which `users.consentGiven` and `user_consents` both exist. Reads must consult both for one release; writes go to `user_consents` only.
- **Auto-inference can mislead.** A wrong-but-confident inferred quiet window silences the user when they want pings. Mitigation: every inferred param is overrideable in admin/settings (`source='user'` takes precedence), and inferences only kick in past their `min_history` threshold.

## What this does NOT change

- ADR-0013's agent set, snippet contract, or `agent_outputs` table.
- ADR-0011's `userProfileFeatures` (ML-derived features, not user prefs).
- ADR-0008's LiteLLM gateway pattern.
- The orchestrator prompt template name (`v4-orchestrator`); the assembly rule changes, the contract does not.
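
## Appendix: eligibility filter sketch

A minimal, non-authoritative sketch of step 2 of the orchestrator flow (filtering the registry down to eligible agents). It assumes several things this ADR does not specify: the `ManifestView` and `Profile` shapes, a `silenced_in_contexts` field (the ADR only says silencing is declared per agent), an `agent:<id>` scope convention, and an `enabled` pref key. All of these names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ManifestView:
    """Hypothetical subset of AgentManifest used for eligibility checks."""
    id: str
    required_consents: tuple[str, ...] = ()
    # Assumption: names of contexts in which this agent stays silent.
    silenced_in_contexts: frozenset[str] = frozenset()


@dataclass
class Profile:
    """Hypothetical flattened view of the GET /api/profile response."""
    active_consents: set[str]            # consent keys whose revoked_at is null
    active_context: Optional[str]        # name of the active context, if any
    prefs: dict[str, dict[str, object]]  # scope -> key -> decoded value_json


def is_eligible(agent: ManifestView, profile: Profile) -> bool:
    # 1. Every required consent must currently be granted.
    if not set(agent.required_consents) <= profile.active_consents:
        return False
    # 2. The active context must not silence this agent.
    if profile.active_context in agent.silenced_in_contexts:
        return False
    # 3. Enabled unless the user explicitly disabled it (default: enabled).
    agent_prefs = profile.prefs.get(f"agent:{agent.id}", {})
    return agent_prefs.get("enabled", True) is not False


def eligible_agents(registry: list[ManifestView], profile: Profile) -> list[str]:
    """Return eligible agent ids, preserving registry order."""
    return [agent.id for agent in registry if is_eligible(agent, profile)]


registry = [
    ManifestView("time-of-day", ("data:core",)),
    ManifestView("focus-area", ("data:core", "data:todoist")),
    ManifestView("momentum", ("data:core",), frozenset({"vacation"})),
]
profile = Profile(active_consents={"data:core"}, active_context="vacation", prefs={})
print(eligible_agents(registry, profile))  # ['time-of-day']
```

In this example `focus-area` is dropped for a missing consent and `momentum` is dropped because the active context silences it; nothing in the filter names a specific agent.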