docs: ADR-0014 — unified Profile model + agent registry
Propose a shared substrate for per-user prefs, contexts, per-key consents, and per-agent state so adding an agent stays a manifest change. Updates CLAUDE.md, README, and architecture docs to reflect the multi-agent pipeline (ADR-0013) and the registry direction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
32
CLAUDE.md
@@ -78,7 +78,7 @@ docs/ architecture notes, ADRs, API specs
|
||||
|
||||
## AI stack
|
||||
|
||||
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
|
||||
oO generates tips through a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM assembles them into one tip. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
|
||||
|
||||
| Alias | Model | Used by |
|
||||
|-------|-------|---------|
|
||||
@@ -90,33 +90,37 @@ Env vars: `LITELLM_URL` (prod `https://llm.alogins.net`), `OLLAMA_URL` (Agap hos
|
||||
|
||||
Ollama and LiteLLM are **shared Agap services**, not oO services — they live in `agap_git/openai/docker-compose.yml` along with langfuse (observability). oO never starts them; ml-serving just calls the alias.
|
||||
|
||||
**LLM tip generation pipeline:**
|
||||
1. `ml/features/context.py` assembles user signals → structured prompt context
|
||||
2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]`
|
||||
3. Bandit policy in `ml/serving` scores + ranks candidates
|
||||
4. Best candidate returned as tip; reaction closes the online reward loop
|
||||
**Multi-agent tip generation pipeline (ADR-0013):**
|
||||
1. Pre-compute agents (`ml/agents/<id>/`) run on a schedule, each emitting a snippet into `agent_outputs` with a per-agent TTL
|
||||
2. On request, `recommender` (TS) loads the eligible agent set (registry-driven, ADR-0014) and pulls the freshest non-expired snippets
|
||||
3. `POST /recommend` in `ml/serving` assembles the orchestrator prompt (`v4-orchestrator`) and calls LiteLLM via the `tip-generator` alias
|
||||
4. Returned tip is logged in `tip_scores` with the contributing agent set; reaction is logged for observability (no bandit reward loop)
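Step 2 of the pipeline above (pull the freshest non-expired snippet per agent) can be sketched as follows. This is a minimal illustration, not the recommender's actual code: rows are modeled as plain dicts, and the field names `computed_at`, `expires_at`, and `prompt_text` are assumed to match the `agent_outputs` columns.

```python
from datetime import datetime, timedelta, timezone


def freshest_snippets(outputs, now):
    """Return {agent_id: prompt_text}, keeping the newest non-expired row per agent."""
    best = {}
    for row in outputs:
        if row["expires_at"] <= now:
            continue  # expired snippets are never served
        current = best.get(row["agent_id"])
        if current is None or row["computed_at"] > current["computed_at"]:
            best[row["agent_id"]] = row
    return {agent_id: row["prompt_text"] for agent_id, row in best.items()}


# Hypothetical sample rows for illustration.
NOW = datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc)
ROWS = [
    {"agent_id": "momentum", "prompt_text": "old",
     "computed_at": NOW - timedelta(minutes=30), "expires_at": NOW + timedelta(minutes=30)},
    {"agent_id": "momentum", "prompt_text": "new",
     "computed_at": NOW - timedelta(minutes=5), "expires_at": NOW + timedelta(minutes=55)},
    {"agent_id": "overdue-task", "prompt_text": "stale",
     "computed_at": NOW - timedelta(hours=2), "expires_at": NOW - timedelta(hours=1)},
]
SNIPPETS = freshest_snippets(ROWS, NOW)
```

An agent whose only snippet has expired simply drops out of the result, so the orchestrator prompt is built from whatever remains.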
|
||||
|
||||
## Current phase
|
||||
|
||||
**M1 shipped (core + admin). M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
|
||||
|
||||
Recent completions (M1 add-on):
|
||||
- ADR-0012 — ε-greedy v2 promotion (profile features, D=12) — 2026-04-26
|
||||
- Offline sim framework + MLflow integration — shipped in M1 add-on
|
||||
- Token-based admin auth for Playwright/CI — secured auth boundary
|
||||
Recent completions:
|
||||
- ADR-0013 — multi-agent recommendation: pre-computed agent snippets + orchestrator LLM (replaces ε-greedy bandit) — 2026-05-01
|
||||
- LLM context assembler + tip generation scaffold (#79, #88)
|
||||
- Model benchmarking for tip generation (#93, #95)
|
||||
- Admin UX refinements: feedback consolidation, settings placement (#100–102)
|
||||
- ADR-0012 — ε-greedy v2 (D=12) — 2026-04-26 (now superseded by ADR-0013)
|
||||
|
||||
Active work (M2):
|
||||
- ADR-0014 (proposed) — unified Profile model + agent registry + inference framework
|
||||
- Unified Profile model: prefs, contexts, consents, registry plumbing, orchestrator cutover (#30)
|
||||
- Shared context-inference framework for agents (#111)
|
||||
- Per-agent auto-inference: time-of-day (#112), focus-area (#113), momentum (#114), overdue-task (#115), recent-patterns (#116)
|
||||
- Signal abstraction for multi-source support (#78)
|
||||
- Per-user feature freshness SLAs (#61, ADR-0011 phase B)
|
||||
- LLM context assembler + tip generation scaffold (#79, #88)
|
||||
- Model benchmarking for tip generation (#93)
|
||||
- Admin UX refinements: feedback consolidation, settings placement (#100–102)
|
||||
|
||||
## What NOT to do
|
||||
|
||||
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
|
||||
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
|
||||
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract.
|
||||
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (multi-agent orchestrator today, future LLM/hybrid variants), keep contract.
|
||||
- Don't hardcode the agent list. The orchestrator is registry-driven (ADR-0014); adding/removing an agent is a manifest change in `ml/agents/<id>/`, never a recommender edit.
|
||||
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
|
||||
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
|
||||
- Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.
|
||||
|
||||
41
README.md
@@ -69,7 +69,7 @@ docs/ architecture, adr, api
|
||||
|
||||
## AI stack
|
||||
|
||||
oO is AI-native: the recommender's job is to **rank**, not to write. An LLM generates candidate tips from the user's context; the bandit picks the best one.
|
||||
oO is AI-native. Domain-specialized agents pre-compute snippets describing the user's state from one angle each; an orchestrator LLM reasons over the assembled snippets and produces one tip (ADR-0013). The orchestrator iterates a registry, not a hardcoded list (ADR-0014) — adding an agent is a manifest change, nothing else.
|
||||
|
||||
### Three-tier layout
|
||||
|
||||
@@ -79,25 +79,28 @@ oO is AI-native: the recommender's job is to **rank**, not to write. An LLM gene
|
||||
| Routing | **LiteLLM** | Unified OpenAI-compatible API; model aliases; cloud fallback | `llm.alogins.net` (Agap shared) |
|
||||
| Testing | **OpenWebUI** | Prompt iteration, model comparison, manual evals | `ai.alogins.net` (Agap shared) |
|
||||
|
||||
### Tip generation pipeline (Phase 2 target)
|
||||
### Tip generation pipeline (ADR-0013, M2)
|
||||
|
||||
```
|
||||
User signals ──▶ Context assembler ──▶ LiteLLM ──▶ Ollama (local)
|
||||
(tasks, calendar, (ml/features/) (routing) or cloud fallback
|
||||
patterns, time)
|
||||
User signals Pre-compute agents (every 15 min)
|
||||
(tasks, calendar, ──▶ ml/agents/{overdue-task, momentum, ──▶ agent_outputs
|
||||
patterns, time) time-of-day, recent-patterns, (per-agent TTL)
|
||||
focus-area, ...}
|
||||
│
|
||||
Eligibility filter: required consents + │
|
||||
active context + per-user prefs (ADR-0014) ◀──┘
|
||||
▼
|
||||
N typed TipCandidates
|
||||
{content, kind, model,
|
||||
prompt_version, confidence}
|
||||
Orchestrator prompt (`v4-orchestrator`)
|
||||
= global prefs + active context + snippets
|
||||
▼
|
||||
Bandit policy (ml/serving)
|
||||
scores + ranks candidates
|
||||
LiteLLM ──▶ Ollama (local) / cloud fallback
|
||||
▼
|
||||
Best tip shown
|
||||
Tip shown to user
|
||||
▼
|
||||
User reaction (done / snooze / dismiss + dwell)
|
||||
▼
|
||||
Online bandit update + prompt_version tracking
|
||||
Logged to tip_feedback for observability
|
||||
(no online ML reward loop — see ADR-0013)
|
||||
```
|
||||
|
||||
**Why LiteLLM as gateway:** All LLM calls use a single `LITELLM_URL` env var. Swapping from qwen2.5 to llama3.2, or routing a fraction to Claude for A/B, is a config change in LiteLLM — zero code change in oO. The model name in `tip_scores` tells you exactly which model produced each tip.
|
||||
@@ -194,6 +197,20 @@ oO is ML-heavy. Without a cockpit, every model change ships blind. This console
|
||||
### Phase 2 — AI tips + multi-source signals *(M2)* in progress
|
||||
Goal: tips are AI-generated from user context, not just raw Todoist tasks. Multiple signal sources feed a generalized pipeline. Research-intensive milestone.
|
||||
|
||||
**Architectural shift (mid-M2):** the bandit-ranks-LLM-candidates design from earlier in M2 was replaced with a multi-agent pipeline (ADR-0013): pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip directly. ADR-0014 layers a unified Profile + agent registry + auto-inference framework on top so the system generalizes cleanly to N agents.
|
||||
|
||||
**Multi-agent recommendation (ADR-0013, shipped):**
|
||||
- [x] `agent_outputs` table + per-agent TTL caching
|
||||
- [x] Five initial agents: `overdue-task`, `momentum`, `time-of-day`, `recent-patterns`, `focus-area`
|
||||
- [x] Agent pre-compute scheduler
|
||||
- [x] Orchestrator cutover — recommender calls `ml/serving` with snippet list, no bandit scoring
|
||||
- [x] Bandit endpoints + shadow policy machinery removed
|
||||
|
||||
**Unified Profile + agent registry (ADR-0014, in progress):**
|
||||
- [ ] Unified Profile model: prefs, contexts, consents + manifest plumbing + orchestrator cutover (#30)
|
||||
- [ ] Shared context-inference framework (#111)
|
||||
- [ ] Per-agent auto-inference: `time-of-day` (#112), `focus-area` (#113), `momentum` (#114), `overdue-task` (#115), `recent-patterns` (#116)
|
||||
|
||||
**AI infrastructure (unblock everything else):**
|
||||
- [ ] `ai` compose profile — Ollama + LiteLLM for local dev; env vars `OLLAMA_URL` / `LITELLM_URL` (#86)
|
||||
- [ ] AI gateway — wire `ml/serving` to LiteLLM; model aliases `tip-generator` + `embedder` (#87)
|
||||
|
||||
File diff suppressed because one or more lines are too long
230
docs/adr/0014-unified-profile-and-agent-registry.md
Normal file
@@ -0,0 +1,230 @@
|
||||
# ADR-0014 — Unified Profile model + agent registry
|
||||
|
||||
**Status:** Proposed
|
||||
**Date:** 2026-05-05
|
||||
**Issues:** #30, #111, #112, #113, #114, #115, #116
|
||||
**Supersedes (data model):** ADR-0013 (the agent set stands; this ADR replaces the implicit assumption that prefs/contexts/consents are hardcoded on `users`).
|
||||
|
||||
## Context
|
||||
|
||||
ADR-0013 introduced the multi-agent pipeline: N pre-compute agents emit
|
||||
prompt snippets, an orchestrator LLM assembles them into a tip. The ADR
|
||||
specified the `agent_outputs` table and the orchestrator contract, but
|
||||
left several questions open:
|
||||
|
||||
1. **Where do user preferences live?** `users.consentGiven` is a single
|
||||
boolean. There is no place for quiet hours, tone, allowed tip kinds,
|
||||
or per-integration consent. Each new preference would mean another
|
||||
typed column on `users` — and worse, every new agent needs its own
|
||||
tunable parameters (focus areas, momentum baseline, lateness tolerance)
|
||||
that are clearly per-agent state, not global user state.
|
||||
2. **How are agents discovered?** The orchestrator currently iterates a
hardcoded list. Adding an agent means touching three places: the
recommender, the admin UI, and the prefs schema.
|
||||
3. **How does context (work / home / vacation) interact with agents?**
|
||||
Some agents should be silenced in some contexts. There is no model.
|
||||
4. **How is per-user agent configuration learned?** Issues #112–#116
|
||||
each want to auto-infer parameters (quiet hours, focus areas, etc.)
|
||||
from history. Without a shared substrate they each reinvent storage,
|
||||
recompute cadence, and cold-start fallback.
|
||||
|
||||
The current ADR-0013 design works for five agents. It will not work for
|
||||
twenty without becoming a tangle.
|
||||
|
||||
## Decision
|
||||
|
||||
Three changes, designed to compose:
|
||||
|
||||
### 1. Agents are plugins with declared schemas
|
||||
|
||||
Every agent ships a manifest (Python, lives next to its code in
|
||||
`ml/agents/<id>/manifest.py`):
|
||||
|
||||
```python
from __future__ import annotations  # InferredParam (see §3) resolves lazily

from dataclasses import dataclass


@dataclass
class AgentManifest:
    id: str                               # 'time-of-day'
    version: str                          # bump invalidates cached outputs + inferences
    pref_schema: dict                     # JSON Schema for user-tunable knobs
    context_schema: list[str]             # signals it reads, e.g. ['todoist.tasks']
    required_consents: list[str]          # ['data:todoist', 'agent:time-of-day']
    output_contract: dict                 # snippet shape (free text + optional tags)
    ttl_sec: int                          # snippet freshness for agent_outputs
    inferred_params: list[InferredParam]  # see §3
```
|
||||
|
||||
The manifest is the **single point of registration**. The orchestrator,
|
||||
admin UI, and inference framework all read from it. Adding an agent is
|
||||
adding one directory in `ml/agents/` — no edits elsewhere.
|
||||
|
||||
A `GET /api/agents/registry` endpoint (TS recommender → Python proxy)
|
||||
exposes manifests so the admin app can auto-render configuration UI from
|
||||
each `pref_schema`.
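As an illustration of manifest-driven registration, the sketch below builds a registry from manifest objects and produces the kind of JSON-serializable payload a registry endpoint could return. The manifest values, the `build_registry` and `registry_payload` helpers, and the `quietEnd` key are hypothetical; `inferred_params` is left out to keep the example short.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentManifest:
    # Subset of the ADR's manifest fields; inferred_params omitted for brevity.
    id: str
    version: str
    pref_schema: dict
    context_schema: list
    required_consents: list
    output_contract: dict
    ttl_sec: int


# Hypothetical values, for illustration only.
TIME_OF_DAY = AgentManifest(
    id="time-of-day",
    version="1.0.0",
    pref_schema={
        "type": "object",
        "properties": {
            "quietStart": {"type": "string"},
            "quietEnd": {"type": "string"},
        },
    },
    context_schema=["todoist.tasks"],
    required_consents=["data:todoist", "agent:time-of-day"],
    output_contract={"text": "str", "tags": "list[str]"},
    ttl_sec=900,
)


def build_registry(manifests):
    """Index manifests by id, rejecting duplicate ids."""
    registry = {}
    for manifest in manifests:
        if manifest.id in registry:
            raise ValueError(f"duplicate agent id: {manifest.id}")
        registry[manifest.id] = manifest
    return registry


def registry_payload(registry):
    """JSON-serializable manifest list, sorted by id for stable output."""
    return [asdict(registry[agent_id]) for agent_id in sorted(registry)]


REGISTRY = build_registry([TIME_OF_DAY])
```

Because consumers only see the payload, a new agent shows up in the admin UI as soon as its manifest is part of the list passed to `build_registry`.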
|
||||
|
||||
### 2. Unified Profile data model
|
||||
|
||||
Three new tables replace the implicit "fields-on-users" pattern.
|
||||
`users.consentGiven` collapses into `user_consents` (one row,
|
||||
`consent_key='data:core'`); existing data migrates in a single
|
||||
backfill.
|
||||
|
||||
```sql
-- Hybrid: typed columns where stable, KV where open-ended.
-- Stable globals stay on users (added in this ADR):
ALTER TABLE users ADD COLUMN tone TEXT;            -- 'direct'|'gentle'|'motivational'
ALTER TABLE users ADD COLUMN tip_kinds_json TEXT;  -- JSON: allowed tip kinds

-- Open-ended per-agent prefs land here:
CREATE TABLE user_preferences (
  user_id    TEXT NOT NULL REFERENCES users(id),
  scope      TEXT NOT NULL,                 -- 'orchestrator' | 'agent:<id>'
  key        TEXT NOT NULL,                 -- e.g. 'quietStart', 'focusAreas'
  value_json TEXT NOT NULL,                 -- agent validates against its pref_schema on read
  updated_at TEXT NOT NULL,
  source     TEXT NOT NULL DEFAULT 'user',  -- 'user' | 'inferred'
  PRIMARY KEY (user_id, scope, key)
);

CREATE TABLE user_consents (
  user_id     TEXT NOT NULL REFERENCES users(id),
  consent_key TEXT NOT NULL,                -- 'data:todoist' | 'data:calendar' | 'agent:focus-area'
  granted_at  TEXT NOT NULL,
  revoked_at  TEXT,                         -- null = currently active
  PRIMARY KEY (user_id, consent_key)
);

CREATE TABLE user_contexts (
  user_id       TEXT NOT NULL REFERENCES users(id),
  name          TEXT NOT NULL,              -- 'work' | 'home' | 'vacation' | user-named
  active        INTEGER NOT NULL DEFAULT 0, -- boolean
  schedule_json TEXT,                       -- optional: when this context is active
  created_at    TEXT NOT NULL,
  PRIMARY KEY (user_id, name)
);
```
|
||||
|
||||
Why hybrid (typed for stable globals, KV for per-agent):
|
||||
|
||||
- `tone` and allowed tip kinds are referenced by every recommendation, so
they stay as columns on `users`; moving them into the KV table would add
a lookup and a parse on every read.
|
||||
- Per-agent prefs are open-ended (each agent declares its own keys) and
|
||||
validated on read against the agent's `pref_schema`, so KV is correct.
|
||||
|
||||
`user_preferences.source = 'user' | 'inferred'` keeps explicit user
|
||||
overrides distinguishable from inferred values (the inference framework
|
||||
never overwrites a `source='user'` row).
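The "inferred never overwrites user" rule can be sketched as a guarded upsert. This is a minimal example using SQLite through Python's standard `sqlite3` module; the `write_inferred` helper is hypothetical, and the schema drops the foreign key to `users` so the sketch stands alone.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Same columns as user_preferences in the ADR, minus the users FK (sketch only).
SCHEMA = """
CREATE TABLE user_preferences (
  user_id    TEXT NOT NULL,
  scope      TEXT NOT NULL,
  key        TEXT NOT NULL,
  value_json TEXT NOT NULL,
  updated_at TEXT NOT NULL,
  source     TEXT NOT NULL DEFAULT 'user',
  PRIMARY KEY (user_id, scope, key)
)
"""


def write_inferred(conn, user_id, scope, key, value):
    """Upsert an inferred pref unless the user set this key explicitly.

    Returns True if the row was written, False if a source='user' row blocked it.
    """
    with conn:  # check and write inside one transaction
        row = conn.execute(
            "SELECT source FROM user_preferences "
            "WHERE user_id = ? AND scope = ? AND key = ?",
            (user_id, scope, key),
        ).fetchone()
        if row is not None and row[0] == "user":
            return False
        conn.execute(
            "INSERT OR REPLACE INTO user_preferences "
            "(user_id, scope, key, value_json, updated_at, source) "
            "VALUES (?, ?, ?, ?, ?, 'inferred')",
            (user_id, scope, key, json.dumps(value),
             datetime.now(timezone.utc).isoformat()),
        )
        return True
```

Explicit user writes would go through a separate path that always sets `source='user'`, which is what lets this guard stay a simple equality check.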
|
||||
|
||||
`user_contexts` ships in this ADR with **manual toggle only**.
|
||||
Auto-inference per agent type is tracked in #112–#116; cross-agent
|
||||
calendar/geo inference is out of scope.
|
||||
|
||||
### 3. Shared context-inference framework
|
||||
|
||||
Each `InferredParam` in a manifest declares:
|
||||
|
||||
```python
from __future__ import annotations  # UserHistory is provided by the framework

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InferredParam:
    key: str                             # 'quietStart'
    ttl_sec: int                         # how often to recompute
    cold_start_default: Any              # value used until enough history exists
    min_history: int                     # event count threshold
    infer: Callable[[UserHistory], Any]  # pure function
```
|
||||
|
||||
The framework (`ml/agents/inference/`) owns:
|
||||
|
||||
- Scheduling (recomputes per-param via the existing pre-compute scheduler).
|
||||
- Reading history from `tip_views` / `tip_feedback` / `agent_outputs`.
|
||||
- Writing results to `user_preferences` with `source='inferred'`.
|
||||
- Cold-start: returns `cold_start_default` until `min_history` is met.
|
||||
- Versioning: bumping `agent.version` invalidates inferred rows for that agent.
|
||||
- Observability: structured log per recompute (window size, output diff, latency).
|
||||
|
||||
Each per-agent issue (#112–#116) implements only its `infer()` functions;
|
||||
everything else is the framework.
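The cold-start rule the framework owns can be sketched in a few lines. This is an assumption-laden illustration: history is modeled as a list of event dicts, and `resolve` and `latest_active_hour` are hypothetical names, not part of the proposed framework.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class InferredParam:
    key: str
    ttl_sec: int
    cold_start_default: Any
    min_history: int
    infer: Callable[[Sequence[dict]], Any]  # history as a list of event dicts (assumption)


def resolve(param, history):
    """Return (value, used_inference) for one param and one user's history."""
    if len(history) < param.min_history:
        return param.cold_start_default, False  # not enough events yet
    return param.infer(history), True


def latest_active_hour(history):
    """Toy infer() function: the latest hour at which the user reacted to a tip."""
    return max(event["hour"] for event in history)


# Hypothetical param: fall back to 22 until three events exist.
QUIET_START = InferredParam(
    key="quietStart",
    ttl_sec=86400,
    cold_start_default=22,
    min_history=3,
    infer=latest_active_hour,
)
```

Keeping `infer` pure means the framework can rerun it on any history window without side effects, which is what makes the shared scheduling and versioning possible.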
|
||||
|
||||
## Read-through API
|
||||
|
||||
Stays small as N grows because every endpoint is registry-driven:
|
||||
|
||||
```
GET   /api/profile               → { user, prefs (grouped by scope), contexts, consents, agents[] }
PATCH /api/profile/prefs/:scope  → upserts user_preferences rows (source='user')
PATCH /api/profile/consents      → grant / revoke
PATCH /api/profile/contexts      → activate / deactivate / create
GET   /api/agents/registry       → manifests; admin UI auto-renders forms from pref_schema
```
|
||||
|
||||
`GET /api/profile` is the read-through used by `ml/serving` and the web
|
||||
client; it's the single endpoint each consumer calls instead of reading
|
||||
the DB directly.
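The "prefs (grouped by scope)" part of the profile response could be assembled as in the sketch below. The response shape, the `group_prefs` helper, and the `verbosity` key are assumptions for illustration; the ADR only states that prefs are grouped by scope.

```python
import json
from collections import defaultdict


def group_prefs(rows):
    """Group user_preferences rows by scope.

    rows: iterable of (scope, key, value_json, source) tuples.
    Returns {scope: {key: {"value": ..., "source": ...}}}.
    """
    grouped = defaultdict(dict)
    for scope, key, value_json, source in rows:
        grouped[scope][key] = {"value": json.loads(value_json), "source": source}
    return dict(grouped)


# Hypothetical rows for illustration.
PREFS = group_prefs([
    ("orchestrator", "verbosity", '"short"', "user"),
    ("agent:time-of-day", "quietStart", '"22:00"', "inferred"),
])
```

Carrying `source` alongside each value lets a settings screen show which entries were inferred and which the user set explicitly.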
|
||||
|
||||
## Orchestrator flow under this ADR
|
||||
|
||||
```
1. Load Profile = { user, prefs, active context, consents } via /api/profile.
2. From agent registry, filter eligible agents:
   - required consents granted
   - not silenced by active context (declared per-agent)
   - enabled in user_preferences (default: enabled)
3. Pull latest non-expired agent_outputs for the eligible set.
4. Build orchestrator prompt:
   - global prefs (tone, allowed tip kinds)
   - active context name as hint
   - agent snippets in eligibility order
5. LLM → tip.
```
|
||||
|
||||
No hardcoded agent list anywhere in the recommender. The orchestrator
|
||||
prompt template (`v4-orchestrator`) iterates whatever it was handed.
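The eligibility filter in step 2 of the flow can be sketched as a pure function over manifests and the loaded Profile. Two details are assumptions, since the ADR does not specify them: the `silenced_in` manifest field for context silencing, and an `enabled` pref stored under the `agent:<id>` scope.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    id: str
    required_consents: list
    silenced_in: set = field(default_factory=set)  # hypothetical per-agent declaration


def eligible_agents(manifests, granted_consents, active_context, prefs):
    """Return agent ids that pass all three eligibility checks, in registry order."""
    eligible = []
    for manifest in manifests:
        if not set(manifest.required_consents) <= granted_consents:
            continue  # a required consent is missing or revoked
        if active_context in manifest.silenced_in:
            continue  # silenced by the active context
        if not prefs.get(f"agent:{manifest.id}", {}).get("enabled", True):
            continue  # explicitly disabled; the default is enabled
        eligible.append(manifest.id)
    return eligible


# Hypothetical registry contents and profile values.
MANIFESTS = [
    Manifest("overdue-task", ["data:todoist"], {"vacation"}),
    Manifest("focus-area", ["data:todoist", "agent:focus-area"]),
    Manifest("momentum", ["data:todoist"]),
]
GRANTED = {"data:todoist"}
PREFS = {"agent:momentum": {"enabled": False}}
```

Because the function only reads what it is handed, adding a manifest to the list changes the eligible set without any change to the filter itself.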
|
||||
|
||||
## Migration plan
|
||||
|
||||
One PR per step; each independently deployable.
|
||||
|
||||
1. **Schema** — add the three tables; add `tone` and `tip_kinds_json` to `users`.
|
||||
2. **Backfill** — write `users.consentGiven` rows into `user_consents` as `data:core`. Keep the column for one release, then drop.
|
||||
3. **Manifest plumbing** — `ml/agents/<id>/manifest.py` for the existing five; `GET /api/agents/registry` proxy.
|
||||
4. **Read-through API** — `/api/profile` + sub-endpoints.
|
||||
5. **Orchestrator cutover** — registry-driven eligibility filter.
|
||||
6. **Inference framework** (#111) — land it; migrate `time-of-day` (#112) as the proof.
|
||||
7. **Per-agent inference** — #113–#116 land independently against the framework.
|
||||
8. **Drop `users.consentGiven`** after one release.
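Steps 2 and 8 of the plan imply an idempotent backfill plus a dual read during the migration window. The sketch below uses SQLite through Python's `sqlite3`; the helper names are hypothetical, and stamping `granted_at` with the backfill time is an assumption, since the legacy boolean carries no timestamp.

```python
import sqlite3


def backfill_core_consent(conn, now_iso):
    """Copy users.consentGiven = 1 into user_consents as 'data:core'.

    INSERT OR IGNORE makes reruns safe: existing rows are left untouched.
    """
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO user_consents (user_id, consent_key, granted_at) "
            "SELECT id, 'data:core', ? FROM users WHERE consentGiven = 1",
            (now_iso,),
        )


def has_core_consent(conn, user_id):
    """Dual read for the migration window: new table first, legacy column second."""
    row = conn.execute(
        "SELECT revoked_at FROM user_consents "
        "WHERE user_id = ? AND consent_key = 'data:core'",
        (user_id,),
    ).fetchone()
    if row is not None:
        return row[0] is None  # a row exists, so it is authoritative
    legacy = conn.execute(
        "SELECT consentGiven FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(legacy and legacy[0])
```

Treating an existing `user_consents` row as authoritative matters for revocations: once a user revokes `data:core`, the stale legacy boolean must not grant it back.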
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Adding an agent = one directory. Admin UI, prefs storage, consent
storage, and inference all pick it up automatically.
|
||||
- Per-agent state lives next to the agent code; nothing global to edit.
|
||||
- User-controlled prefs and inferred prefs use the same storage but stay
|
||||
distinguishable (`source` column).
|
||||
- Consent revocation is row-level and time-stamped; aligns with the
|
||||
privacy stance in CLAUDE.md ("privacy is a feature, not a phase").
|
||||
- Sets up cleanly for #27 (Calendar) and #28 (Health) — they register
|
||||
their own consent keys without schema changes.
|
||||
|
||||
### Negative / risks
|
||||
|
||||
- **JSON validation on read** for per-agent prefs catches bad values later
than typed columns would. Mitigated by validating in the manifest's load
function and failing closed (use cold-start default if invalid).
|
||||
- **Multiple reads** for the orchestrator (registry + profile + outputs)
add latency. A cached profile read should keep the overhead sub-millisecond in practice.
|
||||
- **Migration window** during which `users.consentGiven` and
|
||||
`user_consents` both exist. Reads must consult both for one release;
|
||||
writes go to `user_consents` only.
|
||||
- **Auto-inference can mislead.** A wrong-but-confident inferred quiet
window silences tips when the user actually wants them. Mitigation: every
inferred param is overridable in admin/settings (`source='user'`
takes precedence), and inferences only kick in past their
`min_history` threshold.
|
||||
|
||||
## What this does NOT change
|
||||
|
||||
- ADR-0013's agent set, snippet contract, or `agent_outputs` table.
|
||||
- ADR-0011's `userProfileFeatures` (ML-derived features, not user prefs).
|
||||
- ADR-0008's LiteLLM gateway pattern.
|
||||
- The orchestrator prompt template name (`v4-orchestrator`); the assembly
|
||||
rule changes, the contract does not.
|
||||
@@ -25,12 +25,37 @@ Session auth
|
||||
expires_at
|
||||
revoked_at?
|
||||
|
||||
Profile profile
|
||||
user_id (pk)
|
||||
timezone
|
||||
quiet_hours jsonb: [{start,end,days}]
|
||||
contexts jsonb: [{name,predicate}] introduced in Phase 2
|
||||
consents jsonb: {integration: {read,write,retain_days}}
|
||||
User (extended) profile ADR-0014
|
||||
+ tone 'direct' | 'gentle' | 'motivational'
|
||||
+ tip_kinds_json jsonb: allowed tip kinds (stable globals)
|
||||
|
||||
UserPreference profile ADR-0014
|
||||
user_id, scope, key (pk)
|
||||
scope 'orchestrator' | 'agent:<id>'
|
||||
value_json open-ended; agent validates against its pref_schema on read
|
||||
source 'user' | 'inferred' (inferred never overwrites user)
|
||||
updated_at
|
||||
|
||||
UserConsent profile ADR-0014
|
||||
user_id, consent_key (pk)
|
||||
consent_key 'data:todoist' | 'data:calendar' | 'agent:focus-area' | ...
|
||||
granted_at
|
||||
revoked_at? null = currently active
|
||||
|
||||
UserContext profile ADR-0014
|
||||
user_id, name (pk) 'work' | 'home' | 'vacation' | user-named
|
||||
active manual toggle in M2; auto-inference per agent in #112-#116
|
||||
schedule_json? optional: when this context is active
|
||||
created_at
|
||||
|
||||
AgentOutput recommender ADR-0013
|
||||
id (pk)
|
||||
user_id
|
||||
agent_id e.g. 'overdue-task' (matches a manifest)
|
||||
prompt_text snippet for the orchestrator prompt
|
||||
signals_snapshot jsonb: inputs the agent consumed
|
||||
computed_at, expires_at computed_at + manifest.ttl_sec
|
||||
agent_version bump to invalidate cached outputs on logic changes
|
||||
|
||||
Credential integrations
|
||||
user_id
|
||||
@@ -53,10 +78,10 @@ Event events
|
||||
TipInstance recommender
|
||||
tip_id (ulid)
|
||||
user_id
|
||||
policy_name "random" | "bandit.linucb" | "remote:v3"
|
||||
policy_name "v4-orchestrator" (ADR-0013) | legacy bandit names retained for history
|
||||
policy_version
|
||||
candidate_source "todoist" | "advice.library" | ...
|
||||
context_snapshot jsonb: features seen at decision time
|
||||
candidate_source "todoist" | "advice.library" | "agent-orchestrator" | ...
|
||||
context_snapshot jsonb: features + agent snippets seen at decision time
|
||||
tip jsonb: {kind,title,body,source,deep_link,meta}
|
||||
created_at
|
||||
shown_at? set when the client reports render
|
||||
|
||||
@@ -48,6 +48,8 @@ User reactions (done / snooze / dismiss) are events too. They close the loop as
|
||||
- **Feast** for feature store when we get there; homegrown adapter until then (Phase 1 seam).
|
||||
- **MLflow** for model registry and experiment tracking; deployed at `o.alogins.net/mlflow`.
|
||||
- **Auth.js** embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships.
|
||||
- **Multi-agent recommendation** (ADR-0013) — pre-compute agents emit prompt snippets, an orchestrator LLM produces the tip. Replaced the ε-greedy bandit (ADR-0007/0012) for explainability, cold-start, and decoupling generation from selection.
|
||||
- **Registry-driven agents + unified Profile** (ADR-0014) — agents are plugins with declared manifests; per-user prefs, contexts, and per-key consents live in shared tables; auto-inferred parameters share a common framework. Adding an agent is a manifest change.
|
||||
- **k3s** as the first step beyond docker-compose — no "compose → full k8s" cliff.
|
||||
|
||||
## AI stack
|
||||
@@ -59,30 +61,43 @@ All LLM inference routes through **LiteLLM** (`llm.alogins.net`) backed by **Oll
|
||||
|
||||
**OpenWebUI** (`ai.alogins.net`) is the human-facing interface for prompt iteration and model testing during development.
|
||||
|
||||
## Decision flow for a new tip (Phase 2 target)
|
||||
## Decision flow for a new tip (M2, ADR-0013 + ADR-0014)
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────┐
|
||||
│ Pre-compute (every 15 min, per registered agent) │
|
||||
│ ml/agents/<id> → prompt snippet → agent_outputs │
|
||||
│ TTL per manifest; agent_version invalidates │
|
||||
└────────────────────────────────────────────────┘
|
||||
|
||||
client ─► gateway ─► recommender (TS)
|
||||
│
|
||||
├─► profile: GET /api/profile
|
||||
│ (user, prefs, active context, consents)
|
||||
│
|
||||
├─► registry: GET /api/agents/registry
|
||||
│ (manifests; eligibility filter inputs)
|
||||
│
|
||||
├─► outputs: pull freshest non-expired agent_outputs
|
||||
│ for eligible agents (consents granted,
|
||||
│ not silenced by active context, enabled)
|
||||
│
|
||||
▼
|
||||
ml/serving (Python)
|
||||
│
|
||||
├─► context: ml/features/context.py
|
||||
│ (tasks + reactions + time patterns → prompt)
|
||||
├─► assemble: v4-orchestrator prompt
|
||||
│ = global prefs + active context + snippets
|
||||
│
|
||||
├─► generate: LiteLLM → Ollama
|
||||
│ → N TipCandidates {content, kind, model, prompt_version}
|
||||
├─► generate: LiteLLM → Ollama → one tip
|
||||
│
|
||||
├─► score: bandit policy scores each candidate
|
||||
│
|
||||
├─► shadows: shadow policies log picks without serving
|
||||
│
|
||||
└─► persist: tip_scores {candidate, policy, features, latency}
|
||||
◄─ best TipCandidate
|
||||
└─► persist: tip_scores {tip, contributing agents,
|
||||
prompt_version, llm_model, latency}
|
||||
◄─ tip
|
||||
```
|
||||
|
||||
**Phase 1 (shipped M1):** candidates come from Todoist task list, no LLM. The bandit scores tasks directly.
|
||||
**Evolution:**
|
||||
- **Phase 1 (M1):** candidates from Todoist; ε-greedy bandit scored tasks directly (ADR-0007, ADR-0012). Superseded.
|
||||
- **Phase 2 early (M2):** LLM-generated candidates ranked by bandit. Superseded mid-milestone.
|
||||
- **Phase 2 current (M2):** multi-agent pipeline (ADR-0013), driven by and extended through the agent registry (ADR-0014). No bandit; the orchestrator LLM reasons over named agent snippets.
|
||||
|
||||
**Phase 2 (shipped M2):** LLM candidates are generated in parallel with Todoist fetch. Both pools are merged, scored by the bandit, and the winner served. `tip_scores` tracks `prompt_version`, `llm_model`, and `tip_kind` for every row.
|
||||
|
||||
Feedback: `POST /feedback → events.emit(reaction)` → online bandit update + `prompt_version` tracked for A/B analysis.
|
||||
Feedback: `POST /feedback → events.emit(reaction)`. No online ML reward loop (ADR-0013 §Consequences); reactions are logged in `tip_feedback` for observability and potential future supervised learning.
|
||||
|
||||
@@ -26,7 +26,7 @@ User taps "Delete account" in settings → hard confirm → `User.deleted_at` se
|
||||
|
||||
## Scope boundaries
|
||||
|
||||
Each integration declares the scopes it requests and the features it derives. The `Profile.consents` column is the source of truth; a scope removed from consent short-circuits derived-feature computation at the feature store.
|
||||
Each integration and each agent declares the consent keys it requires (`data:todoist`, `agent:focus-area`, ...) in its manifest. The `user_consents` table is the source of truth (per-key rows, revocation is a `revoked_at` write — never a delete, so audits stay clean). A revoked consent short-circuits derived-feature computation at the feature store and removes the dependent agent from the orchestrator's eligible set on the next tip. See ADR-0014.
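Revocation as a `revoked_at` write, rather than a delete, can be sketched as below using SQLite through Python's `sqlite3`. The helper names are hypothetical; the column names follow the `user_consents` table from ADR-0014.

```python
import sqlite3


def revoke_consent(conn, user_id, consent_key, now_iso):
    """Mark an active consent as revoked; the row is kept for audit.

    Returns True if an active consent was revoked, False if none was active.
    """
    with conn:
        cursor = conn.execute(
            "UPDATE user_consents SET revoked_at = ? "
            "WHERE user_id = ? AND consent_key = ? AND revoked_at IS NULL",
            (now_iso, user_id, consent_key),
        )
    return cursor.rowcount == 1


def active_consents(conn, user_id):
    """Set of consent keys that are currently granted and not revoked."""
    rows = conn.execute(
        "SELECT consent_key FROM user_consents "
        "WHERE user_id = ? AND revoked_at IS NULL",
        (user_id,),
    )
    return {row[0] for row in rows}
```

The set returned by `active_consents` is the input an eligibility check would compare against each agent's required consent keys, so a revocation takes effect on the next tip without touching agent code.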
|
||||
|
||||
## Audit
|
||||
|
||||
|
||||