From cba3f1a1842fdb59ea577d964dee63ddc69b1226 Mon Sep 17 00:00:00 2001
From: alvis
Date: Sat, 25 Apr 2026 17:17:38 +0000
Subject: [PATCH] docs(services): update integrations + recommender READMEs for signal abstraction (#78)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

integrations/README — replace stale Connector interface and fictional libsodium vault with the actual SignalSource pattern, SQLite token table, and real OAuth routes.

recommender/README — document the SignalAggregator pipeline, current policy registry, and actual /recommend + /feedback contract shapes.

Co-Authored-By: Claude Sonnet 4.6
---
 services/integrations/README.md | 55 ++++++++++++++++++++++-----------
 services/recommender/README.md  | 47 ++++++++++++++++++----------
 2 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/services/integrations/README.md b/services/integrations/README.md
index 3607e9f..448c291 100644
--- a/services/integrations/README.md
+++ b/services/integrations/README.md
@@ -2,30 +2,49 @@
 
 Third-party connectors and the token vault.
 
-## Connector interface
+## Signal source interface
+
+Each connector implements `SignalSource` from `@oo/shared-types`:
 
 ```ts
-interface Connector {
-  id: string // e.g. "todoist"
-  scopes: string[] // human-readable list shown in consent UI
-  beginOAuth(user): Promise<{ redirectUrl, state }>
-  finishOAuth(code, state): Promise
-  fetchSignals(user, since?): AsyncIterable
-  // incremental-sync cursor (Todoist sync_token, webhook timestamps, etc.)
-  // stored in Credential.meta; the connector owns its shape.
-  act?(user, action): Promise // optional write-back (complete task, etc.)
-  revoke(user): Promise // REQUIRED: provider-side token revocation on disconnect
+interface SignalSource {
+  readonly id: string // e.g. "todoist"
+  fetchSignals(userId: string): Promise // returns normalized Signal[]
+  act?(userId: string, signalId: string, action: string): Promise // optional write-back
 }
 ```
 
+`SignalAggregator` (`services/api/src/signals/aggregator.ts`) fans out to all registered sources in parallel, isolating per-source failures.
+
 ## Token vault
 
-- Credentials encrypted at rest (libsodium sealed box); key from env/KMS.
-- Refresh handled transparently; consumers never see raw tokens.
-- One row per `(user, provider)` with provider-specific `meta`.
+OAuth tokens stored in the `integration_tokens` SQLite table (`services/api/src/db/schema.ts`):
 
-## Roadmap
+| Column | Description |
+|--------|-------------|
+| `userId` | owner |
+| `provider` | e.g. `todoist` |
+| `accessToken` | OAuth access token (plain in dev; encrypted in prod via server secret store) |
+| `tokenStatus` | `active` \| `needs_reconnect` |
 
-- Phase 0: **Todoist** (OAuth2, read tasks, complete task).
-- Phase 2: Google Calendar, Apple Health (web import), generic webhook ingress.
-- Phase 5: public SDK so third parties can ship connectors.
+On a 401 from the upstream API, the connector marks the token `needs_reconnect` and publishes `signals.integration.token_expired` so the client can prompt re-auth.
+
+## Routes
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/integrations` | List connected integrations for current user |
+| `GET` | `/api/integrations/todoist/connect` | Start Todoist OAuth flow |
+| `GET` | `/api/integrations/todoist/callback` | OAuth callback — exchange code, store token |
+| `DELETE` | `/api/integrations/:provider` | Disconnect + delete token |
+
+## Connectors
+
+| Connector | Status | Signals produced |
+|-----------|--------|-----------------|
+| Todoist | Phase 1 — active | `task` signals (today + overdue); `done` write-back |
+| Google Calendar | Phase 2 — planned | `event` signals |
+
+## Extraction criteria
+
+Extract to its own process when credential blast-radius isolation requires it (e.g. token vault with KMS-backed encryption needs to run in a hardened sidecar) or when connector volume justifies separate scaling.
diff --git a/services/recommender/README.md b/services/recommender/README.md
index 4f4f7c5..e515160 100644
--- a/services/recommender/README.md
+++ b/services/recommender/README.md
@@ -1,29 +1,42 @@
 # recommender
 
-The core of oO. Takes a user + a context, returns **one** tip.
+The core of oO. Takes a user + context, returns **one** tip.
 
 ## Contract
 
 ```
-POST /recommend
-  { user_id, context?: { time, timezone, client, ... } }
-  → { tip: { id, kind: "todo"|"advice", title, body, source, deep_link, meta } }
+POST /api/recommend
+  { } (user inferred from session)
+  → { tip: { id, content, source, kind, sourceId?, rationale?, createdAt } }
 
-POST /feedback
-  { user_id, tip_id, reaction: "done"|"snooze"|"dismiss", at }
+POST /api/tip/:id/feedback
+  { action: "done"|"dismiss"|"snooze"|"helpful"|"not_helpful", dwellMs? }
+  → { ok: true }
 ```
 
-## Internals (stable seams)
+## Pipeline
 
-- **Candidate sources** — pluggable async generators. v0: Todoist tasks via `integrations`. Later: advice library, calendar nudges, health prompts.
-- **Feature assembler** — fills the `context` blob (inline in Phase 0; calls feature store from M1). Never inlined into policy code.
-- **Policy registry** — `Policy.pick(candidates, context) → tip`. Named entries:
-  - `random` — v0 (Phase 0).
-  - `bandit.linucb.pooled` — v1 (Phase 1). **Global-then-personalize**: pooled features shared across users; per-user residual once data allows.
-  - `remote` — delegates to `ml/serving` FastAPI scorer (Phase 1+).
-- **Shadow hook** — every request optionally runs N shadow policies in parallel and logs their picks + estimated rewards. Promotion from shadow → A/B → launch is a separate, deliberate step (ADR-0002).
-- **TipInstance persistence** — every decision writes `context_snapshot` (features seen at decision time). This is what makes offline replay honest.
+1. **Signals** — `SignalAggregator.fetchAll(userId)` fans out to all registered `SignalSource` implementations in parallel. Currently: `TodoistSignalSource`. Add a source via `aggregator.register(new MySource())`.
+2. **LLM candidates** — `POST /generate` on `ml/serving` returns `TipCandidate[]` from the `tip-generator` LiteLLM alias.
+3. **Scoring** — all candidates sent to `ml/serving` active policy (`POST /score/egreedy`). Falls back to random if `ml/serving` is unreachable.
+4. **Shadow policies** — active policy runs shadow policies in the same request for offline comparison (ADR-0002). Currently: `egreedy-v2` shadows `egreedy-v1`.
+5. **Persistence** — `tipViews` + `tipScores` rows written on every serve; `tipFeedback` row on reaction.
+6. **Reward delivery** — reaction triggers `POST /reward/egreedy` on `ml/serving` with inferred reward value.
 
-## Phase 0 goal
+## Signal normalization
 
-`RandomPolicy` only. The service, contract, registry, shadow hook, and tip-instance persistence all exist; no ML yet.
+Signals carry `features: Record` (bandit-ready) and `metadata: Record` (source-specific raw fields). The bandit treats features as an opaque dict — sources own their feature names. See ADR-0009.
+
+## Policy registry
+
+| Policy | Status | Notes |
+|--------|--------|-------|
+| `random` | Shadow | Fallback when ml/serving unreachable |
+| `egreedy-v1` | **Active** | d=7, ADR-0007 |
+| `egreedy-v2` | Shadow | d=12 + profile features, ADR-0012 |
+
+Shadow → active promotion requires offline sim + online agreement (ADR-0002).
+
+## Extraction criteria
+
+Extract to its own process at scaling hotspot: when `POST /recommend` p99 latency exceeds SLA or when recommendation CPU displaces API serving on shared host.
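
Both READMEs in this patch describe `SignalAggregator` as fanning out to every registered `SignalSource` in parallel while isolating per-source failures. A minimal sketch of that behaviour might look like the following. It is an illustration under stated assumptions, not the code in `services/api/src/signals/aggregator.ts`: the `Signal` fields, the `Promise<Signal[]>` return type, the warning log, and both stub sources are hypothetical, and only `register`, `fetchAll`, `SignalSource`, and `fetchSignals` are names taken from the patch.

```ts
// Hypothetical Signal shape for illustration; the real type lives in @oo/shared-types.
interface Signal {
  id: string
  source: string
  kind: string
}

// Mirrors the interface in the patch; the Promise<Signal[]> return type is an
// assumption based on the "returns normalized Signal[]" comment.
interface SignalSource {
  readonly id: string
  fetchSignals(userId: string): Promise<Signal[]>
}

class SignalAggregator {
  private readonly sources: SignalSource[] = []

  register(source: SignalSource): void {
    this.sources.push(source)
  }

  // Start every source at once. Each call is wrapped in its own try/catch, so a
  // failing source contributes an empty list instead of rejecting the whole fetch.
  async fetchAll(userId: string): Promise<Signal[]> {
    const perSource = await Promise.all(
      this.sources.map(async (source) => {
        try {
          return await source.fetchSignals(userId)
        } catch (err) {
          console.warn(`signal source ${source.id} failed:`, err)
          return [] as Signal[]
        }
      }),
    )
    // Flatten the per-source arrays into one list, preserving registration order.
    return ([] as Signal[]).concat(...perSource)
  }
}

// Hypothetical stubs: one healthy source and one that always fails.
const todoistStub: SignalSource = {
  id: "todoist",
  fetchSignals: async () => [{ id: "t1", source: "todoist", kind: "task" }],
}
const brokenStub: SignalSource = {
  id: "broken",
  fetchSignals: async () => {
    throw new Error("upstream 500")
  },
}

const aggregator = new SignalAggregator()
aggregator.register(todoistStub)
aggregator.register(brokenStub)

// Resolves with only the signal from todoistStub; the broken source is skipped.
aggregator.fetchAll("user-1").then((signals) => {
  console.log(signals.map((s) => s.id))
})
```

Catching inside each mapped call keeps the failure local to one source, so the recommender still receives whatever the healthy sources returned. `Promise.allSettled` would express the same idea where the target runtime supports it.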