docs(services): update integrations + recommender READMEs for signal abstraction (#78)
integrations/README: replace the stale Connector interface and fictional libsodium vault with the actual SignalSource pattern, SQLite token table, and real OAuth routes.
recommender/README: document the SignalAggregator pipeline, current policy registry, and actual /recommend + /feedback contract shapes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -2,30 +2,49 @@
|
|||||||
|
|
||||||
Third-party connectors and the token vault.
|
Third-party connectors and the token vault.
|
||||||
|
|
||||||
## Connector interface
|
## Signal source interface
|
||||||
|
|
||||||
|
Each connector implements `SignalSource` from `@oo/shared-types`:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
interface Connector {
|
interface SignalSource {
|
||||||
id: string // e.g. "todoist"
|
readonly id: string // e.g. "todoist"
|
||||||
scopes: string[] // human-readable list shown in consent UI
|
fetchSignals(userId: string): Promise<Signal[]> // returns normalized Signal[]
|
||||||
beginOAuth(user): Promise<{ redirectUrl, state }>
|
act?(userId: string, signalId: string, action: string): Promise<void> // optional write-back
|
||||||
finishOAuth(code, state): Promise<StoredCredential>
|
|
||||||
fetchSignals(user, since?): AsyncIterable<NormalizedEvent>
|
|
||||||
// incremental-sync cursor (Todoist sync_token, webhook timestamps, etc.)
|
|
||||||
// stored in Credential.meta; the connector owns its shape.
|
|
||||||
act?(user, action): Promise<void> // optional write-back (complete task, etc.)
|
|
||||||
revoke(user): Promise<void> // REQUIRED: provider-side token revocation on disconnect
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
`SignalAggregator` (`services/api/src/signals/aggregator.ts`) fans out to all registered sources in parallel, isolating per-source failures.
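The fan-out with per-source isolation described here can be sketched roughly as follows. This is not the code in `aggregator.ts`: the class name `SketchAggregator`, the `failed` list, and the `id`/`source` fields on `Signal` are assumptions made for illustration. Only `SignalSource.id`, `fetchSignals`, `register`, and `fetchAll` are named in the README.

```ts
// Hypothetical Signal shape; only `features` and `metadata` are documented.
interface Signal {
  id: string;
  source: string;
  features: Record<string, number | boolean>;
  metadata: Record<string, unknown>;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
}

class SketchAggregator {
  private readonly sources: SignalSource[] = [];

  register(source: SignalSource): void {
    this.sources.push(source);
  }

  // All sources are queried concurrently. Each call is wrapped in its own
  // try/catch, so a failing source yields `null` instead of rejecting the batch.
  async fetchAll(userId: string): Promise<{ signals: Signal[]; failed: string[] }> {
    const results = await Promise.all(
      this.sources.map(async (source) => {
        try {
          return { id: source.id, signals: await source.fetchSignals(userId) };
        } catch {
          return { id: source.id, signals: null };
        }
      }),
    );

    const signals: Signal[] = [];
    const failed: string[] = [];
    for (const result of results) {
      if (result.signals === null) {
        failed.push(result.id); // remember which source failed (assumed behaviour)
      } else {
        signals.push(...result.signals);
      }
    }
    return { signals, failed };
  }
}
```

Returning the failed source ids alongside the signals is one possible design; the real aggregator may simply log and drop them.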
|
||||||
|
|
||||||
## Token vault
|
## Token vault
|
||||||
|
|
||||||
- Credentials encrypted at rest (libsodium sealed box); key from env/KMS.
|
OAuth tokens are stored in the `integration_tokens` SQLite table (`services/api/src/db/schema.ts`):
|
||||||
- Refresh handled transparently; consumers never see raw tokens.
|
|
||||||
- One row per `(user, provider)` with provider-specific `meta`.
|
|
||||||
|
|
||||||
## Roadmap
|
| Column | Description |
|
||||||
|
|--------|-------------|
|
||||||
|
| `userId` | owner |
|
||||||
|
| `provider` | e.g. `todoist` |
|
||||||
|
| `accessToken` | OAuth access token (plain in dev; encrypted in prod via server secret store) |
|
||||||
|
| `tokenStatus` | `active` \| `needs_reconnect` |
|
||||||
|
|
||||||
- Phase 0: **Todoist** (OAuth2, read tasks, complete task).
|
On a 401 from the upstream API, the connector marks the token `needs_reconnect` and publishes `signals.integration.token_expired` so the client can prompt re-auth.
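A minimal sketch of that 401 handling, assuming the connector is given a token store and a publish function. `TokenStore`, `Publish`, `callUpstream`, and the event payload shape are hypothetical names introduced for this example; only the `needs_reconnect` status and the `signals.integration.token_expired` topic come from the README.

```ts
type TokenStatus = "active" | "needs_reconnect";

interface TokenRow {
  userId: string;
  provider: string;
  accessToken: string;
  tokenStatus: TokenStatus;
}

// Hypothetical ports; the real service may wire these differently.
interface TokenStore {
  markStatus(userId: string, provider: string, status: TokenStatus): void;
}
type Publish = (topic: string, payload: Record<string, unknown>) => void;

async function callUpstream<T>(
  row: TokenRow,
  request: (accessToken: string) => Promise<{ status: number; body: T }>,
  store: TokenStore,
  publish: Publish,
): Promise<T | null> {
  const res = await request(row.accessToken);
  if (res.status === 401) {
    // The upstream rejected the token: flag it and notify the client.
    store.markStatus(row.userId, row.provider, "needs_reconnect");
    publish("signals.integration.token_expired", {
      userId: row.userId, // payload fields are an assumption
      provider: row.provider,
    });
    return null;
  }
  return res.body;
}
```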
|
||||||
- Phase 2: Google Calendar, Apple Health (web import), generic webhook ingress.
|
|
||||||
- Phase 5: public SDK so third parties can ship connectors.
|
## Routes
|
||||||
|
|
||||||
|
| Method | Path | Description |
|
||||||
|
|--------|------|-------------|
|
||||||
|
| `GET` | `/api/integrations` | List connected integrations for current user |
|
||||||
|
| `GET` | `/api/integrations/todoist/connect` | Start Todoist OAuth flow |
|
||||||
|
| `GET` | `/api/integrations/todoist/callback` | OAuth callback — exchange code, store token |
|
||||||
|
| `DELETE` | `/api/integrations/:provider` | Disconnect + delete token |
|
||||||
|
|
||||||
|
## Connectors
|
||||||
|
|
||||||
|
| Connector | Status | Signals produced |
|
||||||
|
|-----------|--------|-----------------|
|
||||||
|
| Todoist | Phase 1 — active | `task` signals (today + overdue); `done` write-back |
|
||||||
|
| Google Calendar | Phase 2 — planned | `event` signals |
|
||||||
|
|
||||||
|
## Extraction criteria
|
||||||
|
|
||||||
|
Extract to its own process when credential blast-radius isolation requires it (e.g. a token vault with KMS-backed encryption that needs to run in a hardened sidecar) or when connector volume justifies separate scaling.
|
||||||
|
|||||||
@@ -1,29 +1,42 @@
|
|||||||
# recommender
|
# recommender
|
||||||
|
|
||||||
The core of oO. Takes a user + a context, returns **one** tip.
|
The core of oO. Takes a user + context, returns **one** tip.
|
||||||
|
|
||||||
## Contract
|
## Contract
|
||||||
|
|
||||||
```
|
```
|
||||||
POST /recommend
|
POST /api/recommend
|
||||||
{ user_id, context?: { time, timezone, client, ... } }
|
{ } (user inferred from session)
|
||||||
→ { tip: { id, kind: "todo"|"advice", title, body, source, deep_link, meta } }
|
→ { tip: { id, content, source, kind, sourceId?, rationale?, createdAt } }
|
||||||
|
|
||||||
POST /feedback
|
POST /api/tip/:id/feedback
|
||||||
{ user_id, tip_id, reaction: "done"|"snooze"|"dismiss", at }
|
{ action: "done"|"dismiss"|"snooze"|"helpful"|"not_helpful", dwellMs? }
|
||||||
|
→ { ok: true }
|
||||||
```
|
```
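As an illustration of the new feedback body shape, a server-side guard might look like the sketch below. `parseFeedbackBody` is a hypothetical helper, and treating `dwellMs` as a non-negative finite number is an assumption; the contract only lists the five actions and marks `dwellMs` as optional.

```ts
const FEEDBACK_ACTIONS = ["done", "dismiss", "snooze", "helpful", "not_helpful"] as const;
type FeedbackAction = (typeof FEEDBACK_ACTIONS)[number];

interface FeedbackBody {
  action: FeedbackAction;
  dwellMs?: number;
}

// Returns a typed body, or null when the input does not match the contract.
function parseFeedbackBody(input: unknown): FeedbackBody | null {
  if (typeof input !== "object" || input === null) return null;
  const { action, dwellMs } = input as Record<string, unknown>;

  const known = FEEDBACK_ACTIONS.find((a) => a === action);
  if (known === undefined) return null;

  if (dwellMs === undefined) return { action: known };
  // Assumption: dwell time is a non-negative, finite number of milliseconds.
  if (typeof dwellMs !== "number" || !Number.isFinite(dwellMs) || dwellMs < 0) return null;
  return { action: known, dwellMs };
}
```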
|
||||||
|
|
||||||
## Internals (stable seams)
|
## Pipeline
|
||||||
|
|
||||||
- **Candidate sources** — pluggable async generators. v0: Todoist tasks via `integrations`. Later: advice library, calendar nudges, health prompts.
|
1. **Signals** — `SignalAggregator.fetchAll(userId)` fans out to all registered `SignalSource` implementations in parallel. Currently: `TodoistSignalSource`. Add a source via `aggregator.register(new MySource())`.
|
||||||
- **Feature assembler** — fills the `context` blob (inline in Phase 0; calls feature store from M1). Never inlined into policy code.
|
2. **LLM candidates** — `POST /generate` on `ml/serving` returns `TipCandidate[]` from the `tip-generator` LiteLLM alias.
|
||||||
- **Policy registry** — `Policy.pick(candidates, context) → tip`. Named entries:
|
3. **Scoring** — all candidates sent to `ml/serving` active policy (`POST /score/egreedy`). Falls back to random if `ml/serving` is unreachable.
|
||||||
- `random` — v0 (Phase 0).
|
4. **Shadow policies** — active policy runs shadow policies in the same request for offline comparison (ADR-0002). Currently: `egreedy-v2` shadows `egreedy-v1`.
|
||||||
- `bandit.linucb.pooled` — v1 (Phase 1). **Global-then-personalize**: pooled features shared across users; per-user residual once data allows.
|
5. **Persistence** — `tipViews` + `tipScores` rows written on every serve; `tipFeedback` row on reaction.
|
||||||
- `remote` — delegates to `ml/serving` FastAPI scorer (Phase 1+).
|
6. **Reward delivery** — reaction triggers `POST /reward/egreedy` on `ml/serving` with inferred reward value.
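Step 3 above (score remotely, fall back to random) could be sketched as follows. The response shape of `POST /score/egreedy` is not documented here, so this example assumes the scorer returns one number per candidate and that the caller picks the highest; `pickTip`, `Scorer`, and the injectable `rng` are hypothetical.

```ts
interface TipCandidate {
  id: string;
  content: string;
}

// Hypothetical wrapper around the remote scoring call.
type Scorer = (candidates: TipCandidate[]) => Promise<number[]>;

async function pickTip(
  candidates: TipCandidate[],
  scoreRemote: Scorer,
  rng: () => number = Math.random,
): Promise<{ tip: TipCandidate; policy: string }> {
  if (candidates.length === 0) throw new Error("no candidates to choose from");
  try {
    const scores = await scoreRemote(candidates);
    // Assumption: a malformed response is treated like an unreachable scorer.
    if (scores.length !== candidates.length) throw new Error("score count mismatch");
    let best = 0;
    for (let i = 1; i < scores.length; i++) {
      if (scores[i] > scores[best]) best = i;
    }
    return { tip: candidates[best], policy: "egreedy-v1" };
  } catch {
    // Scorer unreachable or invalid: uniform random pick.
    const i = Math.floor(rng() * candidates.length);
    return { tip: candidates[i], policy: "random" };
  }
}
```

Passing `rng` as a parameter keeps the fallback deterministic in tests.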
|
||||||
- **Shadow hook** — every request optionally runs N shadow policies in parallel and logs their picks + estimated rewards. Promotion from shadow → A/B → launch is a separate, deliberate step (ADR-0002).
|
|
||||||
- **TipInstance persistence** — every decision writes `context_snapshot` (features seen at decision time). This is what makes offline replay honest.
|
|
||||||
|
|
||||||
## Phase 0 goal
|
## Signal normalization
|
||||||
|
|
||||||
`RandomPolicy` only. The service, contract, registry, shadow hook, and tip-instance persistence all exist; no ML yet.
|
Signals carry `features: Record<string, number | boolean>` (bandit-ready) and `metadata: Record<string, unknown>` (source-specific raw fields). The bandit treats features as an opaque dict — sources own their feature names. See ADR-0009.
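To make the `features` / `metadata` split concrete, here is a hedged sketch of normalizing a task into a signal. The `RawTask` shape, the `id`/`source`/`kind` fields, and every feature name (`priority`, `has_due_date`, `is_overdue`) are illustrative assumptions, not the names used by `TodoistSignalSource`.

```ts
interface Signal {
  id: string;
  source: string;
  kind: string;
  features: Record<string, number | boolean>; // numeric/boolean only, bandit-ready
  metadata: Record<string, unknown>; // raw, source-specific fields
}

// Hypothetical, simplified task shape (dueDate as "YYYY-MM-DD").
interface RawTask {
  id: string;
  content: string;
  priority: number;
  dueDate?: string;
}

function normalizeTask(task: RawTask, today: string): Signal {
  // ISO dates in YYYY-MM-DD form compare correctly as plain strings.
  const isOverdue = task.dueDate !== undefined && task.dueDate < today;
  return {
    id: `todoist:${task.id}`,
    source: "todoist",
    kind: "task",
    features: {
      priority: task.priority,
      has_due_date: task.dueDate !== undefined,
      is_overdue: isOverdue,
    },
    metadata: { content: task.content, dueDate: task.dueDate ?? null },
  };
}
```

Free-text fields such as the task content stay in `metadata`, so the feature dict remains purely numeric and boolean.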
|
||||||
|
|
||||||
|
## Policy registry
|
||||||
|
|
||||||
|
| Policy | Status | Notes |
|
||||||
|
|--------|--------|-------|
|
||||||
|
| `random` | Shadow | Fallback when ml/serving unreachable |
|
||||||
|
| `egreedy-v1` | **Active** | d=7, ADR-0007 |
|
||||||
|
| `egreedy-v2` | Shadow | d=12 + profile features, ADR-0012 |
|
||||||
|
|
||||||
|
Shadow → active promotion requires offline sim + online agreement (ADR-0002).
|
||||||
|
|
||||||
|
## Extraction criteria
|
||||||
|
|
||||||
|
Extract to its own process once it becomes a scaling hotspot: when `POST /api/recommend` p99 latency exceeds the SLA, or when recommendation CPU load displaces API serving on the shared host.
|
||||||
|
|||||||