feat: SignalSource abstraction — generalize signal ingestion beyond Todoist (#78)

- Add Signal + SignalSource interfaces to packages/shared-types
- TipCandidate.features widened to Record<string,number|boolean> to match Signal
- TodoistSignalSource: encapsulates fetch, cache, 401 handling, bus events, and act()
- SignalAggregator: parallel fan-out across sources with per-source failure isolation
- Recommender refactored to consume Signal[] via aggregator; source action dispatch via aggregator.act()
- ADR-0009: signal normalization strategy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:11:56 +00:00
parent 46dee7377e
commit e3ca3ba733
8 changed files with 289 additions and 122 deletions


@@ -0,0 +1,53 @@
# ADR-0009 — Signal normalization strategy
**Status:** Accepted
**Date:** 2026-04-18
**Issue:** #78
## Context
The recommender was hard-wired to Todoist: task fetch, cache, and feature extraction lived inside `recommender.ts` with no abstraction boundary. Adding Google Calendar, Apple Health, or manual input sources would have required forking the pipeline per source.
## Decision
Introduce two abstractions in `packages/shared-types`:
```typescript
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;          // raw source fields, not used by bandit
  features: Record<string, number | boolean>; // bandit-ready features
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}
```
`SignalAggregator` calls all registered sources in parallel, isolating failures per source.
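A minimal sketch of the fan-out, assuming the aggregator keeps sources in a map keyed by `id` and treats a failed source as contributing zero signals. `SketchAggregator`, `fetchAll`, and the `console.warn` logging are hypothetical names for illustration; only `register`, `Signal`, and `SignalSource` come from this ADR, and the `SignalAggregator` in this commit may be implemented differently.

```typescript
// Types as defined in this ADR.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

// Hypothetical sketch, not the class shipped in this commit.
class SketchAggregator {
  private readonly sources = new Map<string, SignalSource>();

  register(source: SignalSource): void {
    this.sources.set(source.id, source);
  }

  // Start every fetch before awaiting any of them, so sources run in parallel.
  async fetchAll(userId: string): Promise<Signal[]> {
    const pending = Array.from(this.sources.values()).map(
      async (source): Promise<Signal[]> => {
        try {
          return await source.fetchSignals(userId);
        } catch (err) {
          // Failure isolation: a broken source yields no signals
          // instead of rejecting the whole batch.
          console.warn(`signal source "${source.id}" failed`, err);
          return [];
        }
      },
    );
    const perSource = await Promise.all(pending);
    return perSource.reduce<Signal[]>((all, batch) => all.concat(batch), []);
  }
}
```

Catching inside each per-source promise, rather than around `Promise.all`, is what keeps one rejection from discarding the results of the other sources.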
`TodoistSignalSource` moves all Todoist-specific logic (fetch, 401 handling, cache, bus events) out of the recommender route.
The recommender maps `Signal[]` to `TipCandidate[]` via a thin adapter and dispatches source actions through `aggregator.act()`.
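The adapter can be very small. The sketch below assumes a `TipCandidate` with `id`, `text`, and `features` fields; only `features` is confirmed by this ADR, so the other field names, the `source:id` identifier scheme, and the `toTipCandidate` helpers are hypothetical.

```typescript
// Type as defined in this ADR.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

// Hypothetical shape: only `features` is described in this ADR.
interface TipCandidate {
  id: string;
  text: string;
  features: Record<string, number | boolean>;
}

// Features are copied unchanged; `metadata` is dropped because the
// bandit does not use it.
function toTipCandidate(signal: Signal): TipCandidate {
  return {
    // Assumed id scheme: prefixing with the source avoids clashes
    // between sources that reuse the same local ids.
    id: `${signal.source}:${signal.id}`,
    text: signal.content,
    features: { ...signal.features },
  };
}

function toTipCandidates(signals: Signal[]): TipCandidate[] {
  return signals.map(toTipCandidate);
}
```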
## Consequences
**Good:**
- Adding a new signal source is a single `aggregator.register(new MySource())` call.
- `TipCandidate.features` is now `Record<string, number | boolean>`, matching `Signal.features`. Sources control their own feature names; the bandit serialises them as-is.
- Source failures are isolated: a broken Google Calendar connector does not prevent Todoist signals from reaching the bandit.
- `act()` on the aggregator routes actions back to the owning source (e.g. marking a Todoist task done), replacing ad-hoc source-specific logic in the feedback handler.
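Routing an action back to its owning source might look like the following. This is a sketch under assumptions: the real `aggregator.act()` signature is not shown in this ADR, so the `sourceId` parameter (taken here from `Signal.source`), the boolean return value, and the `SketchActionRouter` name are all hypothetical.

```typescript
// Types as defined in this ADR.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

// Hypothetical sketch of the routing half of the aggregator.
class SketchActionRouter {
  private readonly sources = new Map<string, SignalSource>();

  register(source: SignalSource): void {
    this.sources.set(source.id, source);
  }

  // Resolves to true when the owning source handled the action, and to
  // false when the source is unknown or read-only (no act() implemented).
  async act(
    userId: string,
    sourceId: string,
    signalId: string,
    action: string,
  ): Promise<boolean> {
    const source = this.sources.get(sourceId);
    if (!source || !source.act) return false;
    await source.act(userId, signalId, action);
    return true;
  }
}
```

Because `act` is optional on `SignalSource`, read-only sources need no stub method; the router checks for its presence instead.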
**Trade-offs:**
- Feature names are no longer compile-time typed. Convention: sources document their feature keys in their class JSDoc. The Python bandit already treated features as an opaque dict.
- Each source is responsible for its own token lookup (DB access injected via module-level `db`). This is acceptable in a modular monolith; extract to a token vault interface if sources move to separate processes.
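The JSDoc convention for feature keys could look like this. `ManualNoteSource`, its constructor argument, and the `note_length` and `is_pinned` keys are hypothetical, chosen only to illustrate the convention; they are not part of this commit.

```typescript
// Types as defined in this ADR.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

interface Note {
  id: string;
  text: string;
  pinned: boolean;
}

/**
 * Hypothetical in-memory source, used only to illustrate the convention.
 *
 * Feature keys emitted by this source:
 * - `note_length` (number): character count of the note text
 * - `is_pinned` (boolean): whether the user pinned the note
 */
class ManualNoteSource implements SignalSource {
  readonly id = 'manual-note';
  private readonly notes: Note[];

  constructor(notes: Note[]) {
    this.notes = notes;
  }

  // This sketch ignores the user id and returns every note it was given.
  async fetchSignals(_userId: string): Promise<Signal[]> {
    return this.notes.map((note): Signal => ({
      id: note.id,
      source: this.id,
      kind: 'insight',
      content: note.text,
      metadata: { pinned: note.pinned },
      features: { note_length: note.text.length, is_pinned: note.pinned },
      timestamp: new Date().toISOString(),
    }));
  }
}
```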
## Alternatives considered
**Typed feature schema per source kind** — rejected: would require union types across all sources and a discriminant on every consumer. The bandit doesn't benefit from TypeScript types at runtime.
**Aggregator holds tokens, passes to sources** — rejected: leaks auth concerns into the aggregator. Sources know their own auth requirements.