oO/docs/adr/0009-signal-normalization.md
alvis e3ca3ba733 feat: SignalSource abstraction — generalize signal ingestion beyond Todoist (#78)
- Add Signal + SignalSource interfaces to packages/shared-types
- TipCandidate.features widened to Record<string,number|boolean> to match Signal
- TodoistSignalSource: encapsulates fetch, cache, 401 handling, bus events, and act()
- SignalAggregator: parallel fan-out across sources with per-source failure isolation
- Recommender refactored to consume Signal[] via aggregator; source action dispatch via aggregator.act()
- ADR-0009: signal normalization strategy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 01:11:56 +00:00


ADR-0009 — Signal normalization strategy

Status: Accepted
Date: 2026-04-18
Issue: #78

Context

The recommender was hard-wired to Todoist: task fetch, cache, and feature extraction lived inside recommender.ts with no abstraction boundary. Adding Google Calendar, Apple Health, or manual input sources would have required forking the pipeline per source.

Decision

Introduce two abstractions in packages/shared-types:

interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;   // raw source fields, not used by bandit
  features: Record<string, number | boolean>; // bandit-ready features
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

SignalAggregator calls all registered sources in parallel, isolating failures per source.
TodoistSignalSource moves all Todoist-specific logic (fetch, 401 handling, cache, bus events) out of the recommender route.
The recommender maps Signal[] → TipCandidate[] via a thin adapter and routes action dispatch through the aggregator.
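The parallel fan-out with per-source failure isolation could look roughly like the sketch below. It is an illustration under stated assumptions, not the code merged for #78: the `fetchAll` method name, the `onError` callback, and the `SourceErrorHandler` type are hypothetical, and only `Signal`, `SignalSource`, and `register` come from this ADR.

```typescript
// Interfaces copied from the Decision section so the sketch compiles on its own.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

// Hypothetical hook for reporting a failed source (logging, metrics, ...).
type SourceErrorHandler = (sourceId: string, error: unknown) => void;

// Hypothetical aggregator; fetchAll and onError are assumed names.
class SignalAggregator {
  private readonly sources = new Map<string, SignalSource>();

  constructor(private readonly onError: SourceErrorHandler = () => {}) {}

  register(source: SignalSource): void {
    this.sources.set(source.id, source);
  }

  // Start every fetch at once; a failing source is reported and yields no signals.
  async fetchAll(userId: string): Promise<Signal[]> {
    const pending = Array.from(this.sources.values()).map(async (source) => {
      try {
        return await source.fetchSignals(userId);
      } catch (error) {
        this.onError(source.id, error);
        return [] as Signal[];
      }
    });
    const perSource = await Promise.all(pending);
    const merged: Signal[] = [];
    for (const signals of perSource) {
      merged.push(...signals);
    }
    return merged;
  }
}
```

Catching inside each mapped promise keeps `Promise.all` from rejecting as a whole, so one broken connector only removes its own signals from the result.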

Consequences

Good:

  • Adding a new signal source is a single aggregator.register(new MySource()) call.
  • TipCandidate.features is now Record<string, number | boolean>, matching Signal.features. Sources control their own feature names; the bandit serialises them as-is.
  • Source failures are isolated: a broken Google Calendar connector does not prevent Todoist signals from reaching the bandit.
  • act() on the aggregator routes actions back to the owning source (e.g. marking a Todoist task done), replacing ad-hoc source-specific logic in the feedback handler.
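Registration and action routing might be sketched as follows. This is a minimal illustration, not the repository's implementation: the `act` signature on the aggregator, the `'complete'` action name, and the in-memory `MySource` are hypothetical, and the sketch assumes `Signal.source` holds the id of the source that produced the signal.

```typescript
// Interfaces copied from the Decision section so the sketch compiles on its own.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

// Hypothetical aggregator reduced to registration and action routing.
class SignalAggregator {
  private readonly sources = new Map<string, SignalSource>();

  register(source: SignalSource): void {
    this.sources.set(source.id, source);
  }

  // Route an action to the source that owns the signal.
  // Resolves to false when the source is unknown or does not implement act().
  async act(
    signal: Pick<Signal, 'id' | 'source'>,
    userId: string,
    action: string,
  ): Promise<boolean> {
    const owner = this.sources.get(signal.source);
    if (owner === undefined || owner.act === undefined) return false;
    await owner.act(userId, signal.id, action);
    return true;
  }
}

// Hypothetical in-memory source standing in for a real connector.
class MySource implements SignalSource {
  readonly id = 'my-source';
  readonly completed: string[] = [];

  async fetchSignals(_userId: string): Promise<Signal[]> {
    return [{
      id: 'a1',
      source: this.id,
      kind: 'task',
      content: 'Example task',
      metadata: {},
      features: {},
      timestamp: new Date().toISOString(),
    }];
  }

  // Records completed signal ids; a real source would call its upstream API here.
  async act(_userId: string, signalId: string, action: string): Promise<void> {
    if (action === 'complete') this.completed.push(signalId);
  }
}

const aggregator = new SignalAggregator();
aggregator.register(new MySource());
```

Because `act` is optional on `SignalSource`, the router has to treat read-only sources as a normal case rather than an error; returning a boolean is one way to surface that to the feedback handler.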

Trade-offs:

  • Feature names are no longer compile-time typed. Convention: sources document their feature keys in their class JSDoc. The Python bandit already treated features as an opaque dict.
  • Each source is responsible for its own token lookup (DB access injected via module-level db). This is acceptable in a modular monolith; extract to a token vault interface if sources move to separate processes.
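Both trade-offs can be illustrated with one hypothetical source. In the sketch below, the class JSDoc lists the feature keys the source emits, and token lookup goes through a `TokenVault` interface of the kind the second trade-off suggests extracting later. Everything named here is an assumption made for illustration: `TokenVault`, `RawEvent`, `CalendarSignalSource`, the injected `listEvents` client, and the feature keys `minutes_until_start` and `has_attendees` do not appear in the codebase described by this ADR.

```typescript
// Interfaces copied from the Decision section so the sketch compiles on its own.
interface Signal {
  id: string;
  source: string;
  kind: 'task' | 'event' | 'habit' | 'insight';
  content: string;
  metadata: Record<string, unknown>;
  features: Record<string, number | boolean>;
  timestamp: string;
}

interface SignalSource {
  readonly id: string;
  fetchSignals(userId: string): Promise<Signal[]>;
  act?(userId: string, signalId: string, action: string): Promise<void>;
}

// Hypothetical vault: a possible extraction target for per-source token lookup.
interface TokenVault {
  getToken(userId: string, sourceId: string): Promise<string | undefined>;
}

// Hypothetical raw event shape returned by an injected calendar client.
interface RawEvent {
  id: string;
  title: string;
  start: string; // ISO 8601 timestamp
  attendeeCount: number;
}

/**
 * Hypothetical calendar source, used only to illustrate the conventions.
 *
 * Feature keys (documented here because they are not compile-time typed):
 * - `minutes_until_start` (number): whole minutes from now to the event start
 * - `has_attendees` (boolean): true when at least one attendee is invited
 */
class CalendarSignalSource implements SignalSource {
  readonly id = 'calendar';

  constructor(
    private readonly vault: TokenVault,
    private readonly listEvents: (token: string) => Promise<RawEvent[]>,
    private readonly now: () => number = () => Date.now(),
  ) {}

  async fetchSignals(userId: string): Promise<Signal[]> {
    const token = await this.vault.getToken(userId, this.id);
    if (token === undefined) return []; // user has not connected this source
    const events = await this.listEvents(token);
    const nowMs = this.now();
    return events.map((event): Signal => ({
      id: event.id,
      source: this.id,
      kind: 'event',
      content: event.title,
      metadata: { start: event.start, attendeeCount: event.attendeeCount },
      features: {
        minutes_until_start: Math.round((Date.parse(event.start) - nowMs) / 60000),
        has_attendees: event.attendeeCount > 0,
      },
      timestamp: event.start,
    }));
  }
}
```

Passing the vault and the clock through the constructor keeps the source testable without a database; in the current modular monolith the same role is played by the module-level db.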

Alternatives considered

Typed feature schema per source kind — rejected: would require union types across all sources and a discriminant on every consumer. The bandit doesn't benefit from TypeScript types at runtime.
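To make the cost of the rejected alternative concrete, the sketch below contrasts a consumer written against a per-kind union with one written against the opaque record. The feature names and the numeric encoding are assumptions chosen for illustration; the ADR only states that the bandit treats features as an opaque dict.

```typescript
// Rejected design (illustrative): one feature type per kind, joined in a union.
type TaskFeatures = { kind: 'task'; priority: number; is_overdue: boolean };
type EventFeatures = { kind: 'event'; minutes_until_start: number };
type TypedFeatures = TaskFeatures | EventFeatures;

// Every consumer has to branch on the discriminant, and every new kind
// means touching every such switch.
function toVectorTyped(f: TypedFeatures): Record<string, number> {
  switch (f.kind) {
    case 'task':
      return { priority: f.priority, is_overdue: f.is_overdue ? 1 : 0 };
    case 'event':
      return { minutes_until_start: f.minutes_until_start };
  }
}

// Accepted design: features are an opaque record, so one loop covers every source.
function toVectorOpaque(
  features: Record<string, number | boolean>,
): Record<string, number> {
  const out: Record<string, number> = {};
  for (const key of Object.keys(features)) {
    const value = features[key];
    out[key] = typeof value === 'boolean' ? (value ? 1 : 0) : value;
  }
  return out;
}
```

The opaque version needs no change when a new source introduces new keys, which is the property the decision trades compile-time checking for.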

Aggregator holds tokens, passes to sources — rejected: leaks auth concerns into the aggregator. Sources know their own auth requirements.