refactor: architecture revision — modular monolith, auth-commit, event protobuf, privacy-from-day-0

- ADR-0003: modular monolith for Phase 0 with documented extraction triggers
- ADR-0004: Auth.js + OIDC-shaped boundary; dedicated provider when mobile ships
- ADR-0005: protobuf for events, OpenAPI for HTTP, schema-registry CI gate
- New architecture docs: data-model, metrics (magic proxies), privacy (Phase-0 feature)
- Prime directives updated: privacy-as-feature, modular-by-package-deployable-by-stage
- Roadmap revised: Apple OAuth deferred to M1; web push in M1; k3s intermediate; tip-kind-aware UI
- PLAN updated: Phase-0 deletion endpoint, metrics baseline, compose profiles, import-boundary lint
- License decision in README (ARR with OSS plan in Phase 5)
2026-04-13 14:36:11 +00:00
parent cf4c7a0eb4
commit 7f173f88d3
13 changed files with 449 additions and 133 deletions


@@ -8,66 +8,73 @@ The magic is the product. Precision + timing + minimalism. The UI shows a single
## Prime directives
1. **Modular, service-oriented from day one.** Even the prototype. We will scale to mobile (iOS/Android), many integrations, multi-tenant ML. Shortcuts that bake in a monolith are not acceptable.
2. **Recommendation engine is the core.** Every other service feeds it or renders its output. Design schemas, event contracts, and APIs with that in mind.
3. **Python owns ML.** Everything training, features, serving for models is Python (FastAPI + PyTorch/scikit + MLflow/feast). Application services are TypeScript (Node, Next.js) unless there's a reason.
1. **Modular by package, deployable by stage.** Contracts live at package boundaries from day one so extraction to a service is cheap. Deploy topology evolves with real pressure (team size, scaling hotspots, language boundaries), not with wishful architecture. Phase 0 = **modular monolith + Python ML sidecar**. See ADR-0003.
2. **Recommendation engine is the core.** Every other module feeds it or renders its output. Design schemas, event contracts, and APIs with that in mind.
3. **Python owns ML.** Training, features, online scoring are Python (FastAPI + PyTorch/scikit + MLflow/Feast). Application code is TypeScript (Node, Next.js) unless there's a reason.
4. **OAuth-first for identity and integrations.** Never ask users for passwords or raw API keys when a delegated-auth flow exists. Store provider tokens encrypted, refresh transparently.
5. **Feel-of-magic over feature count.** When in doubt, ship fewer things, polished.
5. **Privacy is a feature, not a phase.** Consent capture, token revocation, and account deletion exist from the first real user. Data minimization: store the token + derivatives we need, not the raw feed.
6. **Feel-of-magic over feature count.** When in doubt, ship fewer things, polished. The tip page is a watch face.
## Architecture (high level)
The tree below is **logical module structure**. Directory layout is stable; how many processes you deploy is a stage decision (ADR-0003).
```
apps/                user-facing clients
  web/               Next.js PWA — the first shipped client
  mobile-ios/        Swift/SwiftUI (Phase 3)
  mobile-android/    Kotlin/Compose (Phase 3)
services/            backend microservices (each independently deployable)
  gateway/           API gateway + BFF (GraphQL or tRPC)
services/            backend modules: each owns a contract; may share a deployable
  gateway/           BFF for clients; auth check; fan-out
  auth/              OAuth (Google, Apple, ...), sessions, JWT issuance
  profile/           user profile, preferences, consents
  integrations/      third-party connectors (Todoist first); token vault
  recommender/       Python; serves the "one best tip" decision
  events/            event bus ingress (Kafka/NATS) + signal store
  notifier/          push/email/web delivery of tips
  integrations/      third-party connectors + token vault (Todoist first)
  recommender/       orchestration: candidates → policy → tip; feedback sink
  events/            event bus ingress + durable signal store
  notifier/          push/email/web delivery (web push from Phase 1)
packages/            shared libraries
  shared-types/      OpenAPI/proto-generated types
packages/            shared libraries (importable across services + apps)
  shared-types/      HTTP types via OpenAPI; event types via protobuf (ADR-0005)
  sdk-js/            client SDK used by web + mobile webviews
  ui/                shared React components + design tokens
ml/                  Python MLOps
  pipelines/         training / batch feature pipelines (Airflow/Prefect)
  features/          feature definitions (Feast-style)
  registry/          model registry (MLflow) integration
  experiments/       A/B testing framework + bandit policies
  serving/           online inference service (FastAPI)
  notebooks/         research only — not production
ml/                  Python — separate deployable from day one
  serving/           online scorer (FastAPI), called by recommender
  features/          feature definitions + store adapter
  pipelines/         batch feature + training DAGs (Prefect/Airflow)
  registry/          MLflow model registry integration
  experiments/       assignment + A/B + bandit policies
  notebooks/         research only; never imported by production code
infra/               docker-compose, k8s manifests, terraform, CI
infra/               docker-compose (Phase 0), k3s/k8s (later), terraform, CI
docs/                architecture notes, ADRs, API specs
```
## Contracts between services
**Phase 0 deployables:** one Node process (`services/*` bundled as a modular monolith) + one Python process (`ml/serving`, stubbed until M1) + Postgres + NATS. Services **extract to their own process** when a real reason appears: language boundary, scaling hotspot, team ownership, or SLA divergence. See ADR-0003.
- **Events** (Kafka/NATS) — source of truth for user signals. All integrations emit normalized events; the recommender reads them.
- **HTTP/gRPC** — synchronous request/response (gateway → services).
- **Shared schemas** live in `packages/shared-types`; generated from a single OpenAPI / proto source. Do not redefine types per service.
## Contracts between modules
- **HTTP** (OpenAPI, in `packages/shared-types/http/`) — synchronous request/response. In-process today; over the network once extracted. Signatures are identical.
- **Events** (Protocol Buffers, in `packages/shared-types/events/`) — durable signals + feedback. Today: in-process event emitter. Tomorrow: NATS JetStream. Schema registry enforced in CI (ADR-0005).
- Do not redefine types per module. Regenerate from `shared-types`.
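One way to keep signatures identical across transports is to have callers depend on a contract type and pick the transport at wiring time. A minimal sketch under stated assumptions: `RecommenderContract`, `inProcess`, `overHttp`, and the injected `post` function are hypothetical names for illustration, not code from this repo.

```typescript
// Hypothetical contract types; in the repo these would be generated from the
// OpenAPI source in packages/shared-types/http/.
interface RecommendRequest { userId: string }
interface RecommendResponse { tip: { id: string; title: string } }

// Callers depend on this interface only, never on a transport.
interface RecommenderContract {
  recommend(req: RecommendRequest): Promise<RecommendResponse>;
}

type Handler = (req: RecommendRequest) => Promise<RecommendResponse>;

// Phase 0 wiring: call the module's handler directly, in the same process.
function inProcess(handler: Handler): RecommenderContract {
  return { recommend: (req) => handler(req) };
}

// Post-extraction wiring: same contract, but the call crosses the network.
// `post` is injected so the sketch does not depend on any HTTP library.
type Post = (url: string, jsonBody: string) => Promise<string>;

function overHttp(baseUrl: string, post: Post): RecommenderContract {
  return {
    recommend: async (req) => {
      const raw = await post(`${baseUrl}/recommend`, JSON.stringify(req));
      return JSON.parse(raw) as RecommendResponse;
    },
  };
}
```

A caller that holds a `RecommenderContract` cannot tell which wiring it received, so extraction changes only the composition root.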
## Conventions
- Every service ships a `README.md`, a `Dockerfile`, and a `/health` endpoint.
- One PR = one concern. Commits follow conventional-commit prefixes (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`).
- Each module ships a `README.md` describing its contract, its `/health` story, and its extraction criteria (when it should become its own process).
- One PR = one concern. Conventional-commit prefixes (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`).
- ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
- No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
- Compose profiles (`core`, `full`) so devs can run a subset without 16 GB of RAM.
## Definition of done (per feature)
1. Code + tests merged.
2. Service's `README.md` updated.
2. Module's `README.md` updated.
3. If it changes a contract → `shared-types` regenerated + consumers updated.
4. If it changes architecture → ADR added.
5. Deployable via `docker compose up` locally.
6. If it touches user data → a deletion path exists and is tested.
## Current phase
@@ -75,7 +82,9 @@ docs/ architecture notes, ADRs, API specs
## What NOT to do
- Don't copy Todoist's data into our DB. Store the OAuth token; fetch on demand.
- Don't implement auth by hand. Use a library (NextAuth / Auth.js, Ory, or Clerk-compatible). We will self-host.
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
- Don't implement auth by hand. Phase 0 uses **Auth.js** behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
- Don't hardwire a recommender. The "random todo" v0 must live behind the same interface the real ML model will implement (`POST /recommend` → `{tip}`). Swap internals, keep contract.
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
- Don't build an admin UI before the user-facing black page is polished.
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).

PLAN.md

@@ -1,71 +1,85 @@
# Implementation plan
Step-by-step build order for Phase 0 (prototype) and the seams that make Phases 1–5 cheap.
Step-by-step build order for Phase 0 (walking skeleton) and the seams that make Phases 1–5 cheap.
The principle: **build the contracts first, stub the internals.** Every service should exist with a `/health` endpoint and a minimal real implementation of its interface before any service is "finished". This gives us an end-to-end walking skeleton from week one.
The principle: **build the contracts first, stub the internals.** Every module exposes its contract and a `/health` story before any module is "finished". End-to-end walking skeleton in the first week.
**Packaging reminder (ADR-0003):** Phase 0 is a modular monolith — one Node process bundles `services/*` behind their HTTP contracts, plus `ml/serving` as a separate Python process. Contracts are identical whether the call is in-process or over the wire.
---
## Stage 0 — Foundations (days 1–3)
1. **Monorepo tooling.** pnpm workspaces for JS/TS; uv or poetry for Python; turbo or nx for build graph; pre-commit (lint, typecheck, format).
2. **Docker Compose dev env.** Postgres, NATS, MinIO (S3), Mailhog, all services wired with hot-reload.
3. **CI skeleton** (Gitea Actions): lint → typecheck → unit test → build → publish images.
4. **Secrets convention.** `.env.example` per service; prod secrets injected by orchestrator.
5. **Shared types package.** OpenAPI source → generated TS + Python clients.
1. **Monorepo tooling.** pnpm workspaces for TS; uv for Python; turbo for build graph; pre-commit (eslint, prettier, ruff, mypy, typecheck).
2. **Docker Compose dev env** with profiles:
- `core` — Node monolith + `ml/serving` stub + Postgres.
- `full` — adds NATS, MinIO, MailHog. Needed from Stage 4 onward.
3. **CI skeleton** (Gitea Actions): lint → typecheck → unit → build → publish images. The schema-registry check for protobuf events is enforced from Phase 1; the pipeline step is stubbed now.
4. **Secrets convention.** `.env.example` per module; prod injected by orchestrator.
5. **Shared types.** OpenAPI for HTTP, protobuf for events (ADR-0005). Generate TS; the Python pydantic models for the HTTP contracts are hand-written initially (few consumers).
6. **Import-boundary lint.** `eslint-plugin-boundaries` (or equivalent) prevents `services/integrations` from importing `services/recommender` internals. Contracts-only.
Deliverable: `docker compose up` brings a green dashboard of `/health` endpoints.
Exit: `docker compose --profile core up` brings a green dashboard of `/health` endpoints.
## Stage 1 — Identity & session (days 4–7)
1. `services/auth`: Google OAuth2 (PKCE), session cookies, short-lived JWTs, refresh rotation. Library-backed (Auth.js or Ory Kratos + Hydra) — we do not roll our own.
2. `services/profile`: minimal `User` record; created on first sign-in.
3. `apps/web` sign-in page; gateway verifies JWT.
1. `services/auth` module: Auth.js embedded in the Node monolith, Google provider only (Apple deferred). OIDC-shaped surface (ADR-0004): `/me`, `/logout`, JWKS, stub `/.well-known/openid-configuration`.
2. `services/profile` module: `User` row created on first sign-in; consent record captured with ToS/PP version hash.
3. `apps/web` sign-in page. Gateway (also in-process) verifies JWT.
4. **Deletion endpoint** (yes, already): `DELETE /me` — revokes sessions, flips `deleted_at`, emits `user.deletion_requested`.
Exit check: a user can sign in and fetch their own profile.
Exit: a user can sign in, see their profile, and delete their account; deletion is observable end-to-end even though there's no data to erase yet.
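The three effects of `DELETE /me` (revoke sessions, flip `deleted_at`, emit `user.deletion_requested`) can be sketched as one idempotent function. This is an illustration under assumptions: the in-memory rows and the `deleteMe` name are hypothetical stand-ins for the real tables and handler.

```typescript
interface UserRow { id: string; deletedAt: Date | null }
interface SessionRow { sid: string; userId: string; revokedAt: Date | null }
interface DeletionRequested {
  kind: "user.deletion_requested";
  userId: string;
  occurredAt: Date;
}

// Hypothetical handler body for DELETE /me. In-memory rows stand in for the
// auth and profile tables; `emit` stands in for the event bus.
function deleteMe(
  userId: string,
  users: Map<string, UserRow>,
  sessions: SessionRow[],
  emit: (event: DeletionRequested) => void,
  now: Date,
): void {
  const user = users.get(userId);
  // Idempotent: unknown or already-deleted users produce no second event.
  if (user === undefined || user.deletedAt !== null) return;

  // 1. Revoke every live session that belongs to this user.
  for (const s of sessions) {
    if (s.userId === userId && s.revokedAt === null) s.revokedAt = now;
  }
  // 2. Soft-delete the user row.
  user.deletedAt = now;
  // 3. Tell every other module to run its own cleanup.
  emit({ kind: "user.deletion_requested", userId, occurredAt: now });
}
```

Making the call idempotent means a retried request does not emit a second deletion event.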
## Stage 2 — Integrations framework (days 8–12)
1. `services/integrations` with a **Connector** interface:
- `begin_oauth(user) → redirect_url`
- `finish_oauth(code, state) → StoredCredential`
- `fetch_signals(user, since) → Event[]`
2. **Token vault**: column-level encryption (libsodium), key from env or KMS.
3. **Todoist connector** as the first concrete implementation.
4. Web "Connect" page: list of connectors, button per connector, callback handling.
1. `services/integrations` module with a **Connector** interface:
- `beginOAuth(user) → {redirectUrl, state}`
- `finishOAuth(code, state) → StoredCredential`
- `fetchSignals(user, since?) → AsyncIterable<NormalizedEvent>`
- `act?(user, action) → void`
- `revoke(user) → void` — first-class; no revocation means no disconnect.
2. **Token vault**: libsodium sealed box, key from env/KMS. One row per `(user, provider)` with provider-specific `meta` (e.g. Todoist `sync_token`).
3. **Todoist connector**: OAuth2, Sync API incremental reads via `sync_token`, `act` to complete a task, `revoke` calls Todoist's token-revocation endpoint.
4. Web `/connect`: list of connectors, per-connector consent screen (scopes + retention), connect/disconnect.
Exit check: a user taps "Connect Todoist", completes the OAuth dance, and the integrations service can fetch their tasks on demand.
Exit: a user can connect and disconnect Todoist; disconnect revokes at Todoist and wipes local credentials.
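The Connector operations and the disconnect rule above could be typed roughly as follows. This is a sketch, not the repo's interface: the field shapes of `NormalizedEvent`, `StoredCredential`, and `ConnectorAction`, the `Vault` map, and the `disconnect` helper are all assumptions made for illustration.

```typescript
interface NormalizedEvent { kind: string; userId: string; occurredAt: Date }
interface StoredCredential { userId: string; provider: string; ciphertext: string }
interface ConnectorAction { type: "complete"; targetId: string }

// Mirrors the five operations listed in the plan. A real implementation would
// likely return Promises; sync returns keep this sketch short (assumption).
interface Connector {
  beginOAuth(userId: string): { redirectUrl: string; state: string };
  finishOAuth(code: string, state: string): StoredCredential;
  fetchSignals(userId: string, since?: Date): AsyncIterable<NormalizedEvent>;
  act?(userId: string, action: ConnectorAction): void;
  revoke(userId: string): void;
}

// Hypothetical vault keyed by "userId:provider", standing in for the
// encrypted credential table.
type Vault = Map<string, StoredCredential>;

// Disconnect = revoke at the provider first, then wipe the local credential.
// If revocation throws, the credential is kept so the call can be retried
// (a choice made for this sketch, not a documented rule).
function disconnect(
  connector: Connector,
  vault: Vault,
  userId: string,
  provider: string,
): void {
  connector.revoke(userId);
  vault.delete(`${userId}:${provider}`);
}
```

Revoking before wiping keeps the failure mode recoverable: a provider outage leaves a credential that can still be revoked later, rather than an orphaned grant on the provider side.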
## Stage 3 — Recommender contract (days 13–16)
1. `services/recommender` exposes `POST /recommend {user_id, context} → {tip}`.
2. Policy interface (`Policy.pick(user, candidates, context) → tip`).
3. **`RandomPolicy` v0** — fetches candidates from `integrations` (Todoist tasks), returns one uniformly at random.
4. Tip shape is provider-agnostic: `{id, kind: "todo"|"advice", title, body, source, deep_link, meta}`.
5. `apps/web` tip page: full black, one tip centered, tap = mark done → callback fires to integrations (complete Todoist task) + emits a feedback event.
1. `services/recommender` module exposes `POST /recommend` and `POST /feedback`.
2. **Policy registry** keyed by name. **Candidate sources** registered independently; v0 source = `integrations.todoist.tasks`.
3. **`RandomPolicy` v0** — draws uniformly.
4. **Tip shape** provider-agnostic: `{id, kind: "todo"|"advice", title, body, source, deep_link, meta}`.
5. **`TipInstance` persisted** with `context_snapshot` — the features-seen-at-decision-time blob that makes offline replay possible later.
6. `apps/web` tip page:
- `kind=todo` → tap = done (calls `integrations.todoist.act(complete)`).
- `kind=advice` → tap = acknowledge; long-press = save.
- Snooze / dismiss via long-press menu regardless of kind.
- Every reaction emits a feedback event even though it's in-process today.
Exit check: three-page prototype works end-to-end for one user.
Exit: three-page prototype works end-to-end.
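A policy registry keyed by name, with `RandomPolicy` as its first entry, might look like the sketch below. The `Tip` fields follow the shape given in this stage; `Policy`, `PolicyRegistry`, `randomPolicy`, and the injected `rng` parameter are assumptions for illustration.

```typescript
type TipKind = "todo" | "advice";

interface Tip {
  id: string;
  kind: TipKind;
  title: string;
  body: string;
  source: string;
  deep_link: string;
  meta: Record<string, unknown>;
}

// Every policy, random or learned, implements the same pick() signature.
interface Policy {
  name: string;
  pick(candidates: Tip[], context: Record<string, unknown>): Tip | null;
}

// v0: uniform draw. `rng` is injectable so tests can be deterministic.
function randomPolicy(rng: () => number = Math.random): Policy {
  return {
    name: "random",
    pick: (candidates) => {
      if (candidates.length === 0) return null;
      const index = Math.floor(rng() * candidates.length);
      return candidates[index] ?? null;
    },
  };
}

class PolicyRegistry {
  private policies = new Map<string, Policy>();

  register(policy: Policy): void {
    this.policies.set(policy.name, policy);
  }

  // Fails loudly on an unknown name instead of silently falling back.
  get(name: string): Policy {
    const policy = this.policies.get(name);
    if (policy === undefined) throw new Error(`unknown policy: ${name}`);
    return policy;
  }
}
```

Because callers resolve policies by name, replacing `"random"` with a learned policy is a registration change rather than a change at the call site.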
## Stage 4 — Hardening the prototype (days 17–20)
## Stage 4 — Hardening (days 17–20)
1. Error surfaces (Sentry), structured logs (pino / structlog), trace IDs across services.
2. Rate limits + retries on outbound API calls.
3. Integration tests: Playwright for the web flow, pact-style contract tests between services.
4. Deploy to a single VM via docker-compose + Caddy.
1. Observability: pino + structlog, Sentry per module, W3C traceparent across the monolith boundary and into `ml/serving`.
2. Rate limits, retries with jitter, and circuit breakers on outbound (Todoist, Google).
3. Integration tests: Playwright for the web flow (sign-in → connect → tip → delete). Contract tests between modules so the extractions later are safe.
4. **Metrics baseline wired** (`docs/architecture/metrics.md`): activation, first-tip reaction, dwell, snooze:dismiss ratio, D1 retention.
5. Deploy to a single VM via docker-compose + Caddy; Caddy auto-TLS; healthchecks wired to Caddy.
Exit check: Phase 0 milestone closed.
Exit: Phase 0 milestone closed; real users can be onboarded.
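"Retries with jitter, and circuit breakers" from the hardening list can be sketched without any library. This is an illustration only: it uses the common "full jitter" backoff variant and a consecutive-failure breaker, and every name and default (`backoffDelayMs`, `CircuitBreaker`, 200 ms base, 10 s cap) is an assumption rather than a project setting.

```typescript
// Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2^attempt)).
// `rng` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 200,
  capMs: number = 10000,
  rng: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rng() * ceiling);
}

// Consecutive-failure circuit breaker. Time is passed in explicitly so the
// class has no hidden clock dependency.
class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private threshold: number;
  private cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // Closed: always allowed. Open: allowed again only after the cooldown,
  // which acts as a half-open probe.
  canCall(nowMs: number): boolean {
    if (this.openedAtMs === null) return true;
    return nowMs - this.openedAtMs >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAtMs = null;
  }

  // Opens (or re-opens after a failed probe) once the threshold is reached.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAtMs = nowMs;
  }
}
```

Jitter spreads retries from many clients so they do not hit Todoist or Google in lockstep; the breaker stops retrying altogether while the provider is clearly down.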
---
## Seams prepared for later phases (do not implement yet, but do not foreclose)
## Seams prepared for later phases (designed now, implemented later)
- **Event bus.** From day one, `integrations` and `recommender` speak through an async fn that today is an in-process call but will be NATS tomorrow. Keep the signature `(event: NormalizedEvent) → void`.
- **Feature store.** The recommender accepts a `context` blob; later, a feature service fills it. Do not inline feature lookups inside the policy.
- **Policy registry.** `PolicyFactory.get(name)` so A/B and bandit policies slot in without code changes to the gateway.
- **Python boundary.** Recommender is TS today, but its scoring function is isolated — moving to FastAPI in Phase 1 is a file move, not a refactor.
- **Event bus abstraction.** `emit(event)` / `subscribe(topic, handler)` today is in-process; the production implementation in Phase 1 is NATS JetStream. Callsites never change.
- **Feature assembler.** Recommender accepts a `context` blob from a `FeatureAssembler`; in Phase 0 it returns a hard-coded minimum; in Phase 1 it calls the feature store.
- **Shadow-policy hook.** The recommender already supports running N policies in shadow per request; v0 runs zero shadows but the hook exists.
- **Extraction-ready modules.** Every `services/*/` has a `serve.ts` that can be mounted in the monolith or booted standalone. Dockerfile targets both.
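The event-bus seam can be sketched as an interface with an in-process implementation. The `emit` / `subscribe` names come from the list above; the `BusEvent` shape and the `InProcessBus` class are assumptions for illustration, and a NATS JetStream implementation would sit behind the same interface.

```typescript
// Hypothetical event shape; the real one would be a generated protobuf type.
interface BusEvent { topic: string; eventId: string; payload: unknown }

type BusHandler = (event: BusEvent) => void;

// The seam: callsites depend on this interface, not on a transport.
interface EventBus {
  emit(event: BusEvent): void;
  subscribe(topic: string, handler: BusHandler): void;
}

// Phase 0 implementation: synchronous, in-process fan-out by topic.
class InProcessBus implements EventBus {
  private handlers = new Map<string, BusHandler[]>();

  emit(event: BusEvent): void {
    // Copy the list so a handler that subscribes during delivery
    // does not affect the current fan-out.
    const list = (this.handlers.get(event.topic) ?? []).slice();
    for (const handler of list) handler(event);
  }

  subscribe(topic: string, handler: BusHandler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }
}
```

This sketch delivers synchronously and lets a throwing handler abort the fan-out; a durable implementation would differ on both points, which is one reason to keep callsites unaware of the implementation.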
---
## Staffing assumption
Work is parallelizable across ~3 streams: **infra/platform**, **backend services**, **web app**. Each Gitea issue notes which stream and which phase (milestone) it belongs to.
Three parallel streams: **platform** (infra, CI, shared-types), **backend** (auth, profile, integrations, recommender), **web** (sign-in, connect, tip, PWA). `ml` joins in Phase 1. Each Gitea issue carries its stream label and milestone.


@@ -69,48 +69,59 @@ docs/ architecture, adr, api
## Roadmap
### Phase 0 — Prototype *(M0)*
Goal: a single user can sign in, connect Todoist, and see one random Todoist task on a black page.
- [ ] Monorepo scaffold, CI skeleton, docker-compose dev env
- [ ] `auth` service with Google OAuth
- [ ] `integrations/todoist` OAuth2 flow + encrypted token vault
- [ ] `recommender` service with `RandomPolicy` (v0)
- [ ] `apps/web` — three pages (sign-in, connect, tip)
- [ ] Deploy to a single VM via docker-compose
### Phase 0 — Walking skeleton *(M0)*
Goal: a single user signs in with Google, connects Todoist, and sees one random Todoist task on a black page. Deletion works.
- [ ] Monorepo scaffold, CI skeleton, docker-compose dev env with `core`/`full` profiles
- [ ] `auth` on Auth.js with Google provider; OIDC-shaped boundary (ADR-0004)
- [ ] `integrations/todoist` OAuth2 flow + encrypted token vault + provider-side revocation
- [ ] `recommender` with `RandomPolicy`; stable `POST /recommend` contract
- [ ] `apps/web` — three pages (sign-in, connect, tip); PWA manifest; offline reaction queue
- [ ] ToS + Privacy Policy + consent capture on first sign-in
- [ ] Account-deletion endpoint: revokes providers, purges credentials, soft-deletes profile
- [ ] Metrics baseline: activation, first-tip reaction rate, dwell, retention (see `docs/architecture/metrics.md`)
- [ ] Deploy modular monolith + `ml/serving` stub to a single VM via docker-compose + Caddy
### Phase 1 — Real signal *(M1)*
Goal: the tip is picked, not drawn from a hat. Still Todoist-only.
- [ ] Event bus (NATS) + ingestion from Todoist sync API
- [ ] Feature store skeleton (Feast or homegrown) and the first five features (time-of-day, overdue count, task age, priority, project)
- [ ] `ml/serving` FastAPI scoring endpoint; `recommender` calls it
- [ ] `ContextualBanditPolicy` v1 (LinUCB) replacing `RandomPolicy`
- [ ] Tip feedback loop: user reactions (done / snooze / dismiss) become rewards
### Phase 1 — Real signal + in-the-moment delivery *(M1)*
Goal: tips are picked, not drawn from a hat — and they arrive at the right moment on the web.
- [ ] Event bus (NATS JetStream) with protobuf schemas (ADR-0005) + schema-registry CI gate
- [ ] Todoist event-driven sync (emit `signals.task.*`)
- [ ] Feature store skeleton + first five features (hour-of-day, overdue count, task age, priority, project)
- [ ] `ml/serving` FastAPI scorer; `RemotePolicy` wrapper in recommender
- [ ] **Global-then-personalize bandit**: pooled LinUCB over shared features, per-user residual when data allows
- [ ] Shadow-deploy infra: every new policy logs what it *would* have picked; promotion requires reward-parity
- [ ] Feedback loop: reactions → rewards; delayed rewards for tasks completed in Todoist directly
- [ ] **Web Push notifications** (VAPID) so the "magic" shows up without opening the app
- [ ] `notifier` (lite): web-push delivery, quiet-hours honoured, dedupe
- [ ] Apple OAuth added (deferred from M0)
### Phase 2 — Multi-source user profile *(M2)*
Goal: oO knows more than tasks.
- [ ] Integrations: Google Calendar, Apple Health (web import), generic webhook
### Phase 2 — Multi-source profile & trust *(M2)*
Goal: oO knows more than tasks, and users can see/control what we know.
- [ ] Integrations: Google Calendar, Apple Health (web import), generic webhook ingress
- [ ] Unified `Profile` model (identity, preferences, contexts, consents)
- [ ] Timing signals (location, idle, focus windows) via client-side probes
- [ ] Advice library (curated tips, not only todos) + mixing policy
- [ ] Timing signals (Page Visibility, Idle Detection, coarse location) — opt-in, transparent
- [ ] Advice library + mixing policy (todo vs advice vs ambient)
- [ ] User-facing data dashboard: what's stored, what's computed, export, delete-by-category
- [ ] Cost/usage observability
### Phase 3 — Mobile & notifications *(M3)*
### Phase 3 — Native mobile *(M3)*
- [ ] iOS app (SwiftUI) with APNs push
- [ ] Android app (Compose) with FCM push
- [ ] `notifier` service with quiet-hours + per-channel rate limits
- [ ] Rich notifications that deep-link to the tip page
- [ ] `notifier` gains APNs + FCM channels, per-device rate limits
- [ ] Migrate auth from Auth.js to dedicated OIDC provider (trigger from ADR-0004)
- [ ] Decide-and-deliver scheduler: per-user "is this tip worth interrupting now?" threshold
### Phase 4 — MLOps at scale *(M4)*
- [ ] Airflow/Prefect orchestrator for batch retrains
- [ ] MLflow model registry + shadow deploys
- [ ] Online `experiments` framework: A/B + multi-armed bandits as first-class
- [ ] Cohort analysis + cross-user collaborative features (opt-in)
- [ ] Model cards, fairness checks, drift monitoring
- [ ] Prefect/Airflow for batch feature materialization + retraining
- [ ] MLflow registry; shadow → A/B → launch pipeline as first-class
- [ ] Online experiments framework: deterministic assignment + bandit policies alongside fixed-split A/B
- [ ] Cross-user collaborative features (opt-in only); cohort slicing; fairness checks
- [ ] Drift monitoring (feature drift, prediction drift, reward drift); model cards per version
### Phase 5 — Production hardening *(M5)*
- [ ] SOC2-style controls, audit logging, token rotation
- [ ] k8s deploy + horizontal autoscaling
- [ ] Multi-region failover, PITR backups
- [ ] Public integration SDK so third parties can add sources
- [ ] Audit logging, rotation of provider tokens + internal signing keys
- [ ] **k3s** on existing VM, then k8s + HPA once multi-node justified (no cliff)
- [ ] Multi-region failover, Postgres PITR, event-bus mirroring
- [ ] Public integration SDK; sandbox tenancy for third-party connectors
- [ ] Billing + subscription tiers
---
@@ -123,4 +134,5 @@ Conventions and per-service guidance live in [`CLAUDE.md`](CLAUDE.md).
## License
TBD.
All rights reserved — 2026. Contact the owner for licensing inquiries.
(We'll switch to an OSS license for non-sensitive packages once the public SDK lands in Phase 5.)


@@ -0,0 +1,31 @@
# ADR-0003: Modular monolith for Phase 0, extract when justified
## Status
Accepted — 2026-04-13
## Context
The initial architecture called for seven independently-deployable services on day one (gateway, auth, profile, integrations, recommender, events, notifier). For a team working in ~3 parallel streams with zero users, this is premature. Each service adds CI, deploy, DB, observability, and release-coordination overhead. It also slows the walking skeleton, which is the most important thing to ship.
Modularity — the thing we actually need — is a **code-boundary** property, not a **process-boundary** property. Well-bounded packages extract to services cheaply; poorly-bounded services rarely merge back.
## Decision
- **Phase 0:** one Node process bundles `services/*` as internal packages behind their HTTP contracts. `ml/serving` is a separate Python process (language boundary). Postgres + NATS complete the stack.
- **Directory layout** under `services/` is unchanged. Each module is a self-contained package with its own README, schema migrations, and public interface.
- **Communication** between modules goes through the same HTTP or event contracts it will use post-extraction. In Phase 0 these are resolved in-process via a thin dispatcher; swapping to HTTP/NATS is a transport change, not an API change.
- **Extraction criteria** (trigger a service split when any apply):
1. Language boundary (already true for `ml/serving`).
2. Scaling hotspot: the module's load curve diverges materially from the rest.
3. SLA divergence: the module needs stricter availability or latency than the monolith.
4. Team ownership: a dedicated team takes the module and wants independent releases.
5. Regulatory isolation: credentials/PII need tighter blast-radius control.
- **`events/` is special:** even inside the monolith we use an event-emitter abstraction whose production implementation is NATS JetStream. The async boundary matters for ML correctness; the process boundary doesn't.
## Consequences
- Faster Phase 0: one CI pipeline, one deploy, one observability config.
- Cheap extraction: contracts are already HTTP/event-shaped.
- Discipline required: no cross-module DB access, no reaching into another module's internals, even though it's physically possible. Enforced by lint/import rules.
- Deploy story: docker-compose with two application containers (Node monolith + Python serving) until extraction begins. Compose profiles let devs bring up subsets.
## Non-consequences
- We are **not** monolith-forever. We fully expect `integrations/` and `recommender/` to extract once Phase 2+ traffic patterns justify it.
- Frontend / mobile unaffected.


@@ -0,0 +1,23 @@
# ADR-0004: Auth.js for Phase 0, dedicated OIDC provider when mobile ships
## Status
Accepted — 2026-04-13
## Context
We need Google (and later Apple) sign-in, session management, and JWTs other services can verify. Options considered:
- **Auth.js (NextAuth):** a library embedded in the Next.js web app. Fastest to ship. Tight coupling to the web runtime; awkward when a native mobile client also needs tokens.
- **Ory Kratos + Hydra:** a standalone, self-hosted identity + OIDC provider. Much more powerful. Operationally heavy for a prototype.
- **Roll our own:** not considered.
Mobile apps are Phase 3+. Phase 0 needs the cheapest credible option that does not box us in.
## Decision
- **Phase 0:** use **Auth.js** inside the web app. Google provider only (Apple deferred — paid dev account + extra domain setup).
- **Boundary:** from day one, the `auth` module exposes an **OIDC-shaped** HTTP surface (`/me`, `/logout`, JWT verification via public JWKS, `/.well-known/openid-configuration` stub). Other services verify JWTs against that surface, not against Auth.js internals. This means the day we replace the engine, only one module changes.
- **JWT strategy:** short-lived (10 min) access JWT, rotating refresh token in an HttpOnly cookie. JWT contains `sub`, `email`, `scope`, `sid`.
- **Trigger to migrate to Ory (or equivalent):** any of — (a) native mobile shipping, (b) a second client type that can't piggyback on Next.js sessions, (c) multi-tenant requirement.
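The JWT strategy can be illustrated with the claim set and the checks a verifying module would apply after signature verification. A sketch under assumptions: `sub`, `email`, `scope`, and `sid` come from the decision above; `iat` and `exp` are the standard registered JWT claims (seconds since epoch); signing and JWKS lookup are deliberately left out; the function names and the revoked-session check are hypothetical.

```typescript
interface AccessClaims {
  sub: string;   // user id
  email: string;
  scope: string;
  sid: string;   // session id, matches Session.sid
  iat: number;   // issued-at, seconds since epoch
  exp: number;   // expiry, seconds since epoch
}

// 10-minute access-token lifetime, per the decision above.
const ACCESS_TTL_SECONDS = 10 * 60;

// Builds the claim set only; signing is out of scope for this sketch.
function issueClaims(
  sub: string,
  email: string,
  scope: string,
  sid: string,
  nowSeconds: number,
): AccessClaims {
  return { sub, email, scope, sid, iat: nowSeconds, exp: nowSeconds + ACCESS_TTL_SECONDS };
}

// Checks applied after the signature has been verified against JWKS:
// the token must be unexpired and its session must not be revoked.
function isAcceptable(
  claims: AccessClaims,
  nowSeconds: number,
  revokedSids: Set<string>,
): boolean {
  return nowSeconds < claims.exp && !revokedSids.has(claims.sid);
}
```

With a 10-minute lifetime, a revoked session is rejected within 10 minutes even by a verifier that never consults the revocation list; checking `sid` closes that window where it matters.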
## Consequences
- Ships in days, not weeks.
- The OIDC-shaped boundary means the migration is scoped, not scary.
- Slight duplication early: we maintain OIDC-surface code that Auth.js mostly handles internally. Worth it.


@@ -0,0 +1,28 @@
# ADR-0005: Protocol Buffers for event schemas, OpenAPI for HTTP
## Status
Accepted — 2026-04-13
## Context
Two contract surfaces exist:
1. **HTTP** — synchronous, client ↔ server, human-readable debugging matters. OpenAPI is the default and generates decent TS clients.
2. **Events** — durable, fan-out to ML consumers, schema evolution critical. Feature pipelines trained on old schemas will silently misbehave when producers change a field.
Using OpenAPI for both means:
- Python pydantic generation is awkward and hand-maintained in practice.
- No wire-format discipline (JSON is loose).
- No central schema registry, so schema drift is undetected until a model regresses.
## Decision
- **HTTP** contracts: OpenAPI 3.1 in `packages/shared-types/http/`. Generate TS clients; hand-write Python pydantic models for ML consumers (few, and they're shallow).
- **Event** contracts: Protocol Buffers in `packages/shared-types/events/`. Generate TS and Python. All events carry an envelope: `{event_id, occurred_at, schema_version, producer, payload}`.
- **Schema registry:** lightweight self-hosted (buf.build Schema Registry OSS or a tiny registry in `events/`). CI check blocks breaking changes without a version bump.
- **Evolution rules:** additive only within a major version; `reserved` for removed fields; new `schema_version` for breaking changes; consumers advertise the versions they accept.
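To make the evolution rules concrete, here is a toy version of the kind of check the CI gate would run. It is a stand-in for a real breaking-change detector, not a description of any tool's behavior: the `MessageSchema` shape and `breakingChanges` function are hypothetical, and the sketch assumes that removing a field is acceptable only when its tag is added to `reserved`.

```typescript
interface SchemaField { tag: number; name: string; type: string }
interface MessageSchema { fields: SchemaField[]; reserved: number[] }

// Returns a list of human-readable problems; an empty list means the change
// is additive and can ship without a new schema_version.
function breakingChanges(prev: MessageSchema, next: MessageSchema): string[] {
  const problems: string[] = [];
  const nextByTag = new Map<number, SchemaField>();
  for (const field of next.fields) nextByTag.set(field.tag, field);

  for (const old of prev.fields) {
    const current = nextByTag.get(old.tag);
    if (current === undefined) {
      // Removed field: its tag must be reserved so it is never reused.
      if (next.reserved.indexOf(old.tag) === -1) {
        problems.push(`tag ${old.tag} (${old.name}) removed without being reserved`);
      }
    } else if (current.type !== old.type) {
      problems.push(`tag ${old.tag} changed type ${old.type} -> ${current.type}`);
    }
  }

  // A previously reserved tag must never come back as a live field.
  for (const tag of prev.reserved) {
    if (nextByTag.has(tag)) problems.push(`reserved tag ${tag} reused`);
  }
  return problems;
}
```

In CI, a non-empty result would fail the build unless the same change also bumps `schema_version`.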
## Consequences
- One extra build step in `shared-types` (buf or protoc).
- Breaking event changes cost something — good; they should.
- ML pipelines can replay old events against new code with confidence.
## Non-consequences
- No gRPC. HTTP stays HTTP/JSON. Protobuf is only the wire format on the event bus.


@@ -0,0 +1,87 @@
# Data model
Durable entities across modules. Per-module databases/schemas own these; cross-module access is only via the module's API.
## Core entities
```
User                 auth + profile
  id                 (uuid)
  created_at
  email              (from IdP)
  preferred_name?
  deleted_at?        soft-delete for 30-day recovery; hard-delete after

IdentityLink         auth
  user_id
  provider           "google" | "apple"
  provider_sub       subject from IdP
  created_at

Session              auth
  user_id
  sid                (uuid) in JWT
  issued_at
  expires_at
  revoked_at?

Profile              profile
  user_id            (pk)
  timezone
  quiet_hours        jsonb: [{start,end,days}]
  contexts           jsonb: [{name,predicate}] introduced in Phase 2
  consents           jsonb: {integration: {read,write,retain_days}}

Credential           integrations
  user_id
  provider           "todoist" | "google_calendar" | ...
  ciphertext         sealed-box over {access, refresh, scopes, expires_at}
  meta               provider-specific (sync_token cursor for Todoist)
  created_at
  last_refreshed_at
  revoked_at?

Event                events
  event_id           (ulid)
  user_id
  schema_version
  kind               e.g. "signals.task.updated"
  occurred_at
  ingested_at
  payload            protobuf bytes

TipInstance          recommender
  tip_id             (ulid)
  user_id
  policy_name        "random" | "bandit.linucb" | "remote:v3"
  policy_version
  candidate_source   "todoist" | "advice.library" | ...
  context_snapshot   jsonb: features seen at decision time
  tip                jsonb: {kind,title,body,source,deep_link,meta}
  created_at
  shown_at?          set when the client reports render
  reaction?          "done" | "snooze" | "dismiss" | null
  reacted_at?
  delivery_id?       fk if surfaced via notifier push

Delivery             notifier
  delivery_id
  user_id
  tip_id
  channel            "webpush" | "apns" | "fcm" | "email"
  dispatched_at
  delivered_at?
  failure_reason?
```
## Foreign-key discipline
There are no cross-module FKs. Each module owns its tables. References by id are soft; consistency is maintained by events (user-deleted → every module cascades its own cleanup).
## Deletion
`User.deleted_at` set → a `user.deletion_requested` event goes out → each module soft-deletes its rows → after 30 days a scheduled job hard-deletes. Credentials are **revoked at the provider** (not just erased locally) on soft-delete. See `privacy.md`.
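The 30-day window between soft-delete and hard-delete can be expressed as a pure predicate that the scheduled job applies per row. A minimal sketch, assuming hypothetical names (`hardDeleteDue`, `selectForHardDelete`) and a row shape reduced to the two fields that matter here.

```typescript
const RECOVERY_WINDOW_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True once a soft-deleted row has been in that state for the full window.
function hardDeleteDue(deletedAt: Date | null, now: Date): boolean {
  if (deletedAt === null) return false; // never soft-deleted
  return now.getTime() - deletedAt.getTime() >= RECOVERY_WINDOW_DAYS * MS_PER_DAY;
}

interface SoftDeletable { id: string; deletedAt: Date | null }

// What each module's scheduled job would run over its own tables.
function selectForHardDelete(rows: SoftDeletable[], now: Date): string[] {
  return rows.filter((row) => hardDeleteDue(row.deletedAt, now)).map((row) => row.id);
}
```

Keeping the predicate pure and passing `now` explicitly makes the window testable without waiting 30 days or mocking the system clock.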
## Replay and reproducibility
`TipInstance.context_snapshot` captures the exact features that produced the decision. This is what lets offline replay re-score historical tips against a new policy without touching the feature store.
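A rough sketch of what such a replay could look like, assuming the new policy can be expressed as a scoring function over the stored snapshot. Only `context_snapshot` and the reaction values come from the data model; `LoggedTip`, `Scorer`, `replay`, and the `urgency` feature are invented for illustration.

```typescript
type FeatureSnapshot = Record<string, number>;

interface LoggedTip {
  tipId: string;
  contextSnapshot: FeatureSnapshot; // features seen at decision time
  reaction: "done" | "snooze" | "dismiss" | null;
}

// Hypothetical: a candidate policy reduced to a score over the snapshot.
type Scorer = (features: FeatureSnapshot) => number;

// Compare the mean score the new policy gives to tips that ended in "done"
// against the mean score for everything else. No feature-store call is needed.
function replay(log: LoggedTip[], score: Scorer): { doneMean: number; otherMean: number } {
  let doneSum = 0;
  let doneCount = 0;
  let otherSum = 0;
  let otherCount = 0;
  for (const row of log) {
    const s = score(row.contextSnapshot);
    if (row.reaction === "done") {
      doneSum += s;
      doneCount += 1;
    } else {
      otherSum += s;
      otherCount += 1;
    }
  }
  return {
    doneMean: doneCount === 0 ? 0 : doneSum / doneCount,
    otherMean: otherCount === 0 ? 0 : otherSum / otherCount,
  };
}

const loggedTips: LoggedTip[] = [
  { tipId: "t1", contextSnapshot: { urgency: 1.0 }, reaction: "done" },
  { tipId: "t2", contextSnapshot: { urgency: 0.25 }, reaction: "dismiss" },
  { tipId: "t3", contextSnapshot: { urgency: 0.5 }, reaction: "done" },
  { tipId: "t4", contextSnapshot: { urgency: 0.25 }, reaction: "snooze" },
];

const urgencyScorer: Scorer = (f) => (f["urgency"] === undefined ? 0 : f["urgency"]);
const replayResult = replay(loggedTips, urgencyScorer);
```

A policy whose scores are higher on tips that were marked done than on the rest is at least pointing in the right direction; a real evaluation would use a proper off-policy estimator.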


@@ -0,0 +1,43 @@
# Metrics: measuring "magic"
We cannot build a product whose core promise is "feels like magic" without proxies for it. These are the metrics every change is measured against.
## North star
**Week-2 tip-reaction rate** — of users who saw a tip in week 1, what fraction reacted to *any* tip in week 2? Captures "did this become part of your life."
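One way the north-star metric could be computed from per-tip records is sketched below. The record shape (`TipRecord`, a 1-based `week` index) is an assumption; the definition itself (week-1 viewers who reacted to any tip in week 2) follows the sentence above.

```typescript
interface TipRecord {
  userId: string;
  week: number; // 1-based week index; how weeks are anchored is an assumption
  shown: boolean;
  reacted: boolean;
}

function week2ReactionRate(records: TipRecord[]): number {
  const sawInWeek1: Record<string, boolean> = {};
  const reactedInWeek2: Record<string, boolean> = {};
  for (const r of records) {
    if (r.week === 1 && r.shown) sawInWeek1[r.userId] = true;
    if (r.week === 2 && r.reacted) reactedInWeek2[r.userId] = true;
  }
  // Denominator: users who saw a tip in week 1.
  const cohort = Object.keys(sawInWeek1);
  if (cohort.length === 0) return 0;
  // Numerator: members of that cohort who reacted to any tip in week 2.
  const returning = cohort.filter((u) => reactedInWeek2[u] === true).length;
  return returning / cohort.length;
}

const tipRecords: TipRecord[] = [
  { userId: "a", week: 1, shown: true, reacted: true },
  { userId: "b", week: 1, shown: true, reacted: false },
  { userId: "c", week: 1, shown: true, reacted: false },
  { userId: "d", week: 1, shown: true, reacted: true },
  { userId: "a", week: 2, shown: true, reacted: true },
  { userId: "b", week: 2, shown: true, reacted: true },
  { userId: "c", week: 2, shown: true, reacted: false },
  { userId: "e", week: 2, shown: true, reacted: true }, // not in the week-1 cohort
];

const northStar = week2ReactionRate(tipRecords);
```

Here four users saw a tip in week 1 and two of them reacted in week 2, so the rate is 0.5; user `e` is ignored because they were not in the week-1 cohort.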
## Activation (single-session)
- **Time-to-first-tip** — sign-in → tip rendered. Target: ≤ 60 s on the happy path.
- **First-tip reaction rate** — fraction of users who interact (done/snooze/dismiss/save) with their very first tip. Target: > 50%.
## Engagement
- **Dwell-before-action** — seconds between tip render and first reaction. Too short = glance-away; too long = confused.
- **Done rate = Done / (Done + Snooze + Dismiss)** — the quality proxy. Rising = tips feel on-target.
- **Snooze:Dismiss ratio** — high snooze = "good tip, wrong moment" (timing problem). High dismiss = "wrong tip entirely" (relevance problem). These point at different fixes.
- **Return cadence** — median inter-session gap. Stable-and-short > spiky.
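The two reaction-based proxies above can be computed from a flat list of reactions. This is a sketch with hypothetical names (`engagementStats`); returning `null` for an empty denominator is a choice made here, not something the doc specifies.

```typescript
type Reaction = "done" | "snooze" | "dismiss";

interface EngagementStats {
  doneRate: number | null; // Done / (Done + Snooze + Dismiss)
  snoozeToDismiss: number | null; // Snooze : Dismiss
}

function engagementStats(reactions: Reaction[]): EngagementStats {
  let done = 0;
  let snooze = 0;
  let dismiss = 0;
  for (const r of reactions) {
    if (r === "done") done += 1;
    else if (r === "snooze") snooze += 1;
    else dismiss += 1;
  }
  const total = done + snooze + dismiss;
  return {
    doneRate: total === 0 ? null : done / total,
    snoozeToDismiss: dismiss === 0 ? null : snooze / dismiss,
  };
}

const sampleReactions: Reaction[] = [
  "done", "done", "done", "done", "done",
  "snooze", "snooze", "snooze",
  "dismiss", "dismiss",
];
const stats = engagementStats(sampleReactions);
```

With 5 done, 3 snooze, and 2 dismiss, the done rate is 0.5 and the snooze:dismiss ratio is 1.5, which by the reading above would hint at a timing problem more than a relevance problem.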
## Retention
- D1, D7, D28 retention. Cohort-sliced by connected integrations.
- Churn signal: 7 days without a session.
## ML health (from M1)
- Policy latency p50/p95/p99 at the recommender boundary.
- Feature null-rate per feature, per user.
- Online/offline reward disagreement for shadowed policies.
- Bandit regret proxy: observed reward vs an oracle's best-possible on the same candidates.
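For the latency item, a small sketch of p50/p95/p99 using the nearest-rank method. The choice of nearest-rank is an assumption (a metrics backend would normally compute this from histograms), and `percentile` is a hypothetical helper.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// 100 synthetic latencies: 100 ms down to 1 ms, deliberately unsorted.
const latenciesMs: number[] = [];
for (let i = 100; i >= 1; i--) latenciesMs.push(i);

const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
```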
## Privacy & trust
- Account-deletion completion time (target: < 24 h).
- Provider-revocation success rate on disconnect.
- Number of active credentials per user (low = healthy).
## How metrics become decisions
- **Per-change.** Any policy or UX change declares which metric it expects to move and by how much. Missing the target triggers a review, not an automatic rollback (humans judge).
- **Shadow → A/B → launch.** Policy changes ship in shadow first (log what the policy *would* have recommended); then A/B on live traffic; then launch once the online reward estimate is ≥ the incumbent's by a CI margin.
- **Dashboards before features.** If we cannot measure a feature's impact on the north-star metric, we defer the feature.
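The "≥ incumbent by a CI margin" gate could be read as: the lower bound of a confidence interval on the reward difference must not fall below zero. The sketch below uses a normal approximation with z = 1.96; that statistical choice, and every function name here, is an assumption rather than something the doc prescribes.

```typescript
function mean(xs: number[]): number {
  let sum = 0;
  for (const x of xs) sum += x;
  return sum / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  let ss = 0;
  for (const x of xs) ss += (x - m) * (x - m);
  return ss / (xs.length - 1);
}

// Lower bound of an approximate 95% CI on (candidate mean - incumbent mean).
function liftLowerBound(candidate: number[], incumbent: number[], z: number = 1.96): number {
  const diff = mean(candidate) - mean(incumbent);
  const se = Math.sqrt(
    sampleVariance(candidate) / candidate.length + sampleVariance(incumbent) / incumbent.length
  );
  return diff - z * se;
}

function readyToLaunch(candidate: number[], incumbent: number[]): boolean {
  return liftLowerBound(candidate, incumbent) >= 0;
}

// Helper for the example: `ones` rewards of 1 followed by zeros, `total` samples.
function binaryRewards(ones: number, total: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < total; i++) out.push(i < ones ? 1 : 0);
  return out;
}

const clearWin = readyToLaunch(binaryRewards(80, 100), binaryRewards(50, 100));
const tooClose = readyToLaunch(binaryRewards(52, 100), binaryRewards(50, 100));
```

An 80% vs 50% reward rate on 100 samples each clears the gate; 52% vs 50% does not, because the interval on the difference still includes negative values.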


@@ -3,22 +3,25 @@
## Guiding constraints
- The **recommendation decision** is the hot path. Every architectural choice should shorten the distance between a new signal and a better tip.
- Services are small and independently deployable, but we do **not** multiply services for its own sake. Split by team-of-ownership and by data lifecycle.
- Python for ML, TypeScript for applications, shared contracts regenerated from a single source of truth.
- Modularity lives in **code boundaries**. Deploy topology follows pressure, not anticipation (ADR-0003).
- Python for ML, TypeScript for applications. Shared contracts regenerated from a single source of truth: OpenAPI for HTTP, protobuf for events (ADR-0005).
- Privacy is a Phase-0 feature, not a Phase-5 compliance project (see `privacy.md`).
## Services
## Modules
| Service | Language | Responsibility | Owns data |
|---|---|---|---|
| `gateway` | TS (Node) | BFF for web/mobile; auth-checking; request fan-out | — |
| `auth` | TS | OAuth (Google, Apple), sessions, token issuance | identities, sessions |
| `profile` | TS | user profile, preferences, consents | profiles |
| `integrations` | TS | third-party connectors, token vault, signal fetch | credentials, cursors |
| `events` | TS | event-bus ingress, normalization, durable log | signal store |
| `recommender` | TS | orchestration: candidates → policy → tip; feedback sink | tip history |
| `ml/serving` | Python | online scoring for policies/models | — (stateless) |
| `ml/pipelines` | Python | batch feature + training pipelines | feature store, models |
| `notifier` | TS | push/email delivery, quiet hours, dedupe | delivery log |
| Module | Language | Responsibility | Owns data | Phase-0 process |
|---|---|---|---|---|
| `gateway` | TS | BFF for web/mobile; auth-check; fan-out | — | Node monolith |
| `auth` | TS | OAuth (Google; Apple in M1), sessions, JWT | identities, sessions | Node monolith |
| `profile` | TS | user profile, preferences, consents | profiles | Node monolith |
| `integrations` | TS | third-party connectors, token vault, signal fetch | credentials, cursors | Node monolith |
| `events` | TS | event-bus abstraction + durable log (M1) | signal store | Node monolith (in-proc emitter) |
| `recommender` | TS | orchestration: candidates → policy → tip; feedback sink | tip history | Node monolith |
| `notifier` | TS | push/email delivery, quiet hours, dedupe | delivery log | Node monolith (web push in M1) |
| `ml/serving` | Python | online scoring for policies/models | — (stateless) | **separate process** |
| `ml/pipelines` | Python | batch feature + training pipelines | feature store, models | separate (from M4) |
Extraction from the monolith is triggered by language boundary, scaling hotspot, SLA divergence, team ownership, or regulatory isolation (ADR-0003). `ml/serving` is pre-extracted on language grounds.
## Data boundaries
@@ -36,9 +39,28 @@ User reactions (done / snooze / dismiss) are events too. They close the loop as
## Why these choices
- **NATS JetStream** over Kafka for Phase 1: lighter, single-binary, fits the "one VM" deployment. Swap to Kafka in Phase 4.
- **Postgres** everywhere for OLTP. Per-service schemas, not per-service instances in dev.
- **Modular monolith + Python ML** in Phase 0 to ship the walking skeleton fast without foreclosing decomposition (ADR-0003).
- **NATS JetStream** over Kafka for Phase 1: lighter, single-binary, fits the "one VM" deployment. Swap to Kafka in Phase 4 if fan-out justifies it.
- **Postgres** for OLTP; per-module schemas in dev; separate databases once modules extract.
- **FastAPI + Pydantic** for ML serving — fast, typed, swappable runtime (ONNX, Triton) behind it.
- **Protobuf** for event schemas with a schema registry (ADR-0005) — train/serve parity depends on this.
- **OpenAPI** for HTTP; TS client auto-generated; Python pydantic hand-written while consumers are few.
- **Feast** for feature store when we get there; homegrown adapter until then (Phase 1 seam).
- **MLflow** for model registry; artifacts in MinIO/S3.
- **Auth.js or Ory** for identity — we will not write crypto.
- **Auth.js** embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships.
- **k3s** as the first step beyond docker-compose — no "compose → full k8s" cliff.
## Decision flow for a new tip
```
client ─► gateway ─► recommender
                       ├─► candidates: integrations.fetchCandidates(user) + advice.library
                       ├─► context:    FeatureAssembler(user, request)
                       ├─► policy:     PolicyRegistry.get(policyName).pick(candidates, context)
                       ├─► shadows:    run shadow policies in parallel, log their picks
                       └─► persist:    TipInstance{context_snapshot, policy, tip}
       ◄─ tip
```
Feedback travels back the same path: `POST /feedback → events.emit(feedback.reaction)` → pipelines consume → bandit/model updated on next retrain.
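The decision flow above can be sketched as a single function. This is illustrative only: the types (`Candidate`, `NamedPolicy`, `DecisionSink`) are hypothetical, shadows run sequentially here for brevity where the diagram says in parallel, and persistence is an in-memory array.

```typescript
type Features = Record<string, number>;
interface Candidate { title: string; source: string }
type PickFn = (candidates: Candidate[], context: Features) => Candidate;
interface NamedPolicy { name: string; pick: PickFn }

interface TipInstanceRow { policyName: string; contextSnapshot: Features; tip: Candidate }
interface ShadowLogRow { policyName: string; pick: Candidate }
interface DecisionSink { tips: TipInstanceRow[]; shadowLogs: ShadowLogRow[] }

function recommend(
  candidates: Candidate[],
  context: Features,
  live: NamedPolicy,
  shadows: NamedPolicy[],
  sink: DecisionSink
): Candidate {
  // policy: the live policy decides what the user sees.
  const tip = live.pick(candidates, context);
  // shadows: log what each shadow policy would have picked; never shown to the user.
  for (const shadow of shadows) {
    sink.shadowLogs.push({ policyName: shadow.name, pick: shadow.pick(candidates, context) });
  }
  // persist: copy the context so later mutation cannot alter the snapshot.
  sink.tips.push({ policyName: live.name, contextSnapshot: { ...context }, tip });
  return tip;
}

const firstPolicy: NamedPolicy = { name: "first", pick: (c) => c[0] };
const lastPolicy: NamedPolicy = { name: "last", pick: (c) => c[c.length - 1] };

const decisionSink: DecisionSink = { tips: [], shadowLogs: [] };
const requestContext: Features = { hour_of_day: 9 };
const servedTip = recommend(
  [
    { title: "Reply to Sam", source: "todoist" },
    { title: "Stretch for two minutes", source: "advice.library" },
  ],
  requestContext,
  firstPolicy,
  [lastPolicy],
  decisionSink
);
requestContext["hour_of_day"] = 23; // must not leak into the stored snapshot
```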


@@ -0,0 +1,40 @@
# Privacy architecture
Privacy is a Phase 0 feature, not a Phase 5 compliance project. This doc is the minimum.
## Principles
1. **Data minimization.** Store only what we need for the tip. Raw task titles stay at Todoist; we store references + computed features. If a feature doesn't lift a metric, its input data doesn't get stored.
2. **User-visible controls.** Every connection shows exactly which scopes we hold and what we've computed. One tap disconnects and revokes.
3. **Deletion is real.** Deleting an account revokes provider tokens, purges credentials immediately, and soft-deletes user data for a 30-day recovery window, then hard-deletes.
4. **No surprise sharing.** Cross-user / collaborative features are opt-in, per category, per integration.
5. **Encryption in transit and at rest.** TLS everywhere; column-level encryption for credentials; disk-level for backups.
## Flows
### Connect
User taps "Connect Todoist" → consent screen lists: scopes requested, what we store, what we compute, retention, revocation instructions → OAuth → stored credential is immediately testable and shows in `/connect`.
### Disconnect
User taps disconnect → `Credential.revoked_at` set → provider-side revocation attempted (Todoist: token revocation endpoint) → credential erased on success → `credential.revoked` event → downstream modules drop associated cursors, caches, derived features for that `(user, provider)` pair.
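A sketch of the disconnect ordering, written synchronously for brevity (the connector's `revoke` is asynchronous). `StoredCred` and `disconnect` are hypothetical names. What happens when provider revocation fails is not specified above; keeping the ciphertext so the revocation can be retried, and not emitting the event, is an assumption made here.

```typescript
interface StoredCred {
  userId: string;
  provider: string;
  ciphertext: string | null; // null once erased
  revokedAt: Date | null;
}

interface CredentialRevoked { kind: "credential.revoked"; userId: string; provider: string }

function disconnect(
  cred: StoredCred,
  revokeAtProvider: (c: StoredCred) => boolean, // true when the provider confirms revocation
  emit: (evt: CredentialRevoked) => void,
  now: Date
): StoredCred {
  // 1. Mark revoked locally first so the credential is never used again.
  const marked: StoredCred = { ...cred, revokedAt: now };
  // 2. Attempt provider-side revocation.
  if (!revokeAtProvider(marked)) {
    return marked; // assumption: keep ciphertext for a retry, emit nothing yet
  }
  // 3. Erase on success, then 4. tell downstream modules to drop derived data.
  emit({ kind: "credential.revoked", userId: cred.userId, provider: cred.provider });
  return { ...marked, ciphertext: null };
}

const emitted: CredentialRevoked[] = [];
const sampleCred: StoredCred = { userId: "u1", provider: "todoist", ciphertext: "sealed", revokedAt: null };
const at = new Date("2026-04-13T00:00:00Z");

const afterSuccess = disconnect(sampleCred, () => true, (e) => { emitted.push(e); }, at);
const afterFailure = disconnect(sampleCred, () => false, (e) => { emitted.push(e); }, at);
```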
### Delete account
User taps "Delete account" in settings → hard confirm → `User.deleted_at` set, all sessions revoked, `user.deletion_requested` event fanned out → every module processes its portion (credentials revoked + purged; profile scrubbed; tip history anonymized to aggregate stats only or purged, per retention policy; events purged on schedule) → within 24 hours account is non-recoverable operationally; within 30 days all rows are hard-deleted.
### Export (Phase 2)
`GET /me/export` returns a JSON bundle of everything we hold for the user: profile, consents, credentials-metadata (not secrets), events, tip history.
## Scope boundaries
Each integration declares the scopes it requests and the features it derives. The `Profile.consents` column is the source of truth; a scope removed from consent short-circuits derived-feature computation at the feature store.
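A minimal sketch of the short-circuit, assuming the `Profile.consents` shape from the data model (`{integration: {read, write, retain_days}}`) and treating a missing or `read: false` entry as "no consent". `FeatureSpec` and `computeAllowedFeatures` are hypothetical; finer-grained per-scope checks are not modeled.

```typescript
interface IntegrationConsent { read: boolean; write: boolean; retain_days: number }
type Consents = Record<string, IntegrationConsent>;

interface FeatureSpec {
  name: string;
  integration: string; // which integration's data this feature is derived from
  compute: () => number;
}

function computeAllowedFeatures(consents: Consents, specs: FeatureSpec[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const spec of specs) {
    const consent = consents[spec.integration];
    // Short-circuit: without read consent the compute function is never called.
    if (consent === undefined || !consent.read) continue;
    out[spec.name] = spec.compute();
  }
  return out;
}

let calendarCalls = 0;
const userConsents: Consents = {
  todoist: { read: true, write: false, retain_days: 30 },
  google_calendar: { read: false, write: false, retain_days: 0 },
};
const featureSpecs: FeatureSpec[] = [
  { name: "open_task_count", integration: "todoist", compute: () => 7 },
  {
    name: "meetings_today",
    integration: "google_calendar",
    compute: () => {
      calendarCalls += 1;
      return 3;
    },
  },
];
const allowed = computeAllowedFeatures(userConsents, featureSpecs);
```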
## Audit
- Privileged actions (admin-initiated deletions, credential decryption outside the normal refresh path) go to an append-only audit log from Phase 0.
- Per-user access log available via `GET /me/access-log` (Phase 2).
## Legal surface (Phase 0 minimum)
- Terms of Service + Privacy Policy documents shipped alongside the sign-in page.
- Consent capture on first sign-in, with a versioned ToS/PP hash stored per user.
- Data-subject request inbox (email) wired up before onboarding the first external user.


@@ -1,13 +1,15 @@
# services/
Backend microservices. Each directory is independently deployable, ships a `Dockerfile`, a `/health` endpoint, and its own `README.md` describing its contract.
Backend modules. Each owns a contract and ships its own `README.md`. In **Phase 0** these are internal packages inside a single Node process (ADR-0003); they extract to their own processes as pressure justifies.
| Dir | Role | Phase introduced |
|---|---|---|
| `gateway/` | BFF for clients; auth check; fan-out to services | 0 |
| `auth/` | OAuth (Google/Apple), sessions, JWT | 0 |
| `profile/` | user profile, preferences, consents | 0 |
| `integrations/` | third-party connectors + encrypted token vault (Todoist first) | 0 |
| `recommender/` | `POST /recommend` — policy-driven tip selection | 0 |
| `events/` | event bus ingress + durable signal store | 1 |
| `notifier/` | push/email/web delivery with quiet-hours | 3 |
| Dir | Role | Phase-0 shape | Extracts when |
|---|---|---|---|
| `gateway/` | BFF for clients; auth check; fan-out | in-proc router | never (stays as the edge) |
| `auth/` | Google OAuth (Apple in M1), sessions, JWT | Auth.js behind OIDC shape | mobile native ships (M3) |
| `profile/` | user profile, preferences, consents | in-proc module | team ownership diverges |
| `integrations/` | connectors + encrypted token vault | in-proc module | credential blast-radius isolation |
| `recommender/` | `POST /recommend` — policy-driven tip selection | in-proc; calls `ml/serving` from M1 | scaling hotspot |
| `events/` | event bus + signal log | in-proc emitter (Phase 0); NATS (M1) | always a library + broker, not a service |
| `notifier/` | push/email delivery + quiet hours | in-proc; **web push in M1** | SLA divergence or mobile push scale |
Contracts that cross module lines (HTTP or events) come from `packages/shared-types/`. Direct imports from one module into another module's internals are forbidden by import lint.
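The boundary rule can be expressed as a small predicate. This sketch assumes a `services/<module>/...` path layout and is not the actual lint configuration; in practice the check would live in a linter rule, and `moduleOf` / `isForbiddenImport` are hypothetical names.

```typescript
// Returns the owning module for a repo-relative path, or null for anything
// outside services/ (for example packages/shared-types/).
function moduleOf(path: string): string | null {
  const match = /^services\/([^/]+)\//.exec(path);
  return match ? match[1] : null;
}

// An import is forbidden when both files belong to modules and the modules differ.
function isForbiddenImport(fromFile: string, importedFile: string): boolean {
  const fromModule = moduleOf(fromFile);
  const toModule = moduleOf(importedFile);
  if (fromModule === null || toModule === null) return false;
  return fromModule !== toModule;
}

const crossModule = isForbiddenImport(
  "services/recommender/src/pick.ts",
  "services/integrations/src/vault.ts"
);
const sameModule = isForbiddenImport(
  "services/recommender/src/pick.ts",
  "services/recommender/src/registry.ts"
);
const viaSharedTypes = isForbiddenImport(
  "services/recommender/src/pick.ts",
  "packages/shared-types/src/events.ts"
);
```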


@@ -7,11 +7,14 @@ Third-party connectors and the token vault.
```ts
interface Connector {
id: string // e.g. "todoist"
scopes: string[] // human-readable list shown in consent UI
beginOAuth(user): Promise<{ redirectUrl, state }>
finishOAuth(code, state): Promise<StoredCredential>
fetchSignals(user, since?): AsyncIterable<NormalizedEvent>
// optional write-back, e.g. mark task done
act?(user, action): Promise<void>
// incremental-sync cursor (Todoist sync_token, webhook timestamps, etc.)
// stored in Credential.meta; the connector owns its shape.
act?(user, action): Promise<void> // optional write-back (complete task, etc.)
revoke(user): Promise<void> // REQUIRED: provider-side token revocation on disconnect
}
```


@@ -16,12 +16,14 @@ POST /feedback
## Internals (stable seams)
- **Candidate sources** — pluggable async generators. v0: Todoist tasks via `integrations`. Later: advice library, calendar nudges, health prompts.
- **Context assembler** — merges request context with features (inline now, feature-store later).
- **Policy** — `Policy.pick(candidates, context) → tip`. Registered by name:
- **Feature assembler** — fills the `context` blob (inline in Phase 0; calls feature store from M1). Never inlined into policy code.
- **Policy registry** — `Policy.pick(candidates, context) → tip`. Named entries:
  - `random` — v0 (Phase 0).
  - `bandit.linucb` — v1 (Phase 1).
  - `bandit.linucb.pooled` — v1 (Phase 1). **Global-then-personalize**: pooled features shared across users; per-user residual once data allows.
  - `remote` — delegates to `ml/serving` FastAPI scorer (Phase 1+).
- **Shadow hook** — every request optionally runs N shadow policies in parallel and logs their picks + estimated rewards. Promotion from shadow → A/B → launch is a separate, deliberate step (ADR-0002).
- **TipInstance persistence** — every decision writes `context_snapshot` (features seen at decision time). This is what makes offline replay honest.
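A minimal sketch of a name-keyed registry with the Phase-0 `random` entry. The `Policy.pick(candidates, context) → tip` shape comes from the list above; the class layout, the error behavior, and the injectable `rng` (used so tests can be deterministic) are assumptions.

```typescript
interface Tip { title: string }
type PolicyContext = Record<string, number>;
interface Policy { pick(candidates: Tip[], context: PolicyContext): Tip }

class PolicyRegistry {
  private entries: Record<string, Policy> = {};

  private has(name: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.entries, name);
  }

  register(name: string, policy: Policy): void {
    if (this.has(name)) throw new Error("duplicate policy: " + name);
    this.entries[name] = policy;
  }

  get(name: string): Policy {
    if (!this.has(name)) throw new Error("unknown policy: " + name);
    return this.entries[name];
  }
}

// `random`: uniform choice over candidates; the context is ignored in v0.
function randomPolicy(rng: () => number = Math.random): Policy {
  return {
    pick(candidates: Tip[]): Tip {
      if (candidates.length === 0) throw new Error("no candidates");
      return candidates[Math.floor(rng() * candidates.length)];
    },
  };
}

const registry = new PolicyRegistry();
registry.register("random", randomPolicy(() => 0.99)); // fixed rng for the example

const tipPool: Tip[] = [{ title: "A" }, { title: "B" }, { title: "C" }];
const picked = registry.get("random").pick(tipPool, {});

let unknownPolicyFailed = false;
try {
  registry.get("bandit.linucb"); // not registered in Phase 0
} catch (err) {
  unknownPolicyFailed = true;
}
```

With the fixed `rng` returning 0.99, the index is `floor(0.99 * 3) = 2`, so the pick is `C`; looking up an unregistered name fails loudly instead of silently falling back.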
## Phase 0 goal
`RandomPolicy` only. The service, contract, and seams exist; the brain does not yet.
`RandomPolicy` only. The service, contract, registry, shadow hook, and tip-instance persistence all exist; no ML yet.