chore: scaffold oO monorepo with architecture, roadmap, and module stubs

2026-04-13 14:19:56 +00:00
commit cf4c7a0eb4
36 changed files with 494 additions and 0 deletions
--- a/docs/adr/0001-monorepo-polyglot.md
+++ b/docs/adr/0001-monorepo-polyglot.md
@@ -0,0 +1,15 @@
+# ADR-0001: Polyglot monorepo, TS for apps, Python for ML
+
+## Status
+Accepted — 2026-04-13
+
+## Context
+We ship web and mobile clients, backend services, and ML training/serving. Splitting into many repos early creates cross-repo PRs for every contract change and hurts velocity.
+
+## Decision
+One monorepo, managed with pnpm workspaces for TS and uv/poetry for Python. Shared contracts live in `packages/shared-types`, generated from OpenAPI. ML is Python; everything else is TS.
+
+## Consequences
+- One CI system, one versioning flow, atomic cross-service PRs.
+- Requires disciplined boundaries: services must still be independently deployable.
+- Tooling complexity: two package managers, two lint stacks. Acceptable given the ML/app split.
--- a/docs/adr/0002-recommender-contract.md
+++ b/docs/adr/0002-recommender-contract.md
@@ -0,0 +1,20 @@
+# ADR-0002: Recommender as the stable contract, policy as a plugin
+
+## Status
+Accepted — 2026-04-13
+
+## Context
+v0 picks a random Todoist task. v1+ will use a contextual bandit, then learned rankers, then collaborative signals. If the HTTP contract and the candidate-generation path are coupled to today's random selection, every policy change becomes a migration.
+
+## Decision
+`recommender` exposes `POST /recommend` as the one stable contract. Internally it has three seams:
+1. **Candidate sources** — async functions that yield `TipCandidate`s from integrations, advice libraries, etc.
+2. **Context assembler** — pulls features (today: inline; later: feature store).
+3. **Policy** — `Policy.pick(candidates, context) → tip`. Registered by name; selected per-request by the experiments framework (Phase 4) or a static config (now).
+
+Swapping a policy never changes the contract or the client.
+
+## Consequences
+- v0 policy is `RandomPolicy`, roughly 50 lines of code.
+- v1 moves scoring to `ml/serving` behind the same `Policy` interface (`RemotePolicy` wrapper).
+- A/B testing can be introduced without touching clients.
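
The policy seam in ADR-0002 could be sketched as below. This is a minimal illustration, not the code in this commit: the ADR names `TipCandidate`, `Policy.pick`, and `RandomPolicy`, but the field names, the `Context` type, the injectable `rng`, and the `registerPolicy`/`resolvePolicy` helpers are all assumptions made for the example.

```typescript
// Hypothetical shapes: the ADR names TipCandidate but does not define its fields.
interface TipCandidate {
  id: string;
  source: string; // e.g. "todoist"
  text: string;
}

// Assumed: context is an open bag of features until a feature store exists.
type Context = Record<string, unknown>;

interface Policy {
  pick(candidates: TipCandidate[], context: Context): TipCandidate | null;
}

// v0 policy: uniform random choice. The rng is injectable so tests can be deterministic.
class RandomPolicy implements Policy {
  private readonly rng: () => number;

  constructor(rng: () => number = Math.random) {
    this.rng = rng;
  }

  pick(candidates: TipCandidate[], _context: Context): TipCandidate | null {
    if (candidates.length === 0) return null;
    const index = Math.floor(this.rng() * candidates.length);
    return candidates[index];
  }
}

// Policies are registered by name and resolved per request (static config for now).
const policies = new Map<string, Policy>();

function registerPolicy(name: string, policy: Policy): void {
  policies.set(name, policy);
}

function resolvePolicy(name: string, fallback: string = "random"): Policy {
  const policy = policies.get(name) ?? policies.get(fallback);
  if (!policy) {
    throw new Error(`no policy registered for "${name}" or "${fallback}"`);
  }
  return policy;
}

registerPolicy("random", new RandomPolicy());
```

A handler for `POST /recommend` would call `resolvePolicy(configuredName).pick(candidates, context)` and never reference a concrete policy class. A `RemotePolicy` that calls `ml/serving` would probably need `pick` to return a `Promise`; the synchronous signature here is kept only for brevity.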
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -0,0 +1,44 @@
+# Architecture overview
+
+## Guiding constraints
+
+- The **recommendation decision** is the hot path. Every architectural choice should shorten the distance between a new signal and a better tip.
+- Services are small and independently deployable, but we do **not** multiply services for their own sake. Split by owning team and by data lifecycle.
+- Python for ML, TypeScript for applications, shared contracts regenerated from a single source of truth.
+
+## Services
+
+| Service | Language | Responsibility | Owns data |
+|---|---|---|---|
+| `gateway` | TS (Node) | BFF for web/mobile; auth-checking; request fan-out | — |
+| `auth` | TS | OAuth (Google, Apple), sessions, token issuance | identities, sessions |
+| `profile` | TS | user profile, preferences, consents | profiles |
+| `integrations` | TS | third-party connectors, token vault, signal fetch | credentials, cursors |
+| `events` | TS | event-bus ingress, normalization, durable log | signal store |
+| `recommender` | TS | orchestration: candidates → policy → tip; feedback sink | tip history |
+| `ml/serving` | Python | online scoring for policies/models | — (stateless) |
+| `ml/pipelines` | Python | batch feature + training pipelines | feature store, models |
+| `notifier` | TS | push/email delivery, quiet hours, dedupe | delivery log |
+
+## Data boundaries
+
+Each service owns its schema; no cross-service DB access. When `recommender` needs profile data, it calls the `profile` service (read model) rather than querying the `profile` database.
+
+## Event flow
+
+```
+connector (integrations) ──emit──▶ events ──▶ feature pipelines (ml)
+                                     │
+                                     └──▶ recommender (context assembly)
+```
+
+User reactions (done / snooze / dismiss) are events too. They close the loop as rewards for bandit/RL policies.
+
+## Why these choices
+
+- **NATS JetStream** over Kafka for Phase 1: lighter, single-binary, fits the "one VM" deployment. Swap to Kafka in Phase 4.
+- **Postgres** everywhere for OLTP. Per-service schemas, not per-service instances in dev.
+- **FastAPI + Pydantic** for ML serving — fast, typed, swappable runtime (ONNX, Triton) behind it.
+- **Feast** for feature store when we get there; homegrown adapter until then (Phase 1 seam).
+- **MLflow** for model registry; artifacts in MinIO/S3.
+- **Auth.js or Ory** for identity — we will not write crypto.
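
To illustrate the feedback loop described under "Event flow" in the overview, the sketch below turns reaction events into a running mean reward per policy. The overview names only the three reactions; the event shape, the numeric reward values, and the `recordReward` helper are hypothetical choices for this example, not part of the commit.

```typescript
// The three reactions named in the overview.
type Reaction = "done" | "snooze" | "dismiss";

// Hypothetical event shape: which tip was shown, by which policy, and how the user reacted.
interface ReactionEvent {
  tipId: string;
  policy: string;
  reaction: Reaction;
}

// Assumed reward values; a real system would tune or learn these.
const REWARD: Record<Reaction, number> = { done: 1, snooze: 0.25, dismiss: 0 };

interface PolicyStats {
  count: number;
  meanReward: number;
}

// Incrementally update the mean reward for the policy that produced the tip.
function recordReward(stats: Map<string, PolicyStats>, event: ReactionEvent): PolicyStats {
  const prev = stats.get(event.policy) ?? { count: 0, meanReward: 0 };
  const count = prev.count + 1;
  const meanReward = prev.meanReward + (REWARD[event.reaction] - prev.meanReward) / count;
  const next: PolicyStats = { count, meanReward };
  stats.set(event.policy, next);
  return next;
}
```

Per-policy means like these are the simplest input a bandit could use to compare arms; contextual policies would also need the features that were present when the tip was picked.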