alvis/oO


alvis 7f173f88d3 refactor: architecture revision — modular monolith, auth-commit, event protobuf, privacy-from-day-0

- ADR-0003: modular monolith for Phase 0 with documented extraction triggers
- ADR-0004: Auth.js + OIDC-shaped boundary; dedicated provider when mobile ships
- ADR-0005: protobuf for events, OpenAPI for HTTP, schema-registry CI gate
- New architecture docs: data-model, metrics (magic proxies), privacy (Phase-0 feature)
- Prime directives updated: privacy-as-feature, modular-by-package-deployable-by-stage
- Roadmap revised: Apple OAuth deferred to M1; web push in M1; k3s intermediate; tip-kind-aware UI
- PLAN updated: Phase-0 deletion endpoint, metrics baseline, compose profiles, import-boundary lint
- License decision in README (ARR with OSS plan in Phase 5)

2026-04-13 14:36:11 +00:00

4.0 KiB


Architecture overview

Guiding constraints

The recommendation decision is the hot path. Every architectural choice should shorten the distance between a new signal and a better tip.
Modularity lives in code boundaries. Deploy topology follows pressure, not anticipation (ADR-0003).
Python for ML, TypeScript for applications. Shared contracts regenerated from a single source of truth: OpenAPI for HTTP, protobuf for events (ADR-0005).
Privacy is a Phase-0 feature, not a Phase-5 compliance project (see privacy.md).
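
One way to make "modularity lives in code boundaries" enforceable is an import-boundary check. The sketch below is an assumption, not the repository's actual lint rule: it supposes each module lives under `src/<module>/` and exposes its public surface through `src/<module>/index.ts`. The names `moduleOf` and `checkImport` and the path layout are hypothetical; only the module names come from this document.

```typescript
// Hypothetical layout: src/<module>/..., public surface at src/<module>/index.ts.
// Module names are taken from the Modules table; everything else is assumed.
const MODULES = [
  "gateway", "auth", "profile", "integrations",
  "events", "recommender", "notifier",
] as const;
type ModuleName = (typeof MODULES)[number];

// Returns the module that owns a source path, or null if the path is not
// under src/<known module>/.
function moduleOf(path: string): ModuleName | null {
  const name = /^src\/([^/]+)\//.exec(path)?.[1];
  if (name === undefined) return null;
  return MODULES.find((m) => m === name) ?? null;
}

// An import is allowed when it stays inside one module, or when it crosses
// a module boundary only through the target module's index file.
function checkImport(fromFile: string, importedFile: string): boolean {
  const from = moduleOf(fromFile);
  const to = moduleOf(importedFile);
  if (from === null || to === null) return true; // outside the rule's scope
  if (from === to) return true;
  return importedFile === `src/${to}/index.ts`;
}
```

Under these assumptions, `checkImport("src/recommender/flow.ts", "src/profile/db/repo.ts")` is rejected, while the same file importing `src/profile/index.ts` passes. A rule of this shape keeps a later extraction mechanical, because every cross-module call already goes through one entry point.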

Modules

Module	Language	Responsibility	Owns data	Phase-0 process
`gateway`	TS	BFF for web/mobile; auth-check; fan-out	—	Node monolith
`auth`	TS	OAuth (Google; Apple in M1), sessions, JWT	identities, sessions	Node monolith
`profile`	TS	user profile, preferences, consents	profiles	Node monolith
`integrations`	TS	third-party connectors, token vault, signal fetch	credentials, cursors	Node monolith
`events`	TS	event-bus abstraction + durable log (M1)	signal store	Node monolith (in-proc emitter)
`recommender`	TS	orchestration: candidates → policy → tip; feedback sink	tip history	Node monolith
`notifier`	TS	push/email delivery, quiet hours, dedupe	delivery log	Node monolith (web push in M1)
`ml/serving`	Python	online scoring for policies/models	— (stateless)	separate process
`ml/pipelines`	Python	batch feature + training pipelines	feature store, models	separate (from M4)

Extraction from the monolith is triggered by language boundary, scaling hotspot, SLA divergence, team ownership, or regulatory isolation (ADR-0003). ml/serving is pre-extracted on language grounds.

Data boundaries

Each module owns its schema; no module reads or writes another module's tables. When recommender needs profile data, it calls profile's read model rather than querying profile's database.
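
A minimal sketch of that boundary, assuming the read model is exposed as a plain interface. `ProfileView`, `ProfileReadModel`, `InMemoryProfiles`, and `canUseSignal` are hypothetical names, and the in-memory map only stands in for profile's private storage.

```typescript
// The view other modules may see: plain data, no tables or ORM entities.
interface ProfileView {
  userId: string;
  preferences: Record<string, string>;
  consents: readonly string[];
}

// The only surface `profile` exposes to the rest of the monolith (assumed shape).
interface ProfileReadModel {
  getProfile(userId: string): ProfileView | null;
}

// Inside `profile`: an in-memory stand-in for the module's own storage.
class InMemoryProfiles implements ProfileReadModel {
  private readonly rows = new Map<string, ProfileView>();

  upsert(view: ProfileView): void {
    this.rows.set(view.userId, view);
  }

  getProfile(userId: string): ProfileView | null {
    return this.rows.get(userId) ?? null;
  }
}

// Inside `recommender`: written against the interface, never against
// profile's schema. Unknown users are treated as having given no consent.
function canUseSignal(profiles: ProfileReadModel, userId: string, scope: string): boolean {
  const profile = profiles.getProfile(userId);
  return profile !== null && profile.consents.indexOf(scope) !== -1;
}
```

Because recommender only sees `ProfileReadModel`, the implementation can later become an HTTP client without touching the caller, at which point the method would likely return a promise.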

Event flow

connector (integrations) ──emit──▶ events ──▶ feature pipelines (ml)
                                     │
                                     └──▶ recommender (context assembly)

User reactions (done / snooze / dismiss) are events too. They close the loop as rewards for bandit/RL policies.
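
To illustrate how reactions can act as rewards, here is a small sketch of an in-process emitter feeding a per-policy running mean. The reward values, the event shape, and the names `InProcEvents` and `RewardTracker` are assumptions for illustration; the document does not specify them.

```typescript
type Reaction = "done" | "snooze" | "dismiss";

// Assumed event shape; only the event name feedback.reaction appears in this document.
interface FeedbackReaction {
  kind: "feedback.reaction";
  userId: string;
  tipId: string;
  policy: string;
  reaction: Reaction;
}

// Assumed reward values, for illustration only.
const REWARD: Record<Reaction, number> = { done: 1, snooze: 0.25, dismiss: 0 };

type Handler = (event: FeedbackReaction) => void;

// Minimal in-process emitter, standing in for the Phase-0 `events` module.
class InProcEvents {
  private readonly handlers: Handler[] = [];

  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }

  emit(event: FeedbackReaction): void {
    this.handlers.forEach((handler) => handler(event));
  }
}

// Incremental mean reward per policy: the statistic a simple bandit updates.
class RewardTracker {
  private readonly stats = new Map<string, { n: number; mean: number }>();

  record(event: FeedbackReaction): void {
    const prev = this.stats.get(event.policy) ?? { n: 0, mean: 0 };
    const n = prev.n + 1;
    const mean = prev.mean + (REWARD[event.reaction] - prev.mean) / n;
    this.stats.set(event.policy, { n, mean });
  }

  meanReward(policy: string): number | undefined {
    return this.stats.get(policy)?.mean;
  }
}
```

Wiring is one line, `events.subscribe((e) => tracker.record(e))`. With the assumed values, one `done` followed by one `dismiss` for the same policy gives a mean reward of 0.5.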

Why these choices

Modular monolith + Python ML in Phase 0 to ship the walking skeleton fast without foreclosing decomposition (ADR-0003).
NATS JetStream over Kafka for Phase 1: lighter, single-binary, fits the "one VM" deployment. Swap to Kafka in Phase 4 if fan-out justifies it.
Postgres for OLTP; per-module schemas in dev; separate databases once modules extract.
FastAPI + Pydantic for ML serving — fast, typed, swappable runtime (ONNX, Triton) behind it.
Protobuf for event schemas with a schema registry (ADR-0005) — train/serve parity depends on this.
OpenAPI for HTTP; the TS client is auto-generated; the Python Pydantic models are hand-written while consumers are few.
Feast for feature store when we get there; homegrown adapter until then (Phase 1 seam).
MLflow for model registry; artifacts in MinIO/S3.
Auth.js embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships.
k3s as the first step beyond docker-compose — no "compose → full k8s" cliff.
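
As an illustration of what an "OIDC-shaped boundary" could look like, the sketch below has callers depend on a verifier interface that returns standard ID-token claims (`iss`, `sub`, `aud`, `exp`). The interface, the class, and the function are hypothetical, and the embedded implementation is a lookup table that does not call Auth.js; a real adapter would wrap the session layer.

```typescript
// Claim names follow the OIDC ID token; the interface itself is an assumption.
interface IdClaims {
  iss: string;
  sub: string;
  aud: string;
  exp: number; // expiry, seconds since the Unix epoch
}

interface IdentityVerifier {
  // Returns claims for a known, unexpired token; null otherwise.
  verify(token: string, nowSeconds: number): IdClaims | null;
}

// Stand-in for the embedded session layer: an opaque-token lookup table.
class EmbeddedVerifier implements IdentityVerifier {
  private readonly sessions: Map<string, IdClaims>;

  constructor(sessions: Map<string, IdClaims>) {
    this.sessions = sessions;
  }

  verify(token: string, nowSeconds: number): IdClaims | null {
    const claims = this.sessions.get(token);
    if (claims === undefined || claims.exp <= nowSeconds) return null;
    return claims;
  }
}

// Gateway-side auth check, written only against the interface.
function authenticate(verifier: IdentityVerifier, token: string, nowSeconds: number): string | null {
  const claims = verifier.verify(token, nowSeconds);
  return claims === null ? null : claims.sub;
}
```

Swapping to a standalone provider would then mean adding a second `IdentityVerifier` implementation that validates real ID tokens, with `authenticate` and its callers unchanged.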

Decision flow for a new tip

client ─► gateway ─► recommender
                       │
                       ├─► candidates:   integrations.fetchCandidates(user)  + advice.library
                       ├─► context:      FeatureAssembler(user, request)
                       ├─► policy:       PolicyRegistry.get(policyName).pick(candidates, context)
                       ├─► shadows:      run shadow policies in parallel, log their picks
                       └─► persist:      TipInstance{context_snapshot, policy, tip}
                       ◄─  tip
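
The flow above can be sketched in a few lines. Only `PolicyRegistry.get(...).pick(...)` and the three `TipInstance` fields are named in this document; the other types, the `decide` function, and the in-memory arrays used for persistence and shadow logging are assumptions. Shadows run sequentially here for brevity, whereas the diagram calls for running them in parallel.

```typescript
// Hypothetical shapes for candidates and context.
interface Candidate { id: string; score: number }
interface Context { userId: string; hourOfDay: number }

interface Policy {
  pick(candidates: Candidate[], context: Context): Candidate;
}

interface TipInstance { context_snapshot: Context; policy: string; tip: Candidate }
interface ShadowPick { policy: string; tipId: string }

class PolicyRegistry {
  private readonly policies = new Map<string, Policy>();

  register(name: string, policy: Policy): void {
    this.policies.set(name, policy);
  }

  get(name: string): Policy {
    const policy = this.policies.get(name);
    if (policy === undefined) throw new Error(`unknown policy: ${name}`);
    return policy;
  }
}

// Serve the primary policy's pick, log shadow picks, persist the decision.
function decide(
  registry: PolicyRegistry,
  policyName: string,
  shadowNames: string[],
  candidates: Candidate[],
  context: Context,
  store: TipInstance[],    // stand-in for tip history
  shadowLog: ShadowPick[], // stand-in for the shadow-pick log
): TipInstance {
  if (candidates.length === 0) throw new Error("no candidates");
  const tip = registry.get(policyName).pick(candidates, context);

  // Shadows see the same inputs; their picks are logged, never served.
  shadowNames.forEach((name) => {
    const pick = registry.get(name).pick(candidates, context);
    shadowLog.push({ policy: name, tipId: pick.id });
  });

  // Copy the context so later mutation cannot change the stored snapshot.
  const instance: TipInstance = { context_snapshot: { ...context }, policy: policyName, tip };
  store.push(instance);
  return instance;
}
```

Persisting the context snapshot alongside the policy name is what lets an offline job later compare the served pick with the logged shadow picks on identical inputs.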

Feedback travels back the same path: POST /feedback → events.emit(feedback.reaction) → pipelines consume → bandit/model updated on next retrain.
