alvis 7f173f88d3 refactor: architecture revision — modular monolith, auth-commit, event protobuf, privacy-from-day-0

- ADR-0003: modular monolith for Phase 0 with documented extraction triggers
- ADR-0004: Auth.js + OIDC-shaped boundary; dedicated provider when mobile ships
- ADR-0005: protobuf for events, OpenAPI for HTTP, schema-registry CI gate
- New architecture docs: data-model, metrics (magic proxies), privacy (Phase-0 feature)
- Prime directives updated: privacy-as-feature, modular-by-package-deployable-by-stage
- Roadmap revised: Apple OAuth deferred to M1; web push in M1; k3s intermediate; tip-kind-aware UI
- PLAN updated: Phase-0 deletion endpoint, metrics baseline, compose profiles, import-boundary lint
- License decision in README (ARR with OSS plan in Phase 5)

2026-04-13 14:36:11 +00:00

ADR-0003: Modular monolith for Phase 0, extract when justified

Status

Accepted — 2026-04-13

Context

The initial architecture called for seven independently deployable services on day one (gateway, auth, profile, integrations, recommender, events, notifier). For a team of ~3 streams with zero users, this is premature: each service adds CI, deploy, DB, observability, and release-coordination overhead. It also slows the walking skeleton, which is the most important thing to ship.

Modularity — the thing we actually need — is a code-boundary property, not a process-boundary property. Well-bounded packages extract to services cheaply; poorly-bounded services rarely merge back.

Decision

Phase 0: one Node process bundles services/* as internal packages behind their HTTP contracts. ml/serving is a separate Python process (language boundary). Postgres + NATS complete the stack.
Directory layout under services/ is unchanged. Each module is a self-contained package with its own README, schema migrations, and public interface.
Modules communicate through the same HTTP or event contracts they will use post-extraction. In Phase 0 these are resolved in-process via a thin dispatcher; swapping to HTTP/NATS is a transport change, not an API change.
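
One way the thin dispatcher could look, as a minimal sketch and not the actual implementation. All names here (`ModuleRequest`, `ModuleResponse`, `Transport`, `InProcessTransport`, the `profile` route) are hypothetical. The point is that callers depend only on the `Transport` interface, so an HTTP-backed implementation could later replace the in-process one without touching call sites.

```typescript
// Hypothetical names throughout; this is an illustration, not the repo's code.

interface ModuleRequest {
  method: "GET" | "POST" | "DELETE";
  path: string;
  body?: unknown;
}

interface ModuleResponse {
  status: number;
  body?: unknown;
}

type ModuleHandler = (req: ModuleRequest) => Promise<ModuleResponse>;

// Callers depend on this interface only, never on a concrete transport.
interface Transport {
  send(moduleName: string, req: ModuleRequest): Promise<ModuleResponse>;
}

// Phase 0: resolve the call in-process by looking up the target module's handler.
class InProcessTransport implements Transport {
  private readonly handlers = new Map<string, ModuleHandler>();

  register(moduleName: string, handler: ModuleHandler): void {
    this.handlers.set(moduleName, handler);
  }

  async send(moduleName: string, req: ModuleRequest): Promise<ModuleResponse> {
    const handler = this.handlers.get(moduleName);
    if (handler === undefined) {
      return { status: 404, body: { error: `unknown module: ${moduleName}` } };
    }
    // Round-trip through JSON so callers cannot rely on shared object
    // references, which would disappear once the module is extracted.
    const wireReq = JSON.parse(JSON.stringify(req)) as ModuleRequest;
    const res = await handler(wireReq);
    return JSON.parse(JSON.stringify(res)) as ModuleResponse;
  }
}

// Example wiring with an assumed "profile" module and route.
const transport = new InProcessTransport();
transport.register("profile", async (req) => {
  if (req.method === "GET" && req.path === "/profiles/42") {
    return { status: 200, body: { id: "42", displayName: "example" } };
  }
  return { status: 404 };
});
```

The JSON round-trip is a design choice of this sketch: it makes the in-process path behave like a wire call, so code that only works because two modules share memory fails early instead of at extraction time.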
Extraction criteria (trigger a service split when any apply):
1. Language boundary (already true for ml/serving).
2. Scaling hotspot: the module's load curve diverges materially from the rest.
3. SLA divergence: the module needs stricter availability or latency than the monolith.
4. Team ownership: a dedicated team takes the module and wants independent releases.
5. Regulatory isolation: credentials/PII need tighter blast-radius control.
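
The five triggers above can be read as a checklist where any single hit is enough. A minimal sketch of that reading follows; the type and function names are hypothetical, and the boolean inputs stand in for judgments the team would make by hand rather than anything measured automatically.

```typescript
// Hypothetical names: ExtractionTrigger, ModuleAssessment, triggeredCriteria.

type ExtractionTrigger =
  | "language-boundary"
  | "scaling-hotspot"
  | "sla-divergence"
  | "team-ownership"
  | "regulatory-isolation";

interface ModuleAssessment {
  moduleName: string;
  differentLanguage: boolean;
  loadDivergesFromMonolith: boolean;
  needsStricterSla: boolean;
  hasDedicatedTeam: boolean;
  needsRegulatoryIsolation: boolean;
}

// Returns every criterion that applies. Extraction is warranted when the
// result is non-empty ("when any apply").
function triggeredCriteria(a: ModuleAssessment): ExtractionTrigger[] {
  const triggers: ExtractionTrigger[] = [];
  if (a.differentLanguage) triggers.push("language-boundary");
  if (a.loadDivergesFromMonolith) triggers.push("scaling-hotspot");
  if (a.needsStricterSla) triggers.push("sla-divergence");
  if (a.hasDedicatedTeam) triggers.push("team-ownership");
  if (a.needsRegulatoryIsolation) triggers.push("regulatory-isolation");
  return triggers;
}

// ml/serving is the one module the ADR already treats as extracted.
const mlServing: ModuleAssessment = {
  moduleName: "ml/serving",
  differentLanguage: true, // Python next to a Node monolith
  loadDivergesFromMonolith: false,
  needsStricterSla: false,
  hasDedicatedTeam: false,
  needsRegulatoryIsolation: false,
};
const mlServingTriggers = triggeredCriteria(mlServing);
```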
events/ is special: even inside the monolith we use an event-emitter abstraction whose production implementation is NATS JetStream. The async boundary matters for ML correctness; the process boundary doesn't.
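
A minimal sketch of what such an event-emitter abstraction might look like. The `EventBus` interface, the `InMemoryEventBus` class, and the `tip.shown` subject are all hypothetical; the payload is raw bytes on the assumption that events are protobuf-encoded (ADR-0005). A NATS JetStream implementation is not shown; it would implement the same interface.

```typescript
// Hypothetical names: EventBus, InMemoryEventBus, EventHandler.

type EventHandler = (subject: string, payload: Uint8Array) => Promise<void>;

interface EventBus {
  publish(subject: string, payload: Uint8Array): Promise<void>;
  subscribe(subject: string, handler: EventHandler): void;
}

// In-memory stand-in for tests and local development.
class InMemoryEventBus implements EventBus {
  private readonly handlers = new Map<string, EventHandler[]>();
  private pending: Promise<void>[] = [];

  subscribe(subject: string, handler: EventHandler): void {
    const list = this.handlers.get(subject) ?? [];
    list.push(handler);
    this.handlers.set(subject, list);
  }

  async publish(subject: string, payload: Uint8Array): Promise<void> {
    for (const handler of this.handlers.get(subject) ?? []) {
      // Defer delivery to a microtask so the publisher never runs subscriber
      // code synchronously; each subscriber gets its own copy of the bytes.
      const delivery = Promise.resolve().then(() => handler(subject, payload.slice()));
      // A failing subscriber must not fail the publisher. This sketch drops
      // the error; a real implementation would log or redeliver.
      this.pending.push(delivery.catch((): void => {}));
    }
  }

  // Test helper: wait for every delivery scheduled so far.
  async drain(): Promise<void> {
    const inFlight = this.pending;
    this.pending = [];
    await Promise.all(inFlight);
  }
}
```

Keeping delivery asynchronous even in-process means publishers cannot come to depend on subscribers having already run, which is the property that has to survive a later move to a real broker.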

Consequences

Faster Phase 0: one CI pipeline, one deploy, one observability config.
Cheap extraction: contracts are already HTTP/event-shaped.
Discipline required: no cross-module DB access, no reaching into another module's internals, even though it's physically possible. Enforced by lint/import rules.
Deploy story: docker-compose with two application containers (Node monolith + Python serving) until extraction begins. Compose profiles let devs bring up subsets.
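
The import-boundary discipline listed above could be checked by a rule along these lines. This is a sketch under assumptions: it takes repo-relative, already-resolved file paths, and it assumes each module's public interface lives at `services/<module>/index.ts`, which the ADR does not specify. The function names are hypothetical, and a real setup would more likely configure an existing lint plugin than hand-roll this.

```typescript
// Assumed layout: services/<module>/index.ts is the module's public interface.
const PUBLIC_ENTRY = /^services\/[^/]+\/index(\.ts)?$/;

// Returns the owning module name for a path under services/, else null.
function moduleOf(path: string): string | null {
  const match = /^services\/([^/]+)\//.exec(path);
  return match?.[1] ?? null;
}

// True when `fromFile` is allowed to import `importedFile`.
function isAllowedImport(fromFile: string, importedFile: string): boolean {
  const fromModule = moduleOf(fromFile);
  const toModule = moduleOf(importedFile);
  // Imports that target code outside services/, or that stay within one
  // module, are not this rule's concern.
  if (toModule === null || fromModule === toModule) return true;
  // Cross-module: only the public entry point may be imported.
  return PUBLIC_ENTRY.test(importedFile);
}
```

For example, `isAllowedImport("services/recommender/src/rank.ts", "services/profile/src/db.ts")` is rejected because it reaches into another module's internals, while importing `services/profile/index.ts` from the same file is accepted.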

Non-consequences

We are not monolith-forever. We fully expect integrations/ and recommender/ to extract once Phase 2+ traffic patterns justify it.
Frontend / mobile unaffected.
