alvis/oO

Files

alvis 5b52c6bf40 test: cover NATS bridge + Todoist scheduler; ADR-0010

- bus.test.ts: 4 cases for the new onPublish hook contract
- nats.test.ts: stream creation idempotency + JSON publish bridge
- scheduler.test.ts: startup delay, fan-out, per-user failure isolation
- ADR-0010 documents the bridge-don't-replace decision and the
  Todoist scheduler isolation, plus open follow-ups (#98 ml/serving
  consumer, #54 protobuf migration, graceful shutdown, metrics)
- README/overview/services README reflect the bridged event substrate
- CLAUDE.md gains a "don't nats.publish() directly" rule
- .env.example documents NATS_URL + TODOIST_SYNC_INTERVAL_MS

Verified in deployment 2026-04-18: api -> nats bridge connects on
boot, signals + feedback streams created, scheduler tick logs
"todoist sync: 1 ok, 0 failed (1 users)" within 10s. Closes #21, #22.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-18 07:55:25 +00:00

5.4 KiB

Raw Blame History

Architecture overview

Guiding constraints

The recommendation decision is the hot path. Every architectural choice should shorten the distance between a new signal and a better tip.
Modularity lives in code boundaries. Deploy topology follows pressure, not anticipation (ADR-0003).
Python for ML, TypeScript for applications. Shared contracts regenerated from a single source of truth: OpenAPI for HTTP, protobuf for events (ADR-0005).
Privacy is a Phase-0 feature, not a Phase-5 compliance project (see privacy.md).

Modules

Module	Language	Responsibility	Owns data	Phase-0 process
`gateway`	TS	BFF for web/mobile; auth-check; fan-out	—	Node monolith
`auth`	TS	OAuth (Google; Apple in M1), sessions, JWT	identities, sessions	Node monolith
`profile`	TS	user profile, preferences, consents	profiles	Node monolith
`integrations`	TS	third-party connectors, token vault, signal fetch	credentials, cursors	Node monolith
`events`	TS	event-bus abstraction + durable log	signal store	Node monolith (in-proc emitter, bridges to NATS JetStream when `NATS_URL` set)
`recommender`	TS	orchestration: candidates → policy → tip; feedback sink	tip history	Node monolith
`notifier`	TS	push/email delivery, quiet hours, dedupe	delivery log	Node monolith (web push in M1)
`ml/serving`	Python	online scoring for policies/models	— (stateless)	separate process
`ml/pipelines`	Python	batch feature + training pipelines	feature store, models	separate (from M4)

Extraction from the monolith is triggered by language boundary, scaling hotspot, SLA divergence, team ownership, or regulatory isolation (ADR-0003). ml/serving is pre-extracted on language grounds.

Data boundaries

Each service owns its schema; no cross-service DB access. When recommender needs profile data, it calls profile (read model), not its DB.

Event flow

connector (integrations) ──emit──▶ events ──▶ feature pipelines (ml)
                                     │
                                     └──▶ recommender (context assembly)

User reactions (done / snooze / dismiss) are events too. They close the loop as rewards for bandit/RL policies.

Why these choices

Modular monolith + Python ML in Phase 0 to ship the walking skeleton fast without foreclosing decomposition (ADR-0003).
NATS JetStream over Kafka for Phase 1: lighter, single-binary, fits the "one VM" deployment. Swap to Kafka in Phase 4 if fan-out justifies it.
Postgres for OLTP; per-module schemas in dev; separate databases once modules extract.
FastAPI + Pydantic for ML serving — fast, typed, swappable runtime (ONNX, Triton) behind it.
Protobuf for event schemas with a schema registry (ADR-0005) — train/serve parity depends on this.
OpenAPI for HTTP; TS client auto-generated; Python pydantic hand-written while consumers are few.
Feast for feature store when we get there; homegrown adapter until then (Phase 1 seam).
MLflow for model registry and experiment tracking; deployed at o.alogins.net/mlflow.
Airflow for batch pipelines; deployed at o.alogins.net/airflow.
Auth.js embedded behind an OIDC-shaped boundary (ADR-0004). Swap to a standalone OIDC provider when mobile ships.
k3s as the first step beyond docker-compose — no "compose → full k8s" cliff.

AI stack

All LLM inference routes through LiteLLM (llm.alogins.net) backed by Ollama (local, localhost:11434). This means:

Model aliases (tip-generator, embedder, judge) decouple code from model names.
Swapping qwen2.5 → llama3.2 = one-line config change in LiteLLM, zero code change in oO.
Cloud fallback (Anthropic) is opt-in and gated behind ANTHROPIC_API_KEY — used only in offline simulation.

OpenWebUI (ai.alogins.net) is the human-facing interface for prompt iteration and model testing during development.

Decision flow for a new tip (Phase 2 target)

client ─► gateway ─► recommender (TS)
                          │
                          ▼
                     ml/serving (Python)
                          │
                          ├─► context:    ml/features/context.py
                          │               (tasks + reactions + time patterns → prompt)
                          │
                          ├─► generate:   LiteLLM → Ollama
                          │               → N TipCandidates {content, kind, model, prompt_version}
                          │
                          ├─► score:      bandit policy scores each candidate
                          │
                          ├─► shadows:    shadow policies log picks without serving
                          │
                          └─► persist:    tip_scores {candidate, policy, features, latency}
                          ◄─  best TipCandidate

Phase 1 (shipped M1): candidates come from Todoist task list, no LLM. The bandit scores tasks directly.

Phase 2 (shipped M2): LLM candidates are generated in parallel with Todoist fetch. Both pools are merged, scored by the bandit, and the winner served. tip_scores tracks prompt_version, llm_model, and tip_kind for every row.

Feedback: POST /feedback → events.emit(reaction) → online bandit update + prompt_version tracked for A/B analysis.

5.4 KiB Raw Blame History