New DAG (`ml/pipelines/bench_dag.py`) with three linked tasks: 1. collect.py — generates candidates, logs to MLflow 2. export_for_judge — exports pending runs for Claude Code scoring 3. compare — generates leaderboard by (model, prompt) cell Config via dag_run.conf supports all collect.py options (models, prompts, n_tips, n_scenarios, temperature, experiment name, max_model_b). New admin API endpoints (`services/api/src/routes/bench.ts`): - GET /api/bench/experiments — list tip-bench-* experiments - POST /api/bench/run — trigger DAG with custom config - GET /api/bench/runs/:experiment — list runs in experiment - GET /api/bench/leaderboard/:experiment — leaderboard by (model, prompt) All endpoints require admin auth. Human judge (Claude Code) scores are applied manually post-export; future enhancement: add webhook to DAG. Admin UI can now trigger and monitor benchmarks from a dashboard panel. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
services/
Backend modules. Each owns a contract and ships its own README.md. In Phase 0 these are internal packages inside a single Node process (ADR-0003); they extract to their own processes as pressure justifies.
| Dir | Role | Phase-0 shape | Extracts when |
|---|---|---|---|
gateway/ |
BFF for clients; auth check; fan-out | in-proc router | never (stays as the edge) |
auth/ |
Google OAuth (Apple in M1), sessions, JWT | Auth.js behind OIDC shape | mobile native ships (M3) |
profile/ |
user profile, preferences, consents | in-proc module | team ownership diverges |
integrations/ |
connectors + encrypted token vault | in-proc module | credential blast-radius isolation |
recommender/ |
POST /recommend — policy-driven tip selection |
in-proc; calls ml/serving from M1 |
scaling hotspot |
events/ |
event bus + signal log | in-proc emitter; bridges to NATS JetStream when NATS_URL set (ADR-0010) |
always a library + broker, not a service |
notifier/ |
push/email delivery + quiet hours | in-proc; web push in M1 | SLA divergence or mobile push scale |
Contracts that cross module lines (HTTP or events) come from packages/shared-types/. In-module imports across modules are forbidden by import lint.