# services/api

Express BFF that serves all client-facing routes, manages sessions, runs background signal sync, and proxies admin calls to `ml/serving`.

## Contract

```
GET    /health                              { ok: true }

POST   /api/auth/login                      → redirect to Google OAuth
GET    /api/auth/callback                   OAuth return URL
POST   /api/auth/logout
GET    /api/auth/session                    → { user? }
POST   /api/auth/token                      { token } → set sid cookie (ADMIN_TOKEN auth)

GET    /api/integrations                    list connected integrations
POST   /api/integrations/todoist/connect    start Todoist OAuth
GET    /api/integrations/todoist/callback
DELETE /api/integrations/:provider          disconnect

POST   /api/recommend                       → { tip }
POST   /api/tip/:id/feedback                { action } → { ok }

GET    /api/user/profile
DELETE /api/user                            account deletion

POST   /api/push/subscribe
DELETE /api/push/subscribe

GET    /api/admin/stats                     DAU/WAU, feedback breakdown
GET    /api/admin/users                     user list with pagination
GET    /api/user/:id                        user detail, consents, integrations
GET    /api/admin/events                    recent event stream (ring buffer or NATS JetStream)
GET    /api/admin/events/history            historical event query (time range, filters)
GET    /api/admin/sim/runs                  offline sim run list
POST   /api/admin/sim/run                   launch offline sim with policy/judge params
GET    /api/admin/sim/runs/:id/output       tail sim stdout
GET    /api/admin/features/:userId          per-user profile features + freshness
GET    /api/admin/features/:userId/context  context features for last score call
POST   /api/admin/policies                  list shadow policies + active policy
POST   /api/admin/policies/:name/toggle     enable/disable shadow policy
POST   /api/admin/users/:id/actions         revoke-integration, reset-bandit, rebuild-profile
GET    /api/admin/health                    system health: api, ml/serving, db, bus, mlflow
GET    /api/admin/docs                      admin documentation index
GET    /api/ml/*                            admin-only proxy to ml/serving
```

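
As a rough illustration of how a client might call two of the routes above, the sketch below builds request descriptors for `POST /api/auth/token` and `POST /api/tip/:id/feedback`. The helper names (`jsonPost`, `tokenLogin`, `tipFeedback`), the `ApiRequest` shape, and the `"dismiss"` action value are assumptions made for the example; only the paths and the `{ token }` / `{ action }` body keys come from the contract.

```typescript
// Hypothetical client helpers. Nothing here is taken from the service's code.
interface ApiRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
    credentials: "include"; // send the sid cookie on browser requests
  };
}

// Default API_BASE_URL from the Config table.
const API_BASE_URL = "http://localhost:3001";

function jsonPost(path: string, body: unknown, baseUrl: string = API_BASE_URL): ApiRequest {
  return {
    url: new URL(path, baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
      credentials: "include",
    },
  };
}

// POST /api/auth/token with { token }; the server answers by setting the sid cookie.
function tokenLogin(token: string): ApiRequest {
  return jsonPost("/api/auth/token", { token });
}

// POST /api/tip/:id/feedback with { action }.
function tipFeedback(tipId: string, action: string): ApiRequest {
  return jsonPost(`/api/tip/${encodeURIComponent(tipId)}/feedback`, { action });
}

// Usage (the action value is made up):
//   const { url, init } = tipFeedback("tip-123", "dismiss");
//   const res = await fetch(url, init);
```
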
## Middleware stack (request order)

1. `cors` — origin limited to `WEB_BASE_URL`
2. `tracingMiddleware` — reads or generates W3C `traceparent`; sets `req.traceId` + `req.traceparent`
3. `pinoHttp` — structured JSON request/response logs with `traceId` field; `/health` suppressed
4. `express.json()` / `cookieParser`
5. `sessionMiddleware` — validates `sid` cookie, attaches `req.userId`

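
Step 2 of the stack can be sketched roughly as follows. This is not the service's actual `tracingMiddleware`, only a minimal illustration of "read or generate" for a W3C `traceparent` header. The `resolveTrace` helper and the structural `TracedRequest` type are assumptions, and the sketch accepts only version `00` headers and marks generated ones as sampled (`01`), which is a simplification.

```typescript
import { randomBytes } from "node:crypto";

// W3C Trace Context, version 00: "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>".
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
const ZERO_TRACE_ID = "0".repeat(32);
const ZERO_PARENT_ID = "0".repeat(16);

interface TraceContext {
  traceId: string;
  traceparent: string;
}

// Reuse a valid incoming header, otherwise start a fresh trace.
function resolveTrace(header: string | undefined): TraceContext {
  const match = header ? TRACEPARENT_RE.exec(header.trim()) : null;
  // All-zero trace or parent IDs are invalid per the spec.
  if (match && match[1] !== ZERO_TRACE_ID && match[2] !== ZERO_PARENT_ID) {
    return { traceId: match[1], traceparent: match[0] };
  }
  const traceId = randomBytes(16).toString("hex");
  const parentId = randomBytes(8).toString("hex");
  return { traceId, traceparent: `00-${traceId}-${parentId}-01` };
}

// Minimal structural type so the sketch does not depend on Express typings.
interface TracedRequest {
  headers: Record<string, string | string[] | undefined>;
  traceId?: string;
  traceparent?: string;
}

function tracingMiddleware(req: TracedRequest, _res: unknown, next: () => void): void {
  const raw = req.headers["traceparent"];
  const ctx = resolveTrace(Array.isArray(raw) ? raw[0] : raw);
  req.traceId = ctx.traceId;
  req.traceparent = ctx.traceparent;
  next();
}
```
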
## Observability

Logs are structured JSON via **pino**. Every line includes `traceId` (extracted from the incoming W3C `traceparent` header, or generated fresh). The same `traceparent` is forwarded on all outbound HTTP calls to `ml/serving` so traces correlate end-to-end.

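
Forwarding the header on outbound calls could look something like the sketch below. `withTraceparent` and `callMlServing` are hypothetical names, and the `ml/serving` path is left as a parameter because the actual upstream routes are not listed here.

```typescript
// Copy the caller's headers and attach the request's traceparent unchanged.
function withTraceparent(
  traceparent: string,
  headers: Record<string, string> = {},
): Record<string, string> {
  return { ...headers, traceparent };
}

// Default taken from the Config table.
const ML_SERVING_URL = process.env.ML_SERVING_URL || "http://localhost:8000";

// Hypothetical outbound call; `path` is whatever ml/serving route the caller needs.
async function callMlServing(
  path: string,
  traceparent: string,
  body: unknown,
): Promise<Response> {
  return fetch(new URL(path, ML_SERVING_URL), {
    method: "POST",
    headers: withTraceparent(traceparent, { "content-type": "application/json" }),
    body: JSON.stringify(body),
  });
}
```
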
Sentry error capture is active when `SENTRY_DSN` is set.

## Background tasks

- **Todoist sync scheduler** — runs every `TODOIST_SYNC_INTERVAL_MS` (default 15 min); starts 10 s after boot to avoid startup surge.
- **Retention purge** — deletes `tipScores` and `tipFeedback` rows older than 30 days; runs on boot and daily.
- **Profile TTL invalidation** — listens to `signals.task.synced` and `signals.tip.feedback` on the in-process Bus; invalidates cached user-level profile features so the next `/recommend` gets fresh values.

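
The profile invalidation task can be pictured with a small sketch. Here Node's `EventEmitter` stands in for the in-process Bus and a `Map` stands in for the profile feature cache; both stand-ins, and the `{ userId }` payload shape, are assumptions. Only the two subject names come from the list above.

```typescript
import { EventEmitter } from "node:events";

// Stand-ins for the in-process Bus and the per-user profile feature cache.
const bus = new EventEmitter();
const profileCache = new Map<string, { features: Record<string, number>; cachedAt: number }>();

interface SignalEvent {
  userId: string; // assumed payload field
}

const INVALIDATING_SUBJECTS = ["signals.task.synced", "signals.tip.feedback"];

// Drop the cached entry so the next recommendation recomputes profile features.
for (const subject of INVALIDATING_SUBJECTS) {
  bus.on(subject, (event: SignalEvent) => {
    profileCache.delete(event.userId);
  });
}
```

Deleting the entry rather than refreshing it inline keeps the event handler cheap; the recompute cost is paid lazily by the next request that needs the features.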
## Config

| Env var | Default | Description |
|---------|---------|-------------|
| `PORT` | `3001` | Listen port |
| `NODE_ENV` | `development` | Environment label |
| `DATABASE_PATH` | `./data/oo.db` | SQLite file |
| `SESSION_SECRET` | required | Cookie signing secret |
| `GOOGLE_CLIENT_ID/SECRET` | required | OAuth |
| `TODOIST_CLIENT_ID/SECRET` | required | OAuth |
| `API_BASE_URL` | `http://localhost:3001` | Self-referential redirect URI |
| `WEB_BASE_URL` | `http://localhost:3000` | CORS + post-login redirect |
| `ML_SERVING_URL` | `http://localhost:8000` | ml/serving base URL |
| `NATS_URL` | (empty) | NATS broker; empty = in-process bus only |
| `TODOIST_SYNC_INTERVAL_MS` | `900000` | Background sync cadence |
| `TIP_PROMPT_VERSION` | (empty) | Prompt variant(s) for `/generate` |
| `LOG_LEVEL` | `info` | pino log level |
| `SENTRY_DSN` | (empty) | Sentry DSN; empty = Sentry disabled |
| `VAPID_*` | | Web push keys |
| `ADMIN_TOKEN` | (empty) | Static token for service/Playwright admin auth; empty = disabled |

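
One way to apply the defaults in the table is a small loader along these lines. It covers only a subset of the variables, and `ApiConfig`, `loadConfig`, and `required` are hypothetical names rather than the service's actual config module.

```typescript
interface ApiConfig {
  port: number;
  nodeEnv: string;
  databasePath: string;
  sessionSecret: string;
  mlServingUrl: string;
  natsUrl: string | null; // null = in-process bus only
  todoistSyncIntervalMs: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Fail fast at boot when a variable marked "required" is missing or empty.
function required(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required env var: ${name}`);
  return value;
}

// `||` (not `??`) so that an empty string also falls back to the default.
function loadConfig(env: Env): ApiConfig {
  return {
    port: Number(env.PORT || "3001"),
    nodeEnv: env.NODE_ENV || "development",
    databasePath: env.DATABASE_PATH || "./data/oo.db",
    sessionSecret: required(env, "SESSION_SECRET"),
    mlServingUrl: env.ML_SERVING_URL || "http://localhost:8000",
    natsUrl: env.NATS_URL || null,
    todoistSyncIntervalMs: Number(env.TODOIST_SYNC_INTERVAL_MS || "900000"),
    logLevel: env.LOG_LEVEL || "info",
  };
}

// Usage: const config = loadConfig(process.env);
```
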
## Health story

`GET /health` returns `{ ok: true }`. No dependency checks — upstream deps (`ml/serving`, NATS) have their own health endpoints checked separately.

## Extraction criteria

Extract to its own host when:

- Auth session management needs a dedicated Redis/PG session store, **or**
- Background sync load (Todoist, future connectors) displaces API serving on the shared host, **or**
- Team boundary emerges between auth/BFF and recommender orchestration.