alvis f8d66aa01f chore: remove Airflow completely from the stack
Drop all four Airflow containers (db, init, webserver, scheduler) from the
mlops compose profile, leaving MLflow as the sole mlops service. Remove
AIRFLOW_* env vars, config fields, health-check entries, DAG trigger code
in admin/bench routes, the airflow_dag_run_id schema column, Airflow nav
links and DAG-run links in the admin UI, the two Airflow DAG files
(bench_dag.py, sim_dag.py), and all related docs/ADR references.
Simulations now run exclusively via the subprocess path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:38:46 +00:00


# services/api
Express BFF that serves all client-facing routes, manages sessions, runs background signal sync, and proxies admin calls to `ml/serving`.
## Contract
```
GET /health { ok: true }
POST /api/auth/login → redirect to Google OAuth
GET /api/auth/callback OAuth return URL
POST /api/auth/logout
GET /api/auth/session → { user? }
POST /api/auth/token { token } → set sid cookie (ADMIN_TOKEN auth)
GET /api/integrations list connected integrations
POST /api/integrations/todoist/connect start Todoist OAuth
GET /api/integrations/todoist/callback
DELETE /api/integrations/:provider disconnect
POST /api/recommend → { tip }
POST /api/tip/:id/feedback { action } → { ok }
GET /api/user/profile
DELETE /api/user account deletion
POST /api/push/subscribe
DELETE /api/push/subscribe
GET /api/admin/stats DAU/WAU, feedback breakdown
GET /api/admin/users user list with pagination
GET /api/admin/users/:id user detail, consents, integrations
GET /api/admin/events recent event stream (ring buffer or NATS JetStream)
GET /api/admin/events/history historical event query (time range, filters)
GET /api/admin/sim/runs offline sim run list
POST /api/admin/sim/run launch offline sim with policy/judge params
GET /api/admin/sim/runs/:id/output tail sim stdout
GET /api/admin/features/:userId per-user profile features + freshness
GET /api/admin/features/:userId/context context features for last score call
GET /api/admin/policies list shadow policies + active policy
POST /api/admin/policies/:name/toggle enable/disable shadow policy
POST /api/admin/users/:id/actions revoke-integration, reset-bandit, rebuild-profile
GET /api/admin/health system health: api, ml/serving, db, bus, mlflow
GET /api/admin/docs admin documentation index
GET /api/ml/* admin-only proxy to ml/serving
```
## Middleware stack (request order)
1. `cors` — origin limited to `WEB_BASE_URL`
2. `tracingMiddleware` — reads or generates W3C `traceparent`; sets `req.traceId` + `req.traceparent`
3. `pinoHttp` — structured JSON request/response logs with `traceId` field; `/health` suppressed
4. `express.json()` / `cookieParser`
5. `sessionMiddleware` — validates `sid` cookie, attaches `req.userId`
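
The "reads or generates" step of `tracingMiddleware` could look roughly like the sketch below. This is not the service's actual code: `resolveTraceContext` and `TraceContext` are hypothetical names, and marking freshly generated traces as sampled (`01`) is an assumption. Only the header layout (`00-<trace-id>-<parent-id>-<flags>`) comes from the W3C Trace Context format.

```typescript
import { randomBytes } from "node:crypto";

// W3C Trace Context, version 00: 32-hex trace-id, 16-hex parent-id, 2-hex flags.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

interface TraceContext {
  traceId: string;
  traceparent: string;
}

// Hypothetical helper: reuse a valid incoming header, otherwise mint a new one.
function resolveTraceContext(header: string | undefined): TraceContext {
  const match = header ? TRACEPARENT_RE.exec(header.trim()) : null;
  // All-zero trace-id or parent-id values are invalid per the spec.
  if (match && !/^0+$/.test(match[1]) && !/^0+$/.test(match[2])) {
    return { traceId: match[1], traceparent: match[0] };
  }
  const traceId = randomBytes(16).toString("hex");
  const parentId = randomBytes(8).toString("hex");
  // Assumption: newly generated traces are flagged as sampled (01).
  return { traceId, traceparent: `00-${traceId}-${parentId}-01` };
}
```

An Express middleware would call this with `req.headers.traceparent` and copy the result onto `req.traceId` and `req.traceparent` before calling `next()`.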
## Observability
Logs are structured JSON via **pino**. Every line includes `traceId` (extracted from the incoming W3C `traceparent` header, or generated fresh). The same `traceparent` is forwarded on all outbound HTTP calls to `ml/serving` so traces correlate end-to-end.
Sentry error capture is active when `SENTRY_DSN` is set.
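
Forwarding the header on outbound calls can be as small as a header-merging helper. The sketch below assumes a hypothetical `withTraceHeaders` function; the real service may attach the header differently, for example inside a shared HTTP client.

```typescript
// Hypothetical helper: return outbound headers that carry the request's traceparent.
function withTraceHeaders(
  traceparent: string,
  headers: Record<string, string> = {},
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    // Header names are case-insensitive; drop any caller-supplied variant.
    if (name.toLowerCase() !== "traceparent") {
      merged[name] = headers[name];
    }
  }
  merged.traceparent = traceparent;
  return merged;
}
```

A call site would pass the result as the `headers` option of whichever HTTP client is used for `ml/serving`, with `req.traceparent` as the first argument.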
## Background tasks
- **Todoist sync scheduler** — runs every `TODOIST_SYNC_INTERVAL_MS` (default 15 min); starts 10 s after boot to avoid startup surge.
- **Retention purge** — deletes `tipScores` and `tipFeedback` rows older than 30 days; runs on boot and daily.
- **Profile TTL invalidation** — listens to `signals.task.synced` and `signals.tip.feedback` on the in-process Bus; invalidates cached user-level profile features so the next `/recommend` gets fresh values.
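
The profile invalidation listener can be pictured as follows. This is a sketch under assumptions: Node's `EventEmitter` stands in for the in-process Bus, the event payload is assumed to carry a `userId` field, and `ProfileCache` and `wireProfileInvalidation` are hypothetical names. Only the two subject names come from the list above.

```typescript
import { EventEmitter } from "node:events";

// Assumed payload shape; the real events may carry more fields.
interface SignalEvent {
  userId: string;
}

type ProfileFeatures = Record<string, number>;

// Hypothetical in-memory cache of user-level profile features.
class ProfileCache {
  private readonly entries = new Map<string, ProfileFeatures>();

  get(userId: string): ProfileFeatures | undefined {
    return this.entries.get(userId);
  }

  set(userId: string, features: ProfileFeatures): void {
    this.entries.set(userId, features);
  }

  invalidate(userId: string): void {
    this.entries.delete(userId);
  }
}

// Drop a user's cached features whenever one of the two signals arrives for them.
function wireProfileInvalidation(bus: EventEmitter, cache: ProfileCache): void {
  const onSignal = (event: SignalEvent): void => cache.invalidate(event.userId);
  bus.on("signals.task.synced", onSignal);
  bus.on("signals.tip.feedback", onSignal);
}
```

After invalidation, the next lookup for that user misses the cache, so the caller recomputes the features instead of serving stale ones.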
## Config
| Env var | Default | Description |
|---------|---------|-------------|
| `PORT` | `3001` | Listen port |
| `NODE_ENV` | `development` | Environment label |
| `DATABASE_PATH` | `./data/oo.db` | SQLite file |
| `SESSION_SECRET` | required | Cookie signing secret |
| `GOOGLE_CLIENT_ID/SECRET` | required | OAuth |
| `TODOIST_CLIENT_ID/SECRET` | required | OAuth |
| `API_BASE_URL` | `http://localhost:3001` | This service's own base URL; used to build OAuth redirect URIs |
| `WEB_BASE_URL` | `http://localhost:3000` | CORS + post-login redirect |
| `ML_SERVING_URL` | `http://localhost:8000` | ml/serving base URL |
| `NATS_URL` | `` | NATS broker; empty = in-process bus only |
| `TODOIST_SYNC_INTERVAL_MS` | `900000` | Background sync cadence |
| `TIP_PROMPT_VERSION` | `` | Prompt variant(s) for `/generate` |
| `LOG_LEVEL` | `info` | pino log level |
| `SENTRY_DSN` | `` | Sentry DSN; empty = Sentry disabled |
| `VAPID_*` | | Web push keys |
| `ADMIN_TOKEN` | `` | Static token for service/Playwright admin auth; empty = disabled |
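
One way to read these variables at boot is a small loader that applies the defaults and fails fast on missing required values. The sketch below covers only a subset of the table; `loadConfig`, `ApiConfig`, and the field names are hypothetical, and treating an empty `NATS_URL` as `null` is just one way to model "in-process bus only".

```typescript
type Env = Record<string, string | undefined>;

interface ApiConfig {
  port: number;
  databasePath: string;
  sessionSecret: string;
  webBaseUrl: string;
  mlServingUrl: string;
  natsUrl: string | null; // null = in-process bus only
  todoistSyncIntervalMs: number;
  logLevel: string;
}

// Throw early when a required variable is unset or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}

// Parse a positive integer, falling back to the documented default.
function intVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") {
    return fallback;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`Invalid integer for ${name}: ${raw}`);
  }
  return value;
}

function loadConfig(env: Env): ApiConfig {
  return {
    port: intVar(env, "PORT", 3001),
    databasePath: env.DATABASE_PATH || "./data/oo.db",
    sessionSecret: requireVar(env, "SESSION_SECRET"),
    webBaseUrl: env.WEB_BASE_URL || "http://localhost:3000",
    mlServingUrl: env.ML_SERVING_URL || "http://localhost:8000",
    natsUrl: env.NATS_URL || null,
    todoistSyncIntervalMs: intVar(env, "TODOIST_SYNC_INTERVAL_MS", 900000),
    logLevel: env.LOG_LEVEL || "info",
  };
}
```

Calling `loadConfig(process.env)` once at startup keeps the defaults in one place and surfaces a missing `SESSION_SECRET` before the server starts listening.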
## Health story
`GET /health` returns `{ ok: true }` and performs no dependency checks. Upstream dependencies (`ml/serving`, NATS) expose their own health endpoints, which are checked separately; `GET /api/admin/health` reports system-wide health.
## Extraction criteria
Extract to its own host when:
- Auth session management needs a dedicated Redis/PG session store, **or**
- Background sync load (Todoist, future connectors) displaces API serving on the shared host, **or**
- Team boundary emerges between auth/BFF and recommender orchestration.