oO/docs/adr/0006-admin-console-framework.md
alvis f8d66aa01f chore: remove Airflow completely from the stack
Drop all four Airflow containers (db, init, webserver, scheduler) from the
mlops compose profile, leaving MLflow as the sole mlops service. Remove
AIRFLOW_* env vars, config fields, health-check entries, DAG trigger code
in admin/bench routes, the airflow_dag_run_id schema column, Airflow nav
links and DAG-run links in the admin UI, the two Airflow DAG files
(bench_dag.py, sim_dag.py), and all related docs/ADR references.
Simulations now run exclusively via the subprocess path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:38:46 +00:00


# ADR-0006: Admin console framework — Next.js 15 + Tremor + shadcn/ui + embed specialist tools
## Status
Accepted — 2026-04-15
## Context
M1 ships a bandit-driven recommender, an event bus, and a live feedback loop. Without a cockpit to observe these systems, every model change ships blind. An admin console is needed to:
1. **Observe** — DAU/WAU, tip outcomes, reaction rates, LinUCB arm stats, feature distributions
2. **Inspect** — per-user identity, consents, integrations, reward history
3. **Act** — revoke tokens, replay signals, reset a per-user bandit, promote a policy
4. **Audit** — every operator action is logged
The team is two people. The stack is TypeScript/React/Tailwind. Any framework that forks the stack creates a context-switch tax and a second deployment surface.
## Decision
### App shell — `apps/admin`, Next.js 15, App Router
Same stack as `apps/web`. Reuses `packages/shared-types`, the Auth.js session cookie, and the API rewrite convention. Deployed at `admin.o.alogins.net` behind Caddy, port 3080 in dev.
### UI libraries
| Layer | Library | Reason |
|-------|---------|--------|
| Charts / KPI | **Tremor** | Analytics-first React + Tailwind components (KPI cards, time-series, bar lists). Designed for dashboards, not bolted on. |
| CRUD primitives | **shadcn/ui** | Copy-paste Radix components; forms, dialogs, command palette. No version lock-in — code lives in-repo. |
| Heavy grids | **TanStack Table v8** | Sortable / paginated / virtualized tables for events, users, tips. |
| Extra charts | **Recharts** | Fallback where Tremor falls short (histograms, distributions). |
### Link out, don't embed
Specialized MLOps tooling runs as **separate external services** with their own auth, linked from the admin shell — not embedded or reimplemented:
- **MLflow** → `https://o.alogins.net/mlflow` — experiment tracking, model registry, artifact browser; own basic-auth for now; see M3 for SSO consolidation
- **Grafana panels** → `/admin/infra` (iframed panels) — infra metrics
- **Marimo notebooks** → launch-out link from admin
MLflow and Marimo open in a new tab from the admin shell; Grafana is the one exception, with its panels iframed into `/admin/infra`.
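The link-out rule can be captured as a small registry. The sketch below is illustrative only: the `ToolLink` type, the `toolLinks` and `anchorProps` helpers, and the `MARIMO_URL` variable are hypothetical names. Only the MLflow URL, the `/admin/infra` route, and `NEXT_PUBLIC_MLFLOW_URL` come from this ADR.

```typescript
// Hypothetical registry of specialist tools reachable from the admin shell.
// Only the MLflow URL, `/admin/infra`, and NEXT_PUBLIC_MLFLOW_URL come from the ADR.
type ToolEnv = Record<string, string | undefined>;

interface ToolLink {
  name: string;
  href: string;
  // "new-tab": launch-out link; "iframe": panels rendered inside an admin page.
  mode: "new-tab" | "iframe";
}

const DEFAULT_MLFLOW_URL = "https://o.alogins.net/mlflow";

function toolLinks(env: ToolEnv): ToolLink[] {
  return [
    // An unset or empty variable falls back to the production URL.
    { name: "MLflow", href: env["NEXT_PUBLIC_MLFLOW_URL"] || DEFAULT_MLFLOW_URL, mode: "new-tab" },
    { name: "Grafana panels", href: "/admin/infra", mode: "iframe" },
    // Assumption: the ADR gives no Marimo URL, so it is read from a made-up variable.
    { name: "Marimo notebooks", href: env["MARIMO_URL"] || "#", mode: "new-tab" },
  ];
}

// Attributes for an <a> element. Launch-out links open in a new tab and use
// rel="noopener noreferrer" so the opened page gets no handle on the admin window.
function anchorProps(link: ToolLink): { href: string; target?: string; rel?: string } {
  return link.mode === "new-tab"
    ? { href: link.href, target: "_blank", rel: "noopener noreferrer" }
    : { href: link.href };
}
```

In a dev build, passing `{ NEXT_PUBLIC_MLFLOW_URL: "http://localhost:5000" }` would point the MLflow entry at a local instance while leaving the other entries unchanged.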
### AuthZ
Roles are stored in the `profile.role` column on the `users` table (values: `'user'` | `'admin'`). The first admin is seeded at startup from the `ADMIN_SEED_EMAIL` env var. An admin-only gate in the Next.js middleware checks the session and the role returned by `GET /api/user/me`. Every write action through the admin API is appended to an `admin_actions` audit log.
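The gate logic can be sketched as a pure decision function, kept separate from the middleware so it is easy to test. The `GateSession` shape, the `/login` redirect target, and the 403 outcome are assumptions; the ADR specifies only that the session and the role from `GET /api/user/me` are checked.

```typescript
// Sketch of the admin-only gate decision. Assumptions: the GateSession shape,
// the "/login" redirect target, and the 403 outcome are not specified in the ADR.
type Role = "user" | "admin";

interface GateSession {
  userId: string;
}

type GateDecision =
  | { kind: "allow" }
  | { kind: "redirect"; to: string }
  | { kind: "deny"; status: 403 };

// `role` is what GET /api/user/me returned for this session,
// or null if the lookup failed or returned no role.
function decideAdminAccess(session: GateSession | null, role: Role | null): GateDecision {
  // No session: send the visitor to sign in first.
  if (session === null) return { kind: "redirect", to: "/login" };
  // Signed in but not an admin (or role unknown): fail closed.
  if (role !== "admin") return { kind: "deny", status: 403 };
  return { kind: "allow" };
}
```

A middleware built on this would map `allow` to `NextResponse.next()`, `redirect` to `NextResponse.redirect(...)`, and `deny` to a 403 response. Treating a missing role as a denial keeps the gate closed when the API call fails.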
### Rejected alternatives
| Option | Rejected because |
|--------|-----------------|
| Retool / AppSmith | Admin logic leaves the repo; weak analytics affordances |
| Streamlit / Gradio | Python-first; splits the frontend stack; thin RBAC |
| React-admin / Refine.dev | Strong CRUD scaffolding, analytics views feel bolted on |
| Superset / Metabase as the admin surface | Excellent BI, poor operational writes; plan: adopt Superset in M4 for BI alongside batch pipelines |
## Consequences
- One more Next.js app in the monorepo; its build and dev tasks are added to Turborepo.
- Tremor + shadcn/ui are added as dependencies. shadcn components are copied into `apps/admin/src/components/ui/` — no runtime version coupling.
- MLflow (`o.alogins.net/mlflow*` → port 5000) is a path-based route in the existing `o.alogins.net` Caddy block, started via `docker compose --profile mlops up`.
- MLflow manages its own auth (built-in basic-auth). M3 will consolidate behind the shared OIDC provider.
- The `NEXT_PUBLIC_MLFLOW_URL` build arg in `Dockerfile.admin` defaults to the production URL; override for dev builds.
- The `admin_actions` audit log grows without bound and needs a retention policy before M4.
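
One possible shape for the audit log and a time-based retention policy is sketched below. Everything here is an assumption: the ADR names the `admin_actions` log but not its columns, and it does not choose a retention rule. The `AdminAction` fields, the helper names, and the 90-day window in the usage note are purely illustrative.

```typescript
// Hypothetical shape of one `admin_actions` entry; the ADR does not define columns.
interface AdminAction {
  actorId: string;  // operator who performed the write
  action: string;   // e.g. "revoke_token", "reset_bandit"
  targetId: string; // user or policy the action applied to
  at: Date;         // when the action happened
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Append-only write: returns a new log with the entry added at the end.
function recordAction(log: AdminAction[], entry: AdminAction): AdminAction[] {
  return [...log, entry];
}

// Time-based retention: keeps entries whose age relative to `now`
// is at most `maxAgeDays`, and drops everything older.
function applyRetention(log: AdminAction[], now: Date, maxAgeDays: number): AdminAction[] {
  const cutoff = now.getTime() - maxAgeDays * MS_PER_DAY;
  return log.filter((entry) => entry.at.getTime() >= cutoff);
}
```

For example, `applyRetention(log, new Date(), 90)` would keep roughly the last three months of operator actions. In a real deployment the same rule would more likely run as a scheduled delete against the database, possibly after archiving old rows.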