oO/docs/adr/0006-admin-console-framework.md
alvis 85367aeaa0 feat: MLOps external services, AI stack planning, admin MLOps hub
Infrastructure:
- Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db
- infra/mlflow/basic_auth.ini for MLflow auth config
- Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git)
- Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow)

Admin panel:
- /admin/models: replace MLflow iframe with external link cards
- /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets)
- AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section

Docs & planning:
- README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases)
- README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown
- README: Phase 4 expanded with LLM MLOps items (#94-#97)
- CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do
- docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline
- ADR-0006: updated to reflect external services (path-based, not embedded)
- Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 08:20:44 +00:00


ADR-0006: Admin console framework — Next.js 15 + Tremor + shadcn/ui + embed specialist tools

Status

Accepted — 2026-04-15

Context

M1 ships a bandit-driven recommender, an event bus, and a live feedback loop. Without a cockpit to observe these systems, every model change ships blind. An admin console is needed to:

  1. Observe — DAU/WAU, tip outcomes, reaction rates, LinUCB arm stats, feature distributions
  2. Inspect — per-user identity, consents, integrations, reward history
  3. Act — revoke tokens, replay signals, reset a per-user bandit, promote a policy
  4. Audit — every operator action is logged

The team is two people. The stack is TypeScript/React/Tailwind. Any framework that forks the stack creates a context-switch tax and a second deployment surface.

Decision

App shell — apps/admin, Next.js 15, App Router

Same stack as apps/web. Reuses packages/shared-types, the Auth.js session cookie, and the API rewrite convention. Deployed at admin.o.alogins.net behind Caddy, port 3080 in dev.

UI libraries

| Layer | Library | Reason |
| --- | --- | --- |
| Charts / KPI | Tremor | Analytics-first React + Tailwind components (KPI cards, time-series, bar lists). Designed for dashboards, not bolted on. |
| CRUD primitives | shadcn/ui | Copy-paste Radix components; forms, dialogs, command palette. No version lock-in: code lives in-repo. |
| Heavy grids | TanStack Table v8 | Sortable / paginated / virtualized tables for events, users, tips. |
| Extra charts | Recharts | Fallback where Tremor falls short (histograms, distributions). |

Specialized MLOps tooling runs as separate external services with their own auth, linked from the admin shell — not embedded or reimplemented:

  • MLflow → https://o.alogins.net/mlflow (experiment tracking, model registry, artifact browser); own basic-auth for now; see M3 for SSO consolidation
  • Airflow → https://o.alogins.net/airflow (batch pipeline orchestration, dataset management); own web-auth for now
  • Grafana panels → /admin/infra (iframed panels) for infra metrics
  • Marimo notebooks → launch-out link from admin

The admin shell links to MLflow and Airflow rather than embedding them; each link opens in a new tab. The /admin/experiments and /admin/models pages are hub pages with direct links to the relevant MLflow/Airflow views.
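The hub-page idea can be sketched as a small pure function that turns the two base URLs into link-card data. This is a minimal illustration, not the code in apps/admin: `buildHubLinks`, `HubLink`, and `MlopsEnv` are hypothetical names, and the deep-link suffixes (`#/experiments`, `#/models`, `/home`, `/datasets`) are assumptions about the MLflow and Airflow UIs that should be checked against the deployed versions.

```typescript
// Hypothetical shape of one link card on a hub page.
interface HubLink {
  label: string;
  href: string;
  external: true; // the page would render this with target="_blank"
}

// The two build-time variables named in this ADR; both optional here.
interface MlopsEnv {
  NEXT_PUBLIC_MLFLOW_URL?: string;
  NEXT_PUBLIC_AIRFLOW_URL?: string;
}

// Strip trailing slashes so joined paths never contain "//".
function trimSlash(base: string): string {
  return base.replace(/\/+$/, "");
}

// Build the link lists for the two hub pages.
// Deep-link suffixes are assumptions, not taken from the ADR.
function buildHubLinks(env: MlopsEnv): Record<"experiments" | "models", HubLink[]> {
  const mlflow = trimSlash(env.NEXT_PUBLIC_MLFLOW_URL ?? "/mlflow");
  const airflow = trimSlash(env.NEXT_PUBLIC_AIRFLOW_URL ?? "/airflow");
  const link = (label: string, href: string): HubLink => ({ label, href, external: true });

  return {
    experiments: [
      link("MLflow experiments", `${mlflow}/#/experiments`),
      link("MLflow models", `${mlflow}/#/models`),
      link("Airflow DAGs", `${airflow}/home`),
      link("Airflow datasets", `${airflow}/datasets`),
    ],
    models: [link("MLflow model registry", `${mlflow}/#/models`)],
  };
}
```

Keeping the URL logic in one function means a dev build only has to override the two environment values; the pages themselves stay unchanged.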

AuthZ

Roles live in a profile.role column on the users table (values: 'user' | 'admin'). The first admin is seeded via the ADMIN_SEED_EMAIL env var at startup. An admin-only gate in Next.js middleware checks the session and the role returned by GET /api/user/me. Every write action through the admin API is appended to an admin_actions audit log.
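The gate and the audit append described above can be sketched as two pure functions, kept separate from Next.js so the decision logic is easy to test. The `Session`, `MeResponse`, and `AdminAction` shapes, the function names, and the 401/403 split are assumptions made for illustration; the ADR specifies only the role values, the /api/user/me lookup, and the existence of the admin_actions log.

```typescript
type Role = "user" | "admin";

// Hypothetical minimal shapes; the real session and API payloads may differ.
interface Session { userId: string }
interface MeResponse { id: string; role: Role }

type GateResult =
  | { allow: true }
  | { allow: false; status: 401 | 403 };

// Decide whether a request may reach an admin route.
// 401: no usable identity; 403: authenticated but not an admin.
function checkAdminGate(session: Session | null, me: MeResponse | null): GateResult {
  if (!session || !me) return { allow: false, status: 401 };
  if (me.id !== session.userId) return { allow: false, status: 401 };
  if (me.role !== "admin") return { allow: false, status: 403 };
  return { allow: true };
}

// Hypothetical audit record; column names are illustrative.
interface AdminAction {
  actorId: string;
  action: string;
  target: string;
  at: string; // ISO-8601 timestamp
}

// Append-only: returns a new array and never rewrites earlier entries.
function appendAdminAction(
  log: readonly AdminAction[],
  entry: Omit<AdminAction, "at">,
  now: Date = new Date(),
): AdminAction[] {
  return [...log, { ...entry, at: now.toISOString() }];
}
```

In middleware, the result of `checkAdminGate` would map to a redirect or an error response; a write handler would call the audit append (or its database equivalent) in the same transaction as the write itself.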

Rejected alternatives

| Option | Rejected because |
| --- | --- |
| Retool / AppSmith | Admin logic leaves the repo; weak analytics affordances |
| Streamlit / Gradio | Python-first; splits the frontend stack; thin RBAC |
| React-admin / Refine.dev | Strong CRUD scaffolding, analytics views feel bolted on |
| Superset / Metabase as the admin surface | Excellent BI, poor operational writes; plan: adopt Superset in M4 for BI alongside batch pipelines |

Consequences

  • One more Next.js app in the monorepo. Build/dev added to Turborepo.
  • Tremor + shadcn/ui are added as dependencies. shadcn components are copied into apps/admin/src/components/ui/ — no runtime version coupling.
  • MLflow (o.alogins.net/mlflow* → port 5000) and Airflow (o.alogins.net/airflow* → port 8080) are path-based routes in the existing o.alogins.net Caddy block, started via docker compose --profile mlops up.
  • Each service manages its own auth (MLflow: built-in basic-auth; Airflow: built-in web UI auth). M3 will consolidate both behind the shared OIDC provider.
  • The NEXT_PUBLIC_MLFLOW_URL and NEXT_PUBLIC_AIRFLOW_URL build args in Dockerfile.admin default to the production paths (/mlflow and /airflow); override them for dev builds.
  • admin_actions audit log grows unboundedly — needs a retention policy before M4.
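The ADR leaves the retention policy open. One possible shape, shown only as a sketch, is a cutoff-based split that a scheduled job could use to archive or delete old rows. The `AuditRow` fields, the function name, and the 90-day figure in the usage note are all hypothetical.

```typescript
// Hypothetical subset of admin_actions columns needed for retention.
interface AuditRow {
  id: number;
  at: string; // ISO-8601 timestamp of the action
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Partition rows into those still inside the retention window and those past it.
// Nothing is deleted here; the caller decides whether to archive or drop `expired`.
function splitByRetention(rows: readonly AuditRow[], now: Date, retentionDays: number) {
  const cutoff = now.getTime() - retentionDays * DAY_MS;
  const keep: AuditRow[] = [];
  const expired: AuditRow[] = [];
  for (const row of rows) {
    if (Date.parse(row.at) < cutoff) {
      expired.push(row);
    } else {
      keep.push(row);
    }
  }
  return { keep, expired, cutoff: new Date(cutoff).toISOString() };
}
```

For example, `splitByRetention(rows, new Date(), 90)` would mark everything older than 90 days as expired. Because this is an audit log, archiving expired rows before deleting them is likely the safer default.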