feat: MLOps external services, AI stack planning, admin MLOps hub
Infrastructure: - Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db - infra/mlflow/basic_auth.ini for MLflow auth config - Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git) - Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow) Admin panel: - /admin/models: replace MLflow iframe with external link cards - /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets) - AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section Docs & planning: - README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases) - README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown - README: Phase 4 expanded with LLM MLOps items (#94-#97) - CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do - docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline - ADR-0006: updated to reflect external services (path-based, not embedded) - Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
33
CLAUDE.md
33
CLAUDE.md
@@ -65,7 +65,7 @@ docs/ architecture notes, ADRs, API specs
|
||||
- One PR = one concern. Conventional-commit prefixes (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`).
|
||||
- ADRs go in `docs/adr/NNNN-title.md` for any decision that constrains future work.
|
||||
- No secrets in repo. Local dev via `.env.local` (gitignored), prod via the server's secret store (Vaultwarden now; k8s secrets later).
|
||||
- Compose profiles (`core`, `full`) so devs can run a subset without 16 GB of RAM.
|
||||
- Compose profiles: `core` (api + web + admin), `full` (adds ml-serving), `mlops` (adds MLflow + Airflow), `ai` (adds Ollama + LiteLLM). Mix as needed.
|
||||
|
||||
## Definition of done (per feature)
|
||||
|
||||
@@ -76,15 +76,38 @@ docs/ architecture notes, ADRs, API specs
|
||||
5. Deployable via `docker compose up` locally.
|
||||
6. If it touches user data → a deletion path exists and is tested.
|
||||
|
||||
## AI stack
|
||||
|
||||
oO generates tips with an LLM and ranks them with a bandit. All LLM calls route through **LiteLLM** at `llm.alogins.net` using model aliases — swapping models is a config change, not a code change.
|
||||
|
||||
| Alias | Model | Used by |
|
||||
|-------|-------|---------|
|
||||
| `tip-generator` | qwen2.5:7b (default) | `ml/serving` tip generation |
|
||||
| `embedder` | nomic-embed-text | task clustering, dedup |
|
||||
| `judge` | claude-haiku-4-5 (cloud, eval only) | offline sim |
|
||||
|
||||
Env vars: `LITELLM_URL` (default `http://localhost:4000`), `OLLAMA_URL` (default `http://localhost:11434`).
|
||||
|
||||
Start with: `docker compose --profile ai up` (adds Ollama + LiteLLM locally). In prod both are shared Agap services.
|
||||
|
||||
**LLM tip generation pipeline:**
|
||||
1. `ml/features/context.py` assembles user signals → structured prompt context
|
||||
2. `POST /generate` in `ml/serving` calls LiteLLM → returns `TipCandidate[]`
|
||||
3. Bandit policy in `ml/serving` scores + ranks candidates
|
||||
4. Best candidate returned as tip; reaction closes the online reward loop
|
||||
|
||||
## Current phase
|
||||
|
||||
**Phase 0 — Prototype.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
|
||||
**M1 shipped. M2 (AI tips) in progress.** See `README.md` for the phase roadmap and `docs/architecture/` for diagrams. Work is tracked as Gitea milestones + issues on `alvis/oO`.
|
||||
|
||||
Active work: AI tip generation pipeline — issues #86–#93 in M2 milestone.
|
||||
|
||||
## What NOT to do
|
||||
|
||||
- Don't copy Todoist's data into our DB. Store the OAuth token + computed features/derivatives we need, fetch raw on demand.
|
||||
- Don't implement auth by hand. Phase 0 uses **Auth.js** behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
|
||||
- Don't hardwire a recommender. The "random todo" v0 must live behind the same interface the real ML model will implement (`POST /recommend` → `{tip}`). Swap internals, keep contract.
|
||||
- Don't implement auth by hand. Auth.js behind an OIDC-shaped boundary (ADR-0004); swap to a dedicated OIDC provider only when mobile ships.
|
||||
- Don't hardwire a recommender. The contract is `POST /recommend → {tip}`. Swap internals (bandit, LLM, hybrid), keep contract.
|
||||
- Don't replace a policy in one step. New policies deploy shadow-first; promoted only after offline + online agreement with the incumbent (ADR-0002).
|
||||
- Don't build an admin UI before the user-facing black page is polished.
|
||||
- Don't over-split processes. Extract a service when pressure demands it, not in anticipation (ADR-0003).
|
||||
- Don't call LLMs directly from application code. All LLM calls go through `ml/serving` (Python) via `LITELLM_URL`. The TS recommender never holds a model name.
|
||||
- Don't embed MLflow/Airflow/OpenWebUI in the admin panel. They are external services; link out to them. The admin shell links to `o.alogins.net/mlflow`, `/airflow`, `ai.alogins.net`.
|
||||
|
||||
Reference in New Issue
Block a user