Commit Graph

12 Commits

Author SHA1 Message Date
05f748159b chore: remove shadow policy machinery (ADR-0013 step 10)
Deletes shadowPolicies map, getShadowPolicies, setPolicyActive from
recommender.ts; removes /api/admin/policies routes from admin.ts; removes
getPolicies, togglePolicy, PolicyInfo from admin api.ts; removes the
policy toggle section from the ops page.

168 API tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 10:45:32 +00:00
f8d66aa01f chore: remove Airflow completely from the stack
Drop all four Airflow containers (db, init, webserver, scheduler) from the
mlops compose profile, leaving MLflow as the sole mlops service. Remove
AIRFLOW_* env vars, config fields, health-check entries, DAG trigger code
in admin/bench routes, the airflow_dag_run_id schema column, Airflow nav
links and DAG-run links in the admin UI, the two Airflow DAG files
(bench_dag.py, sim_dag.py), and all related docs/ADR references.
Simulations now run exclusively via the subprocess path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 16:38:46 +00:00
ce1c8bde57 fix(admin): simulations view-only + docs path in Docker (#109 #110)
- simulate/page.tsx: remove launch form — simulations are triggered via
  Airflow DAG, not the admin UI. Page now shows run history + links to
  Airflow and MLflow only (#109)
- docs.ts: use DOCS_ROOT env var (fallback: ../../docs for local dev) so
  the path works in Docker standalone where CWD is /app (#110)
- Dockerfile.admin: copy docs/ into the runner image at /app/docs and set
  DOCS_ROOT=/app/docs so listAllDocs() finds the files at runtime (#110)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 13:55:50 +00:00
c1f5fcb561 fix(admin): ops page — add section description, remove redundant footer (#107)
Adds a one-line purpose description under the Ops heading so it is clear
what the section is for (shadow policy toggles, signal replay, per-user
actions). Removes the duplicate "User-level actions" subsection whose
content is now covered by the header description.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 13:53:35 +00:00
bad1bb2cba feat(simulate): MLflow tracking, Airflow DAG integration, health checks for mlflow/airflow
- sim_runs schema: add judge_mode, n_policies, airflow_dag_run_id, mlflow_run_id columns
- admin health endpoint: add mlflow + airflow checks (Basic auth for Airflow API)
- admin nav: add Simulations page link; rename section label
- runner.py: optional MLflow experiment tracking; multi-policy support
- sim_dag.py: Airflow DAG for offline sim pipeline
- admin simulate page + API client methods for sim runs
- shared-types tsconfig: exclude test files from build

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 12:08:36 +00:00
e96ceb7ee1 feat(auth): token-based admin authentication for Playwright/CI (#105)
Add POST /api/auth/token — validates ADMIN_TOKEN env var, creates a 24h
session and sets the sid cookie so automated tools can access the admin
panel without Google OAuth. Admin login page gains a token input form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 12:07:43 +00:00
4a42a6aabf feat(admin): profile freshness panel in data-quality (#81 phase B.4)
Adds a per-feature freshness summary to /admin/data-quality so the admin
can spot features that are systematically stale or never computed:

  totalEligible — distinct users with tip_views in the last 30 days
  missing       — eligible users with no row stored for the feature
  stale         — eligible users whose stored row is past its TTL

Backend exposes summarizeProfileFreshness() in profile/builder.ts; one
query per feature joins eligible users LEFT JOIN profile rows.
Coverage = (eligible − missing − stale) / eligible, colored
green/yellow/red via the new PctGood helper (high-is-good, opposite of
the existing Pct used for missing-feature/stale-token rates).

Refs #81.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 00:34:46 +00:00
aa4bdd8f09 feat(admin): LLM tip quality dashboard — per-model/prompt/kind breakdowns
/admin/reward-analytics now surfaces served count, reaction rate, and avg
reward grouped by llm_model, prompt_version, and tip_kind — closing the
loop so model/prompt iterations in M2 are legible next to the bandit
policy view. Data comes from the tip_scores columns added in ffdf707 and
tip_feedback.reward_milli; bandit-only tips show as "(bandit-only)".

Closes #92.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 15:24:52 +00:00
bb879c5f0f refactor(admin): drop simulations/experiments/models pages; group nav into sections
Removes the in-shell MLOps pages (experiments, models, simulations) and their
client API helpers in favour of external MLflow/Airflow links. Nav is regrouped
into Signals / Recommender status / Ops sections for clarity.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 14:41:17 +00:00
85367aeaa0 feat: MLOps external services, AI stack planning, admin MLOps hub
Infrastructure:
- Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db
- infra/mlflow/basic_auth.ini for MLflow auth config
- Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git)
- Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow)

Admin panel:
- /admin/models: replace MLflow iframe with external link cards
- /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets)
- AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section

Docs & planning:
- README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases)
- README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown
- README: Phase 4 expanded with LLM MLOps items (#94-#97)
- CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do
- docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline
- ADR-0006: updated to reflect external services (path-based, not embedded)
- Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 08:20:44 +00:00
faf44c18fc feat: ε-greedy v1 as active policy; dwell-time reward inference; offline sim framework
- Promote egreedy-v1 to active serving policy (ADR-0007): /score/egreedy + /reward/egreedy
  replaces linucb-v1 endpoints after offline sim shows +10.7% mean reward (−0.548 vs −0.606)
- Replace explicit helpful/not_helpful feedback with dwell-time inferred reward (inferReward):
  dismiss=−1.0, snooze=+0.1, done<15s=−0.3, done 15s–2min=+1.0, done 2–10min=+0.6, done>10min=+0.3
- Add ml/serving ε-greedy endpoints: /score/egreedy, /reward/egreedy, /stats/egreedy/{user_id}
  with d=7 feature vector (base 5 + sin/cos day-of-week encoding)
- Add offline simulation framework (ml/experiments/sim): rule/LLM/claude-code judges,
  two-phase score+reward, synthetic personas, task generator; results stored in sim_runs/sim_events
- Add /admin/simulations page: start runs, live-poll status, reward curve SVG, action/persona tables
- Fix egreedy day_of_week training skew: reward endpoint now uses actual dow instead of hardcoded 0
- Fix runner.py proxy bypass: httpx.Client(trust_env=False) for localhost ML calls
- Add dwellMs to TipFeedbackEvent contract and bus.test.ts fixture
- Schema: sim_runs, sim_events tables; tip_feedback gains dwell_ms, reward_milli columns
- ADR-0006: admin console framework; ADR-0007: egreedy-v1 policy selection rationale

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:44:37 +00:00
e62c726ea4 feat: M1 admin console — all 10 remaining pages + signal/quality/ops infrastructure
Admin console (issues #63–72):
- Event stream viewer: live-tail ring buffer (500 events) with subject/user filters
- Feature store browser: per-user feature vector history from ml/serving
- Model registry panel: MLflow embed at /admin/models
- Experiment dashboard: LinUCB per-user stats (pulls, reward, θ) + bandit reset
- Recommendation log: per-tip explainability (policy, score, features, latency)
- Reward analytics: daily reaction breakdown + per-policy compare
- Data quality widget: missing-feature rate, stale-token rate, daily completeness
- Ops actions: replay-signal, policy enable/disable; user actions link to Users page
- SQL runner: read-only SELECT runner with saved queries
- Health rollup: fan-out to api/ml/sqlite/event-bus with auto-refresh

Backend:
- tip_scores table: logs features+policy+score+latency at every scoring call (#67)
- saved_queries table: per-admin saved SQL (#71)
- Event bus: 500-event ring buffer + tail() API (#63)
- Admin routes: /events, /tips, /reward-analytics, /data-quality, /health,
  /policies, /replay-signal, /sql, /saved-queries endpoints
- /api/ml/* admin-gated proxy to ml/serving (#64, #66)
- Shadow-policy registry in recommender (#56)

ML serving:
- /reset/{user_id}: clear bandit state + feature history (#66)
- /stats/{user_id}: pulls, cumulative reward, estimated mean, θ (#66)
- /features/{user_id}: last 100 feature vectors logged at scoring time (#64)
- Meta (pulls, rewards) persisted alongside A/b matrices

Web:
- Tip action sheet adds Helpful / Not helpful buttons (#62)
- TipFeedback type extended with helpful/not_helpful actions
- Rewards mapped: helpful=+0.5, not_helpful=−0.5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 03:56:48 +00:00