9e96540bcc
feat(admin): per-user profile view + rebuild action ( #81 phase B.1)
...
Surfaces phase A's profile features in /admin/users/:id so we can verify
they're actually computing useful values before investing in bandit
consumption. The detail GET now includes profile rows joined with registry
metadata (name, value, age, fresh badge, ttlSec, description). Read does
NOT trigger compute — staleness must be visible. A new POST
.../profile/rebuild button force-recomputes and is audit-logged like
reset-bandit.
Refs #81 .
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com >
2026-04-25 00:27:08 +00:00
aa4bdd8f09
feat(admin): LLM tip quality dashboard — per-model/prompt/kind breakdowns
...
/admin/reward-analytics now surfaces served count, reaction rate, and avg
reward grouped by llm_model, prompt_version, and tip_kind — closing the
loop so model/prompt iterations in M2 are legible next to the bandit
policy view. Data comes from the tip_scores columns added in ffdf707 and
tip_feedback.reward_milli; bandit-only tips show as "(bandit-only)".
Closes #92 .
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com >
2026-04-24 15:24:52 +00:00
bb879c5f0f
refactor(admin): drop simulations/experiments/models pages; group nav into sections
...
Removes the in-shell MLOps pages (experiments, models, simulations) and their
client API helpers in favour of external MLflow/Airflow links. Nav is regrouped
into Signals / Recommender status / Ops sections for clarity.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com >
2026-04-18 14:41:17 +00:00
85367aeaa0
feat: MLOps external services, AI stack planning, admin MLOps hub
...
Infrastructure:
- Add `mlops` compose profile: MLflow (basic-auth, /mlflow path) + Airflow (LocalExecutor, /airflow path) + airflow-db
- infra/mlflow/basic_auth.ini for MLflow auth config
- Caddy routes /mlflow* and /airflow* inside existing o.alogins.net block (see agap_git)
- Dockerfile.admin: NEXT_PUBLIC_MLFLOW_URL / NEXT_PUBLIC_AIRFLOW_URL build args (default /mlflow, /airflow)
Admin panel:
- /admin/models: replace MLflow iframe with external link cards
- /admin/experiments: replace LinUCB stats with MLOps hub (links to MLflow experiments/models + Airflow DAGs/datasets)
- AdminShell: external nav links for MLflow ↗ and Airflow ↗ under MLOps section
Docs & planning:
- README: new AI stack section (Ollama/LiteLLM/OpenWebUI three-tier, tip generation pipeline, model aliases)
- README: Phase 2 expanded with AI infra issues (#86-#93) and granular pipeline breakdown
- README: Phase 4 expanded with LLM MLOps items (#94-#97)
- CLAUDE.md: AI stack section, updated current phase (M1 shipped / M2 in progress), compose profiles, updated What NOT to do
- docs/architecture/overview.md: AI stack section, updated decision flow diagram for Phase 2 LLM pipeline
- ADR-0006: updated to reflect external services (path-based, not embedded)
- Gitea issues #86-#97 created (M2: AI infra + pipeline; M4: LLM MLOps)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-17 08:20:44 +00:00
faf44c18fc
feat: ε-greedy v1 as active policy; dwell-time reward inference; offline sim framework
...
- Promote egreedy-v1 to active serving policy (ADR-0007): /score/egreedy + /reward/egreedy
replaces linucb-v1 endpoints after offline sim shows +10.7% mean reward (−0.548 vs −0.606)
- Replace explicit helpful/not_helpful feedback with dwell-time inferred reward (inferReward):
dismiss=−1.0, snooze=+0.1, done<15s=−0.3, done 15s–2min=+1.0, done 2–10min=+0.6, done>10min=+0.3
- Add ml/serving ε-greedy endpoints: /score/egreedy, /reward/egreedy, /stats/egreedy/{user_id}
with d=7 feature vector (base 5 + sin/cos day-of-week encoding)
- Add offline simulation framework (ml/experiments/sim): rule/LLM/claude-code judges,
two-phase score+reward, synthetic personas, task generator; results stored in sim_runs/sim_events
- Add /admin/simulations page: start runs, live-poll status, reward curve SVG, action/persona tables
- Fix egreedy day_of_week training skew: reward endpoint now uses actual dow instead of hardcoded 0
- Fix runner.py proxy bypass: httpx.Client(trust_env=False) for localhost ML calls
- Add dwellMs to TipFeedbackEvent contract and bus.test.ts fixture
- Schema: sim_runs, sim_events tables; tip_feedback gains dwell_ms, reward_milli columns
- ADR-0006: admin console framework; ADR-0007: egreedy-v1 policy selection rationale
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-16 07:44:37 +00:00
e62c726ea4
feat: M1 admin console — all 10 remaining pages + signal/quality/ops infrastructure
...
Admin console (issues #63–72):
- Event stream viewer: live-tail ring buffer (500 events) with subject/user filters
- Feature store browser: per-user feature vector history from ml/serving
- Model registry panel: MLflow embed at /admin/models
- Experiment dashboard: LinUCB per-user stats (pulls, reward, θ) + bandit reset
- Recommendation log: per-tip explainability (policy, score, features, latency)
- Reward analytics: daily reaction breakdown + per-policy compare
- Data quality widget: missing-feature rate, stale-token rate, daily completeness
- Ops actions: replay-signal, policy enable/disable; user actions link to Users page
- SQL runner: read-only SELECT runner with saved queries
- Health rollup: fan-out to api/ml/sqlite/event-bus with auto-refresh
Backend:
- tip_scores table: logs features+policy+score+latency at every scoring call (#67 )
- saved_queries table: per-admin saved SQL (#71 )
- Event bus: 500-event ring buffer + tail() API (#63 )
- Admin routes: /events, /tips, /reward-analytics, /data-quality, /health,
/policies, /replay-signal, /sql, /saved-queries endpoints
- /api/ml/* admin-gated proxy to ml/serving (#64 , #66 )
- Shadow-policy registry in recommender (#56 )
ML serving:
- /reset/{user_id}: clear bandit state + feature history (#66 )
- /stats/{user_id}: pulls, cumulative reward, estimated mean, θ (#66 )
- /features/{user_id}: last 100 feature vectors logged at scoring time (#64 )
- Meta (pulls, rewards) persisted alongside A/b matrices
Web:
- Tip action sheet adds Helpful / Not helpful buttons (#62 )
- TipFeedback type extended with helpful/not_helpful actions
- Rewards mapped: helpful=+0.5, not_helpful=−0.5
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-04-16 03:56:48 +00:00