aa4bdd8f09
feat(admin): LLM tip quality dashboard — per-model/prompt/kind breakdowns
/admin/reward-analytics now surfaces served count, reaction rate, and avg
reward grouped by llm_model, prompt_version, and tip_kind — closing the
loop so model/prompt iterations in M2 are legible next to the bandit
policy view. Data comes from the tip_scores columns added in ffdf707 and
tip_feedback.reward_milli; bandit-only tips show as "(bandit-only)".
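
A minimal sketch of the per-model grouping described above, not the dashboard's actual query. The row shape and the function name `breakdown_by_model` are hypothetical. It also assumes that `reward_milli` stores the reward multiplied by 1000, that a missing value means the tip got no reaction, and that "reaction rate" means the share of served tips with any feedback. None of this is confirmed by the commit.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, TypedDict


class TipRow(TypedDict):
    # Hypothetical join of tip_scores and tip_feedback, one row per served tip.
    llm_model: Optional[str]      # None when the tip came from the bandit alone
    reward_milli: Optional[int]   # None when the user never reacted (assumption)


def breakdown_by_model(rows: Iterable[TipRow]) -> Dict[str, dict]:
    """Group served tips by llm_model; report served count, reaction rate, avg reward."""
    groups = defaultdict(list)
    for row in rows:
        # Tips without an LLM model are bucketed under the label the dashboard uses.
        groups[row["llm_model"] or "(bandit-only)"].append(row["reward_milli"])

    result = {}
    for key, rewards in groups.items():
        reacted = [r for r in rewards if r is not None]
        result[key] = {
            "served": len(rewards),
            "reaction_rate": len(reacted) / len(rewards),
            # Convert back from milli-units; None when nobody reacted.
            "avg_reward": sum(reacted) / len(reacted) / 1000 if reacted else None,
        }
    return result
```

The same function could be keyed on prompt_version or tip_kind by swapping the grouping column.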
Closes #92.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 15:24:52 +00:00
e62c726ea4
feat: M1 admin console — all 10 remaining pages + signal/quality/ops infrastructure
Admin console (issues #63–72):
- Event stream viewer: live-tail ring buffer (500 events) with subject/user filters
- Feature store browser: per-user feature vector history from ml/serving
- Model registry panel: MLflow embed at /admin/models
- Experiment dashboard: LinUCB per-user stats (pulls, reward, θ) + bandit reset
- Recommendation log: per-tip explainability (policy, score, features, latency)
- Reward analytics: daily reaction breakdown + per-policy compare
- Data quality widget: missing-feature rate, stale-token rate, daily completeness
- Ops actions: replay-signal, policy enable/disable; user actions link to Users page
- SQL runner: read-only SELECT runner with saved queries
- Health rollup: fan-out to api/ml/sqlite/event-bus with auto-refresh
Backend:
- tip_scores table: logs features+policy+score+latency at every scoring call (#67)
- saved_queries table: per-admin saved SQL (#71)
- Event bus: 500-event ring buffer + tail() API (#63)
- Admin routes: /events, /tips, /reward-analytics, /data-quality, /health,
/policies, /replay-signal, /sql, /saved-queries endpoints
- /api/ml/* admin-gated proxy to ml/serving (#64, #66)
- Shadow-policy registry in recommender (#56)
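
A rough sketch of how a 500-event ring buffer with a tail() call and subject/user filters might look. Only the capacity and the tail() name come from this commit. The class name `EventRing`, the `publish` method, and the event fields are hypothetical.

```python
from collections import deque
from typing import Any, Dict, List, Optional


class EventRing:
    """Fixed-size in-memory event log; the oldest events are dropped first."""

    def __init__(self, capacity: int = 500) -> None:
        # deque with maxlen evicts from the left once capacity is reached.
        self._events: deque = deque(maxlen=capacity)

    def __len__(self) -> int:
        return len(self._events)

    def publish(self, subject: str, user_id: str, payload: Dict[str, Any]) -> None:
        self._events.append({"subject": subject, "user_id": user_id, "payload": payload})

    def tail(
        self,
        n: int = 50,
        subject: Optional[str] = None,
        user_id: Optional[str] = None,
    ) -> List[Dict[str, Any]]:
        """Return up to n most recent events, oldest first, after filtering."""
        if n <= 0:
            return []
        matches = [
            e for e in self._events
            if (subject is None or e["subject"] == subject)
            and (user_id is None or e["user_id"] == user_id)
        ]
        return matches[-n:]
```

Filtering before slicing means tail(n, subject=...) returns the last n matching events rather than the matches among the last n events.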
ML serving:
- /reset/{user_id}: clear bandit state + feature history (#66)
- /stats/{user_id}: pulls, cumulative reward, estimated mean, θ (#66)
- /features/{user_id}: last 100 feature vectors logged at scoring time (#64)
- Meta (pulls, rewards) persisted alongside A/b matrices
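
For orientation, a sketch of how the /stats fields could be derived from per-user LinUCB state. It uses the standard LinUCB bookkeeping (A starts as the identity, A += x xᵀ, b += r·x, θ = A⁻¹b). It assumes a single model per user and that "estimated mean" is cumulative reward divided by pulls. The names `BanditState` and `solve` are hypothetical and this is not the ml/serving code.

```python
from dataclasses import dataclass, field
from typing import List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix, copies input
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


@dataclass
class BanditState:
    dim: int
    pulls: int = 0                    # meta, persisted alongside A and b
    cumulative_reward: float = 0.0    # meta, persisted alongside A and b
    A: List[List[float]] = field(default_factory=list)
    b: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.A:  # identity prior keeps A invertible
            self.A = [[float(i == j) for j in range(self.dim)] for i in range(self.dim)]
        if not self.b:
            self.b = [0.0] * self.dim

    def update(self, x: List[float], reward: float) -> None:
        for i in range(self.dim):
            for j in range(self.dim):
                self.A[i][j] += x[i] * x[j]
            self.b[i] += reward * x[i]
        self.pulls += 1
        self.cumulative_reward += reward

    def stats(self) -> dict:
        mean = self.cumulative_reward / self.pulls if self.pulls else 0.0
        return {
            "pulls": self.pulls,
            "cumulative_reward": self.cumulative_reward,
            "estimated_mean": mean,
            "theta": solve(self.A, self.b),
        }
```

Under these assumptions, a reset for a user amounts to replacing the state with a fresh BanditState(dim).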
Web:
- Tip action sheet adds Helpful / Not helpful buttons (#62)
- TipFeedback type extended with helpful/not_helpful actions
- Rewards mapped: helpful=+0.5, not_helpful=−0.5
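
The mapping above, sketched as a lookup. The two reward values are the ones listed in this commit. The integer milli-unit conversion and the names `FEEDBACK_REWARDS` and `to_reward_milli` are assumptions for illustration. Rewards for other feedback actions are not stated here, so they are left out.

```python
from typing import Optional

# Reward values for the two new actions, as listed in this commit.
FEEDBACK_REWARDS = {"helpful": 0.5, "not_helpful": -0.5}


def to_reward_milli(action: str) -> Optional[int]:
    """Return the reward in thousandths, or None for actions not covered here."""
    reward = FEEDBACK_REWARDS.get(action)
    return None if reward is None else round(reward * 1000)
```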
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 03:56:48 +00:00