feat(bandit): promote egreedy-v2 (D=12, profile features) as active policy (#99)

Offline sim gate passed: egreedy-v2 mean reward −0.629 vs egreedy-v1 −0.642
(5 users × 20 rounds, rule judge, seed 42). v2 wins 3/5 personas.

- recommender.ts: switch remotePolicy() to /score/egreedy/v2
- recommender.ts: switch sendRewardWithRetry() to /reward/egreedy/v2 with
  profile_features payload so the ridge update uses the full D=12 vector
- recommender.ts: re-fetch profile at feedback time (TTL-cached, near-instant)
- ADR-0012: status Accepted → Promoted, promotion record appended

Shadow entry egreedy-v2-shadow kept in registry (active: false) for rollback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -1,7 +1,7 @@
# ADR-0012 — ε-greedy v2: profile features in the bandit (D=7→12)

**Status:** Accepted
**Date:** 2026-04-25
**Status:** Promoted
**Date:** 2026-04-25 (accepted) / 2026-04-26 (promoted)
**Issue:** #99

## Context
@@ -106,3 +106,19 @@ projecting theta without the corresponding `A` matrix cannot be done correctly.
the D=12 target in the issue spec and complicates the sim comparison. Deferred.

**In-place v1 promotion without shadow** — violates ADR-0002.

## Promotion record (2026-04-26)

Offline sim (`runner.py --policies egreedy-v1 egreedy-v2 --judge rule --n-users 5 --n-rounds 20 --seed 42`):

| policy     | total reward | mean reward | pulls |
|------------|--------------|-------------|-------|
| egreedy-v1 | −64.20       | −0.6420     | 100   |
| egreedy-v2 | −62.90       | −0.6290     | 100   |

**Gate passed** (v2 mean ≥ v1 mean). Per-persona: v2 wins deadline-driven, evening-relaxed, low-priority-first; v1 wins consistent-responder, overdue-ignorer.
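
The gate arithmetic can be checked directly from the table: mean reward is total reward divided by pulls, and the gate requires the v2 mean to be at least the v1 mean. The sketch below is illustrative only; `mean_reward` and `gate_passed` are hypothetical names, not functions taken from `runner.py`.

```python
def mean_reward(total_reward: float, pulls: int) -> float:
    """Mean reward per pull (total reward divided by number of pulls)."""
    if pulls <= 0:
        raise ValueError("pulls must be positive")
    return total_reward / pulls


def gate_passed(candidate_mean: float, baseline_mean: float) -> bool:
    """Gate rule as stated in the record: candidate mean >= baseline mean."""
    return candidate_mean >= baseline_mean


# Totals and pull counts copied from the promotion table.
# 5 users x 20 rounds gives the 100 pulls per policy.
pulls = 5 * 20
v1_mean = mean_reward(-64.20, pulls)
v2_mean = mean_reward(-62.90, pulls)
result = gate_passed(v2_mean, v1_mean)
```

Because rewards are negative, "higher" means closer to zero, so v2 passes with a margin of about 0.013 per pull.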

Changes applied:
- `recommender.ts` `remotePolicy()`: `/score/egreedy` → `/score/egreedy/v2`
- `recommender.ts` `sendRewardWithRetry()`: `/reward/egreedy` → `/reward/egreedy/v2`, added `profile_features` to payload
- Shadow entry `egreedy-v2-shadow` left in registry (`active: false`) for rollback.
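
To illustrate why the reward payload now needs `profile_features`, here is a rough sketch of a rank-one ridge update over a D=12 vector. It rests on several assumptions not stated in this ADR: that the 12 dimensions are 7 base features followed by 5 profile features, that the ridge state is an `A` matrix plus a `b` vector, and that `A` starts as the identity. `full_vector` and `ridge_update` are hypothetical names, not the service's actual code.

```python
D_BASE = 7   # v1 feature dimension (from the ADR title)
D_FULL = 12  # v2 feature dimension (from the ADR title)


def full_vector(base, profile):
    """Concatenate base and profile features, enforcing the D=12 shape.

    The 7 + 5 split is an assumption inferred from D=7 -> 12.
    """
    x = list(base) + list(profile)
    if len(base) != D_BASE or len(x) != D_FULL:
        raise ValueError(f"expected {D_BASE} base and {D_FULL} total features")
    return x


def ridge_update(A, b, x, reward):
    """In-place rank-one update: A += x x^T and b += reward * x."""
    for i in range(len(x)):
        b[i] += reward * x[i]
        for j in range(len(x)):
            A[i][j] += x[i] * x[j]


# Assumed starting state: A = identity (ridge regulariser of 1), b = 0.
A = [[1.0 if i == j else 0.0 for j in range(D_FULL)] for i in range(D_FULL)]
b = [0.0] * D_FULL

# Made-up feature values, for illustration only.
x = full_vector([1.0, 0.0, 0.5, 0.0, 1.0, 0.0, 0.25], [0.0, 1.0, 0.0, 0.5, 0.0])
ridge_update(A, b, x, reward=-1.0)
```

Under these assumptions, a reward sent without profile features would yield only a 7-element vector, which cannot update 12-dimensional state. That is consistent with the client re-fetching the profile at feedback time.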