Shadow-deploy infrastructure for policies #56

Closed
opened 2026-04-13 14:35:37 +00:00 by alvis · 0 comments

Before LinUCB replaces Random, the recommender must support running N shadow policies per request, logging their picks and estimated rewards without affecting the user. Promotion from shadow → A/B → launch is gated on offline and online reward parity. ADR-0002 and `metrics.md` describe the gate.

**Done when:** Random runs live while a shadow LinUCB logs its alternative picks for a week before any A/B starts.
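
A rough sketch of the per-request shadow flow described above, not the repository's actual code. All names here (`Policy`, `Pick`, `recommend`, `FirstArmPolicy`, the log record fields) are hypothetical, and `FirstArmPolicy` is only a stand-in for a real shadow such as LinUCB. The one property it tries to show is that the live policy alone decides what the user sees, while each shadow's pick and estimated reward are logged and any shadow failure is swallowed.

```python
import logging
import random
from dataclasses import dataclass
from typing import Optional, Protocol, Sequence

logger = logging.getLogger("shadow")


@dataclass
class Pick:
    arm: str
    estimated_reward: Optional[float]  # None when the policy has no estimate


class Policy(Protocol):
    name: str

    def choose(self, context: dict, arms: Sequence[str]) -> Pick: ...


class RandomPolicy:
    """Live baseline: uniform random choice, no reward estimate."""

    name = "random"

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)

    def choose(self, context: dict, arms: Sequence[str]) -> Pick:
        return Pick(self._rng.choice(list(arms)), None)


class FirstArmPolicy:
    """Placeholder shadow (assumption): stands in for LinUCB in this sketch."""

    name = "first-arm-stub"

    def choose(self, context: dict, arms: Sequence[str]) -> Pick:
        return Pick(arms[0], 0.5)


def recommend(
    live: Policy,
    shadows: Sequence[Policy],
    context: dict,
    arms: Sequence[str],
    log: list,
) -> str:
    """Return the live policy's arm; record what each shadow would have picked."""
    live_pick = live.choose(context, arms)
    for shadow in shadows:
        try:
            # Shadows get a copy of the context so they cannot mutate live state.
            pick = shadow.choose(dict(context), arms)
            log.append(
                {
                    "policy": shadow.name,
                    "arm": pick.arm,
                    "estimated_reward": pick.estimated_reward,
                    "live_arm": live_pick.arm,
                }
            )
        except Exception:
            # A failing shadow must never affect the user-facing response.
            logger.exception("shadow policy %s failed", shadow.name)
    return live_pick.arm
```

Calling `recommend(RandomPolicy(), [FirstArmPolicy()], ctx, arms, log)` returns only the Random pick; the shadow's alternative lands in `log` for later offline comparison. In a real service the shadows would likely run off the request path (for example in a background task) so they add no latency; this sketch runs them inline for brevity.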
alvis added this to the M1 — Real signal milestone 2026-04-13 14:35:37 +00:00
alvis added the ml label 2026-04-13 14:35:37 +00:00
alvis closed this issue 2026-04-16 03:56:24 +00:00

Reference: alvis/oO#56