research: next-gen ranking policies — Thompson sampling, neural bandits, hybrid #83
Motivation
ε-greedy v1 is a strong baseline but has known limitations:
Research directions
1. Thompson Sampling
2. Neural Contextual Bandits
3. Hybrid: global model → per-user adaptation
4. Explore-then-commit
5. Reward model research
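To make direction 1 concrete, here is a minimal sketch of Beta-Bernoulli Thompson sampling, assuming binary rewards (for example click / no click) and a small fixed set of arms. The class `BetaBernoulliTS`, the helper `simulate`, and the reward rates used below are hypothetical illustrations, not part of the existing ranking code. Unlike ε-greedy, which explores uniformly at a fixed rate, this policy explores in proportion to posterior uncertainty.

```python
import random


class BetaBernoulliTS:
    """Thompson sampling over K arms with Bernoulli rewards and Beta(1, 1) priors.

    Hypothetical illustration: names and structure are assumptions,
    not the project's actual policy interface.
    """

    def __init__(self, n_arms, seed=0):
        self.alpha = [1.0] * n_arms  # 1 + observed successes per arm
        self.beta = [1.0] * n_arms   # 1 + observed failures per arm
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample from each arm's posterior and play the largest.
        samples = [
            self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # reward must be 0 or 1 (Bernoulli outcome).
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


def simulate(true_rates, steps=5000, seed=0):
    """Run the policy against synthetic Bernoulli arms; return pull counts."""
    policy = BetaBernoulliTS(len(true_rates), seed=seed)
    env = random.Random(seed + 1)  # separate RNG for the simulated environment
    pulls = [0] * len(true_rates)
    for _ in range(steps):
        arm = policy.select()
        reward = 1 if env.random() < true_rates[arm] else 0
        policy.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Made-up reward rates, for illustration only.
    print(simulate([0.05, 0.10, 0.30]))
```

In a run like this, pulls should concentrate on the highest-rate arm as the posteriors sharpen. A contextual or per-user variant would need a different posterior model and is not covered by this sketch.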
Methodology
ml/experiments/sim/)
Tasks