Daily Shaarli
January 15, 2025
Treat an A/B test as a multi-armed bandit problem. First, record the performance of each option: how often it was shown and how much reward it earned. Then, every time a decision is needed, estimate the reward for each option from those records and pick the best one. Leave a small chance (e.g. 10%) for exploration, in which a random option is selected instead.
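
This strategy is commonly called epsilon-greedy. A minimal sketch in Python follows, assuming the reward is a number such as 1 for a conversion and 0 otherwise. The class name `EpsilonGreedy`, its method names, and the simulated conversion rates are all illustrative assumptions, not taken from the linked article.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy selector for A/B test options."""

    def __init__(self, options, epsilon=0.1, rng=None):
        self.epsilon = epsilon                  # chance of exploring, e.g. 10%
        self.rng = rng or random.Random()
        self.trials = {o: 0 for o in options}   # times each option was shown
        self.rewards = {o: 0.0 for o in options}  # total reward per option

    def mean_reward(self, option):
        """Average reward observed so far; 0.0 for an untried option."""
        n = self.trials[option]
        return self.rewards[option] / n if n else 0.0

    def choose(self):
        """Explore with probability epsilon, otherwise pick the best option."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.trials))
        return max(self.trials, key=self.mean_reward)

    def record(self, option, reward):
        """Store the outcome of showing `option` once."""
        self.trials[option] += 1
        self.rewards[option] += reward


if __name__ == "__main__":
    # Simulated conversion rates; made up for illustration only.
    true_rates = {"A": 0.05, "B": 0.10}
    bandit = EpsilonGreedy(true_rates, epsilon=0.1)
    for _ in range(10_000):
        option = bandit.choose()
        converted = random.random() < true_rates[option]
        bandit.record(option, 1.0 if converted else 0.0)
    print(bandit.trials)
```

Untried options start with an estimated reward of 0.0 in this sketch, so early choices rely on the exploration step to gather data for every option. Other starting values are a reasonable design alternative.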