The Art of Gambling in a Team: Multi-Player Multi-Armed Bandits
Multi-armed bandits are an elegant model of learning in an unknown and uncertain environment. Such models are relevant in many scenarios and have recently received increased attention due to various problems of distributed control arising in wireless networks, pricing models on the internet, etc. We consider a non-Bayesian multi-armed bandit setting proposed by Lai and Robbins in the mid-1980s. There are multiple arms, each of which generates an i.i.d. reward from an unknown distribution. There are multiple…
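To make the setting concrete, here is a minimal sketch, not taken from the abstract, of the single-player stochastic bandit model it describes: each arm yields i.i.d. rewards from a distribution unknown to the player, and the player balances exploration and exploitation with an index policy. The Bernoulli arms, the specific reward parameters, and the use of the UCB1 index (due to Auer et al., in the spirit of the Lai–Robbins index policies) are illustrative assumptions, not details from the source.

```python
import math
import random

class BernoulliArm:
    """An arm with i.i.d. Bernoulli rewards; its mean p is unknown to the player."""
    def __init__(self, p):
        self.p = p  # hidden parameter, used only by the environment

    def pull(self):
        return 1.0 if random.random() < self.p else 0.0

def ucb1(arms, horizon):
    """Play `horizon` rounds using the UCB1 index: empirical mean + exploration bonus."""
    k = len(arms)
    counts = [0] * k    # number of times each arm has been pulled
    means = [0.0] * k   # empirical mean reward of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            i = t - 1   # pull each arm once to initialize its estimate
        else:
            # choose the arm maximizing mean + sqrt(2 ln t / n_i)
            i = max(range(k),
                    key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = arms[i].pull()
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental running-mean update
        total += r
    return total, counts

arms = [BernoulliArm(p) for p in (0.3, 0.5, 0.7)]
reward, counts = ucb1(arms, horizon=10_000)
print(f"total reward: {reward:.0f}, pull counts: {counts}")
```

Under this policy the pull counts concentrate on the best arm (here the one with mean 0.7), which is the exploration–exploitation behavior the non-Bayesian bandit framework is designed to analyze.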