Recently, I have been working on projects with bandit algorithms. Basically, the performance of a bandit algorithm depends heavily on the data, and these algorithms are well suited to continuous testing where the data keeps changing. So you will need to test and tune your model on your own data.
To learn more about bandits, you can read the book Bandit Algorithms for Website Optimization: http://shop.oreilly.com/product/0636920027393.do . The author explains the basic bandit algorithms well and implements them in Python. You can find the code on GitHub: https://github.com/johnmyleswhite/BanditsBook . However, the book does not cover contextual bandits.
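To give a feel for what a basic bandit algorithm looks like in Python, here is a minimal epsilon-greedy sketch. It is my own illustration, not code taken from the book or its repository; the class name `EpsilonGreedy`, the helper `simulate`, and the reward rates are all made up for the example, and rewards are assumed to be 0 or 1.

```python
import random


class EpsilonGreedy:
    """Minimal epsilon-greedy bandit (illustrative sketch, hypothetical names)."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = rng or random.Random()

    def select_arm(self):
        # With probability epsilon pick a random arm (explore),
        # otherwise pick the arm with the best estimate so far (exploit).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward of the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Run the bandit against simulated 0/1 rewards (assumed reward model)."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(len(true_rates), epsilon, rng=rng)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    result = simulate([0.1, 0.5, 0.9])
    print("pulls per arm:", result.counts)
    print("estimated rates:", [round(v, 3) for v in result.values])
```

In a real test you would replace the simulated rewards with your own data and try a few values of `epsilon`, since the best setting depends on the data.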
For R, I'm not sure. But after a quick search online, I found someone who has implemented bandits in R. Here is the code: https://github.com/lotze/bandit
Hope this helps you.