Research

Proximal Policy Optimization (PPO) https://arxiv.org/abs/1707.06347
Multi-Agent DDPG https://github.com/openai/maddpg
Monte Carlo Tree Search https://gnunet.org/sites/default/files/Browne%20et%20al%20-%20A%20survey%20of%20MCTS%20methods.pdf
Monte Carlo Tree Search and Reinforcement Learning https://www.jair.org/media/5507/live-5507-10333-jair.pdf
Cooperative Multi-Agent Learning https://link.springer.com/article/10.1007/s10458-005-2631-2
Opponent Modeling in Deep Reinforcement Learning http://www.umiacs.umd.edu/~hal/docs/daume16opponent.pdf
Machine Theory of Mind https://arxiv.org/pdf/1802.07740.pdf
Coordinated Multi-Agent Imitation Learning https://arxiv.org/pdf/1703.03121.pdf
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games https://arxiv.org/pdf/1603.01121.pdf and http://proceedings.mlr.press/v37/heinrich15.pdf
Autonomous Agents Modelling Other Agents http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/AIJ18-Albrecht.pdf
