Proximal Policy Optimization Algorithms

Authors: John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
Year: 2017