Trust Region Policy Optimization
Authors: John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Year: 2015
Topic: Reinforcement Learning