Survive or Collapse: The Asymmetric Roles of Data Gating and Reward Grounding in Self-Play RL
Preprint, Reinforcement Learning
Authors: Sophia Xiao Pu, Zhaotian Weng, Chengzhi Liu, Jayanth Srinivasa, Gaowen Liu, William Yang Wang, Xin Eric Wang
Year: 2026