Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs
Authors: Pierre Boudart, Pierre Gaillard, Alessandro Rudi
Year: 2026