Vector Policy Optimization: Training for Diversity Improves Test-Time Search | Steady Practice