[MM] GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
[MobileUI] From Perception to Reasoning: Enhancing Vision-Language Models for Mobile UI Understanding
[WebGUI] ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[RL] SimPO: Simple Preference Optimization with a Reference-Free Reward
[RL] From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function