ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[WebGUI] ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[RL] SimPO: Simple Preference Optimization with a Reference-Free Reward
[RL] From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function
[WebAgent] Learn-by-Interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments
[Layout] AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models