[Agent] BannerAgency: Advertising Banner Design with Multimodal LLM Agents
[MM] GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
[MobileUI] From Perception to Reasoning: Enhancing Vision-Language Models for Mobile UI Understanding
[WebGUI] ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[RL] SimPO: Simple Preference Optimization with a Reference-Free Reward