UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning
[WebAgent] UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning
[Chart] CHARTEDIT: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
[WebAgent] A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
[Layout] PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation
[Layout] DocMark: Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding