A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
[WebAgent] A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
[MLLM] InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
[WebAgent] ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data
[WebAgent] UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning
[Chart] CHARTEDIT: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing