[MM] VisionZip: Longer is Better but Not Necessary in Vision Language Models
[OD] CoDETR: DETRs with Collaborative Hybrid Assignments Training
[MM] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
[MM] Dense Connector for MLLMs
[Layout] VLT: Interactively Optimizing Layout Transfer for Vector Graphics