[MM] MMICL: EMPOWERING VISION-LANGUAGE MODEL WITH MULTI-MODAL IN-CONTEXT LEARNING
[MM] A Survey on Multimodal Large Language Models
[LG][CA] PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout
[LG] CGL-LO: Constrained Graphic Layout Generation via Latent Optimization
[DLA] RoDLA: Benchmarking the Robustness of Document Layout Analysis Models