[Layer] LayerD: Decomposing Raster Graphic Designs into Layers


1. Motivation

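Many graphic designers do their editing on graphic designs stored in a layer representation.
Authoring tools such as Adobe Photoshop and PowerPoint also export the finished design as a raster image, and that export loses the layer information.
Without layer information, a raster image is hard to edit.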

$\to$ So the paper proposes a process for the inverse composition task: decomposing a raster image back into layers.

The paper proposes LayerD, which combines iterative top-layer matting with background completion (inpainting).
- top-layer: the objects that appear frontmost with no occlusion (mostly typography)
- LayerD
  - A top-layer matting model trained on a high-quality graphic design dataset
  - An off-the-shelf inpainting model
  $\to$ Simpler than existing approaches (detection + segmentation + layer ordering)
It also proposes new quality metrics for evaluation.
- Dynamic Time Warping (DTW), used to align layer sequences and measure the distance between them
It achieves SOTA performance on the layer decomposition task.
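
The iterative matting and completion loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `decompose`, `matte_top_layer`, `inpaint`, `max_layers`, and `eps` are hypothetical names, and the two toy functions only stand in for the trained matting model and the off-the-shelf inpainting model.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # toy grayscale image, values in [0, 1]
Alpha = List[List[float]]  # per-pixel opacity in [0, 1]


def decompose(
    image: Image,
    matte_top_layer: Callable[[Image], Alpha],
    inpaint: Callable[[Image, Alpha], Image],
    max_layers: int = 8,   # hypothetical safety cap
    eps: float = 1e-3,     # hypothetical "nothing left to peel" threshold
) -> Tuple[List[Tuple[Image, Alpha]], Image]:
    """Peel layers front to back until the matting step finds nothing."""
    layers = []
    current = image
    for _ in range(max_layers):
        alpha = matte_top_layer(current)
        if max(max(row) for row in alpha) < eps:
            break
        # Placeholder: keep the current image together with its alpha as the layer.
        layers.append((current, alpha))
        # Fill the region behind the removed layer.
        current = inpaint(current, alpha)
    return layers, current


def toy_matte(img: Image) -> Alpha:
    """Toy stand-in: the brightest non-zero pixels form the top layer."""
    top = max(max(row) for row in img)
    return [[1.0 if top > 0 and v == top else 0.0 for v in row] for row in img]


def toy_inpaint(img: Image, alpha: Alpha) -> Image:
    """Toy stand-in: fill the matted pixels with the background value 0."""
    return [[0.0 if a > 0 else v for v, a in zip(r, ar)] for r, ar in zip(img, alpha)]


layers, background = decompose([[0.0, 0.5, 1.0]], toy_matte, toy_inpaint)
```

With the toy input, the loop peels the brightest pixel first, then the mid-gray one, and stops when only the flat background remains.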

Studies that reconstruct the original image through alpha compositing
Studies on object-level decomposition of natural images (not graphic images)
- A combination of instance segmentation + depth estimation + background completion
Studies that decompose graphic images with a VLM
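
For reference, alpha compositing with the standard "over" operator computes `out = alpha * foreground + (1 - alpha) * background` per pixel, applied from the backmost layer to the frontmost. The sketch below uses plain nested lists; the function name `composite` and the data layout are assumptions made for illustration.

```python
def composite(background, layers):
    """Composite layers over a background with the "over" operator.

    background: rows of (r, g, b) tuples with values in [0, 1]
    layers: list of (rgb, alpha) pairs ordered back to front, where rgb has
            the same layout as background and alpha holds floats in [0, 1]
    """
    out = [[tuple(px) for px in row] for row in background]
    for rgb, alpha in layers:
        for y, row in enumerate(out):
            for x, below in enumerate(row):
                a = alpha[y][x]
                row[x] = tuple(a * f + (1 - a) * b for f, b in zip(rgb[y][x], below))
    return out


# A half-transparent red layer over a white 1x1 background.
white = [[(1.0, 1.0, 1.0)]]
red_layer = ([[(1.0, 0.0, 0.0)]], [[0.5]])
result = composite(white, [red_layer])
```

Layer decomposition is the inverse of this operation: given only `result`, recover the layers and their alpha maps.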

Image Matting: the task of predicting alpha mattes for objects in an image, together with background inpainting
Foreground color estimation: the task of determining the foreground color
- There are energy-based and deep-learning-based approaches
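
One simple way to estimate a foreground color, when both the alpha value and the background color are known, is to invert the blending equation `I = alpha * F + (1 - alpha) * B`, which gives `F = (I - (1 - alpha) * B) / alpha`. The sketch below is a per-pixel illustration under that assumption; `estimate_foreground` and `eps` are hypothetical names, and the clamping to [0, 1] is an added safeguard rather than something taken from the paper.

```python
def estimate_foreground(pixel, background, alpha, eps=1e-6):
    """Invert I = alpha * F + (1 - alpha) * B for a single pixel.

    pixel, background: (r, g, b) tuples with values in [0, 1]
    Returns None where alpha is (almost) zero, since F is undefined there.
    """
    if alpha < eps:
        return None
    return tuple(
        min(1.0, max(0.0, (i - (1 - alpha) * b) / alpha))
        for i, b in zip(pixel, background)
    )


# A pink pixel over a white background at alpha 0.5 implies a red foreground.
fg = estimate_foreground((1.0, 0.5, 0.5), (1.0, 1.0, 1.0), 0.5)
```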

The idea is to improve layer decomposition quality by exploiting the prior knowledge that graphic designs often contain flat, textureless elements and backgrounds.

Background refinement
- Extract the target area from the alpha map predicted by $F_{\theta}$, compute the color gradient around it, and extract the dominant colors (palette) by percentage.
- Assign the RGB value that is closest in Lab color space
- Used to reduce the artifacts generated in the background predicted by the background completion model
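
The palette extraction and nearest-color assignment above can be sketched roughly as follows. This is an illustration under assumptions, not the paper's code: `dominant_palette`, `snap_to_palette`, and the `min_share` threshold are hypothetical, the color-gradient step is omitted, and the sRGB to Lab conversion uses the standard D65 formulas.

```python
from collections import Counter


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE Lab (D65 white point)."""
    lin = []
    for c in rgb:
        c = c / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.0
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def dominant_palette(pixels, min_share=0.2):
    """Keep colors that cover at least min_share of the sampled pixels."""
    counts = Counter(pixels)
    total = len(pixels)
    return [color for color, n in counts.most_common() if n / total >= min_share]


def snap_to_palette(rgb, palette):
    """Return the palette color closest to rgb in Lab space."""
    lab = srgb_to_lab(rgb)
    return min(
        palette,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(lab, srgb_to_lab(p))),
    )


# Pixels sampled around a target area: mostly red, some blue, one stray green.
samples = [(255, 0, 0)] * 6 + [(0, 0, 255)] * 3 + [(10, 200, 10)]
palette = dominant_palette(samples)
fixed = snap_to_palette((250, 10, 10), palette)  # a slightly off red artifact
```

Snapping an inpainted pixel to a palette color removes small color drift in regions that should be flat.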
Foreground refinement
- A rule-based refinement is applied to reduce artifacts from foreground matting
  - Split the alpha map into connected regions
  - Compute the color gradient per region and classify regions with zero color gradient as flat regions
  - Define a new mask from the regions that match a palette color of the input image (regions above a threshold)
    - Within the extracted mask, compute the alpha values based on the top-layer matting and the background
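
The first two steps of the foreground refinement (splitting into connected regions and finding flat regions) can be sketched as below. The function names and the choice of 4-connectivity are assumptions; "zero color gradient" is simplified here to "every pixel in the region has the same color", which is equivalent for a connected region.

```python
from collections import deque


def connected_components(mask):
    """4-connected labeling of a binary mask.

    Returns a list of regions, each a list of (y, x) coordinates.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, region = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def is_flat(image, region):
    """A connected region is flat when all of its pixels share one color."""
    return len({image[y][x] for y, x in region}) == 1


RED, BLUE, WHITE = (255, 0, 0), (0, 0, 255), (255, 255, 255)
mask = [[1, 1, 0, 1],
        [0, 0, 0, 1]]
image = [[RED, RED, WHITE, RED],
         [WHITE, WHITE, WHITE, BLUE]]
regions = connected_components(mask)
flat_flags = [is_flat(image, r) for r in regions]
```

In this toy input the left region is uniformly red and counts as flat, while the right region mixes red and blue and does not.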

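When the number of predicted layers differs from the number of ground-truth layers, a DTW-based, order-aware layer alignment is applied before computing the metrics
- Visual quality: the mean of the per-layer similarity (e) over the S layers
- Granularity: the number of edits required (lower is better); merging adjacent layers is allowed (edit distance +1)
  - Defined as (-1) * (soft IoU) + the L1 distance of the RGB values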
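
The order-aware alignment can be illustrated with a textbook DTW over a pairwise cost matrix. This is a generic sketch, not the paper's evaluation code: `dtw_align` is a hypothetical name, and the cost matrix below is made up, whereas in the metric each entry would come from a layer distance such as the one described above.

```python
def dtw_align(cost):
    """Classic dynamic time warping over a cost matrix.

    cost[i][j] is the distance between predicted layer i and ground-truth
    layer j. Returns (total_cost, path), where path is a monotonic list of
    (i, j) index pairs covering both sequences.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
            )
    # Backtrack from the end to recover which layers were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][m], path


# Three predicted layers against two ground-truth layers (made-up costs).
cost = [
    [0.0, 1.0],
    [0.1, 1.0],
    [1.0, 0.0],
]
total, path = dtw_align(cost)
```

Here the first two predicted layers both align to the first ground-truth layer, which is the kind of many-to-one match that appears when the layer counts differ.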

Dataset
- Trained on Crello data
  - train/val/test = 19,478 / 1,852 / 1,971
  - train/val pairs after conversion to layered samples = 48,725 / 4,674
- Because text has unique domain characteristics, text elements are excluded from decomposition in the evaluation.
Model
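- Top-layer matting model: Swin-L + BiRefNet
Quantitative results
- Granularity (RGB-L1 / soft IoU)
  - Although text decomposition is excluded at evaluation time, the paper reports that including text in training improves performance
  - The authors argue this is because text is also a variant of vector shapes, so it helps decomposition
Qualitative evaluation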
Ablation
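- Analysis with and without refinement
  - Naive: a version that replaces alpha matting with the predicted mask
  - Color est.: a version with inverse blending applied
  - Color est. + BG ref.: a version with inverse blending + BG refinement applied
- The corresponding qualitative results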