Rishika Bera

SKETCH TO REAL

Sketch2Real generates photorealistic images from colored sketches using a conditional diffusion model built on a U-Net architecture, trained on the COCO dataset.

Sketch Generation

Given a COCO image, a paired sketch is generated by converting the image to grayscale, inverting it, applying a Gaussian blur, thresholding the result to extract edges, and then coloring those edge pixels with the corresponding colors from the original image.
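The steps above can be sketched in plain NumPy as follows. This is a minimal illustration, not the project's actual code: the function names, `sigma`, and `threshold` are hypothetical, and the color-dodge blend used to turn the blurred image into an edge map is an assumption borrowed from the common "pencil sketch" recipe, since the description does not say how the blur and threshold are combined.

```python
import numpy as np

def gaussian_blur(img, sigma=3.0):
    """Separable Gaussian blur of a 2-D float array, with reflect padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    # Blur along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def colored_sketch(rgb_uint8, sigma=3.0, threshold=0.9):
    """Return (sketch, edge_mask) for an (H, W, 3) uint8 image."""
    rgb = rgb_uint8.astype(np.float64) / 255.0
    gray = rgb @ np.array([0.299, 0.587, 0.114])   # 1. grayscale
    inverted = 1.0 - gray                          # 2. invert
    blurred = gaussian_blur(inverted, sigma)       # 3. Gaussian blur
    # ASSUMPTION: color-dodge blend, so flat regions become white (about 1.0)
    # and only pixels next to a brightness change stay dark.
    dodge = np.clip(gray / np.maximum(1.0 - blurred, 1e-6), 0.0, 1.0)
    edges = dodge < threshold                      # 4. threshold for edges
    sketch = np.ones_like(rgb)                     # white canvas
    sketch[edges] = rgb[edges]                     # 5. recolor edges from the original
    return (sketch * 255).round().astype(np.uint8), edges
```

Calling `colored_sketch(image)` on an `(H, W, 3)` array returns the colored sketch together with the boolean edge mask; lowering `threshold` keeps fewer, stronger edges.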

[Figure: Sketch generation]

How it Works

The model reverses a diffusion process: it starts from Gaussian noise and iteratively denoises it, conditioned on the sketch. At each step the U-Net takes the noisy image and the sketch, stacked into a 6-channel input, plus a timestep embedding, and predicts the noise to remove.
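A rough NumPy sketch of that loop is shown here. Everything in it is illustrative: `predict_noise` is a hypothetical stand-in for the trained U-Net, the sinusoidal timestep embedding and the deterministic DDIM-style update are assumptions (the write-up does not name the embedding or the sampler), and `alpha_bar` is assumed to hold the cumulative signal level for each timestep.

```python
import numpy as np

def timestep_embedding(t, dim=32, max_period=10000.0):
    """Sinusoidal embedding of an integer timestep (assumed embedding type)."""
    half = dim // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def model_input(noisy, sketch):
    """Stack a noisy RGB image and an RGB sketch, both (3, H, W), into (6, H, W)."""
    return np.concatenate([noisy, sketch], axis=0)

def sample(predict_noise, sketch, alpha_bar, rng):
    """Run the reverse process from pure Gaussian noise down to t = 0.

    predict_noise(x6, t, t_emb) is a placeholder for the U-Net and must
    return an array shaped like the image. alpha_bar[t] decreases with t.
    """
    x = rng.standard_normal(sketch.shape)
    for t in range(len(alpha_bar) - 1, -1, -1):
        eps = predict_noise(model_input(x, sketch), t, timestep_embedding(t))
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else 1.0
        # Estimate the clean image, then step to the previous noise level
        # (deterministic DDIM-style update; an assumption, see above).
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps
    return x
```

With a real network, `predict_noise` would wrap the U-Net forward pass; the sketch stays fixed across all steps, which is what makes the generation conditional.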

Model 1 — Proof of Concept

Trained on 5K images at 128×128 with a cosine noise schedule and L1 loss. Training plateaued around epoch 160: the dataset was too small for the model to generalize further.
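For reference, a cosine noise schedule and an L1 noise-prediction loss can be written as below. This is a sketch under assumptions: it uses the widely cited formulation with a small offset `s = 0.008`, which may differ from the exact variant used here, and the function names are hypothetical.

```python
import numpy as np

def cosine_alpha_bar(num_steps, s=0.008):
    """Cumulative signal level alpha_bar[t] for t = 0..num_steps (assumed variant)."""
    t = np.arange(num_steps + 1) / num_steps
    f = np.cos((t + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]  # normalize so alpha_bar[0] == 1 (no noise at t = 0)

def add_noise(x0, eps, alpha_bar_t):
    """Forward process: blend the clean image x0 with Gaussian noise eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def l1_loss(predicted_noise, true_noise):
    """Mean absolute error between predicted and true noise."""
    return np.mean(np.abs(predicted_noise - true_noise))
```

During training, a random timestep is drawn, `add_noise` produces the noisy image, and the loss compares the network's output with the `eps` that was added.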

[Figures: epochs 160, 170, 180, 190, 200, 210, 220, 230, and 240]

Model 2 — Full Scale

Trained on 118K images at 256×256 with an offset cosine schedule, MSE loss, and the CosineAnnealingLR learning-rate scheduler. Results are significantly sharper than Model 1's.
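The three ingredients named above can be illustrated as follows. Treat this as a sketch with assumed details: the offset cosine schedule follows the form used in the Keras DDIM example, with its default signal-rate bounds of 0.02 and 0.95 (the values used in this project are not stated), and `cosine_annealing_lr` reproduces the closed-form expression documented for PyTorch's `CosineAnnealingLR` rather than calling PyTorch itself.

```python
import math
import numpy as np

def offset_cosine_schedule(t, min_signal_rate=0.02, max_signal_rate=0.95):
    """Signal and noise rates for diffusion time t in [0, 1].

    The bounds keep the schedule away from the extremes, so the image is
    never fully clean or fully noise. Default bounds are assumptions.
    """
    start_angle = math.acos(max_signal_rate)
    end_angle = math.acos(min_signal_rate)
    angles = start_angle + np.asarray(t) * (end_angle - start_angle)
    return np.cos(angles), np.sin(angles)  # signal_rate, noise_rate

def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Learning rate after `epoch` epochs of cosine annealing over t_max epochs."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))

def mse_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    return np.mean((predicted_noise - true_noise) ** 2)
```

A noisy training image is then `signal_rate * image + noise_rate * noise`, and because the two rates are the cosine and sine of the same angle, their squares always sum to one.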

[Figures: epochs 5, 10, 20, 35, 50, and 70]

Check out the GitHub repo for full architecture details, training instructions, and dataset setup.