Refer to the attached project report
Authors: Varun Sundar, Abhay Kumar, Kalyani Unnikrishnan and Kriti Goyal.
Pre-requisites:

- conda

Run `make install.cpu` or `make install.gpu` as required. An environment named `clip` will be created.

Run `make help` to list all available commands.
Copy your WandB API key to `wandb_api.key`; it will be used to log in to your dashboard for visualisation. Alternatively, you can skip W&B visualisation by setting `wandb.use=False` when running the python code, or `USE_WANDB=False` when running make commands.
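One way the key file might be consumed is sketched below. This is a hypothetical helper (`load_wandb_key` is not part of the codebase, and the make targets may wire the key up differently); it relies only on the documented `WANDB_API_KEY` environment variable that `wandb` reads at login.

```python
import os


def load_wandb_key(path: str = "wandb_api.key") -> str:
    """Read the W&B API key from a file and expose it via the environment,
    which wandb picks up automatically on login."""
    with open(path) as f:
        key = f.read().strip()
    os.environ["WANDB_API_KEY"] = key
    return key
```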
Checkpoints are available at `outputs/ckpt`:

- `rosalinity-stylegan2-ffhq-config-f.pt`: from here, suitable ONLY for 256px.
- `stylegan2-ffhq-config-f.pt`: used in the e4e paper; looks like the converted NVLabs ckpt. Suitable for 1 megapixel (1024px).
- FFHQ | `data/ffhq`: each image has the original, an estimated latent vector, and its inversion.
- CelebA-HQ | `data/celeba-hq`: has originals, captions, latents (for a few) and inversions (for a few).

```
data/celeba-hq
|-- caption (30K txt files)
|-- latents (if you want to set GT to the inversion)
`-- rgb (30K jpg files)
```
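Given that layout, `caption/` and `rgb/` files presumably share basenames (e.g. `rgb/00001.jpg` ↔ `caption/00001.txt`). A hypothetical pairing helper, assuming that naming convention (this function is not part of the repo):

```python
from pathlib import Path


def pair_caption_rgb(root):
    """Yield (image_path, caption_path) pairs for basenames present in both
    rgb/ and caption/ under the celeba-hq root."""
    root = Path(root)
    for img in sorted((root / "rgb").glob("*.jpg")):
        cap = root / "caption" / (img.stem + ".txt")
        if cap.exists():  # captions may only exist for a subset
            yield img, cap
```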
```shell
python stylegan_optimizer.py task=super_resolution img=celeba-hq exp_name=stylegan_metrics img.index='range(0,100)' -m
```

For `range` syntax, see here, under the section "Range Sweep".
If you wish to initialize the ground truth explicitly from a StyleGAN latent file:

```shell
python stylegan_optimizer.py task=super_resolution img=celeba-hq-latent exp_name=stylegan_metrics img.index='range(0,100)' -m
```
You can also run across multiple tasks (not recommended at first; indeed, you may want to run different images for different tasks too):

```shell
python stylegan_optimizer.py task=super_resolution,lensless img=celeba-hq-latent exp_name=stylegan_metrics img.index='range(0,100)' -m
```

This runs across the cartesian product of tasks × range.
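The multirun expansion above is just a cartesian product; a small illustrative sketch of how hydra's `-m` flag turns the comma list and the range sweep into individual jobs (the job tuples here are illustrative, not hydra's actual internal representation):

```python
from itertools import product

# task=super_resolution,lensless x img.index='range(0,100)' expands to:
tasks = ["super_resolution", "lensless"]
indices = range(0, 100)
jobs = list(product(tasks, indices))
print(len(jobs))  # 2 tasks x 100 indices = 200 launched runs
```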
- Choosing between latent and image: we decide by looking at the extension of the path in `img.path`. If it is one of `.pth`, `.zip`, we load it using `torch.load` and treat it as a latent for StyleGANv2. Else, if it is one of `.png`, `.jpg`, `.jpeg`, we load the image using OpenCV. See `utils.data.load_latent_or_img`.
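The extension dispatch described above can be sketched as follows. This is illustrative only: the real loader is `utils.data.load_latent_or_img`, and the actual `torch.load`/`cv2.imread` calls are indicated in comments rather than executed.

```python
from pathlib import Path

LATENT_EXTS = {".pth", ".zip"}          # would be loaded with torch.load
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}  # would be loaded with cv2.imread


def classify(path: str) -> str:
    """Decide whether img.path points at a StyleGANv2 latent or an image."""
    ext = Path(path).suffix.lower()
    if ext in LATENT_EXTS:
        return "latent"
    if ext in IMAGE_EXTS:
        return "image"
    raise ValueError(f"unsupported extension: {ext}")
```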
- StyleGAN weights: we use the weights from the e4e repository, which appear to be the NVLabs (TensorFlow) weights ported to PyTorch via rosinality. The weights rosinality trained from scratch (not ported), however, have poor fidelity.
- Caption: string vs text file: if a string is supplied, we use it directly. Otherwise, we load the full text file or (if provided) a limited number of lines. See `utils.data.load_caption`.
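That rule can be sketched as below. The signature is hypothetical (the real implementation is `utils.data.load_caption`, which may take different arguments); this sketch assumes captions stored as `.txt` files, as in the CelebA-HQ tree above.

```python
from pathlib import Path
from typing import Optional


def load_caption(caption: str, max_lines: Optional[int] = None) -> str:
    """If `caption` is a literal string, return it as-is; if it names an
    existing .txt file, read the whole file or the first `max_lines` lines."""
    p = Path(caption)
    if p.suffix == ".txt" and p.exists():
        lines = p.read_text().splitlines()
        if max_lines is not None:
            lines = lines[:max_lines]
        return "\n".join(lines)
    return caption
```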
- Verbosity: use `+silent=True` to suppress printing configs.
- LSUN Church dataset: seems to be 256px.
- Stanford Cars: 512px.
To inspect the composed config:

```shell
python stylegan_optimizer.py --cfg job
```

We use hydra for configs; the YAML files are present under `conf/`.
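From the overrides used in this README (`task=...`, `img=...`, `exp_name=...`, `wandb.use=False`), the layout of `conf/` might look roughly like the sketch below. The group and option names are guesses, not the repo's actual files:

```yaml
# conf/config.yaml (hypothetical layout)
defaults:
  - task: super_resolution   # overridden via task=...
  - img: celeba-hq           # overridden via img=...

exp_name: default
wandb:
  use: true                  # disable with wandb.use=False
```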