A collection of papers on image synthesis.
- 🔥 2024.7: A new awesome list dedicated to diffusion-model papers: diffusion.md
Note: the awesome list below is no longer maintained; the cutoff is 2022.
```mermaid
flowchart TB
    GAN[VanillaGAN, 2014] -- architecture tricks --> DCGAN[DCGAN, 2016]
    DCGAN -- progressive growing --> PG[PG-GAN, 2018]
    PG --> BigGAN[BigGAN, 2019]
    PG -- AdaIN, mapping network --> SG1[StyleGAN, 2019]
    SG1 -- weight demodulation --> SG2[StyleGAN2, 2020]
    SG2 -- translation and rotation equivariance --> SG3[StyleGAN3, 2021]
    DCGAN -- autoregressive transformer \n for vision tokens --> VQGAN
    VQGAN -- transformer architecture \n for generator and discriminator --> TransGAN
```
Generative adversarial nets.
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.
NeurIPS 2014. [PDF] [Tutorial] Cited:2075
DCGAN
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.
Alec Radford, Luke Metz, Soumith Chintala.
ICLR 2016. [PDF] Cited:13117
PG-GAN
Progressive Growing of GANs for Improved Quality, Stability, and Variation.
Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen.
ICLR 2018. [PDF] Cited:6527
StyleGAN
A Style-Based Generator Architecture for Generative Adversarial Networks.
Tero Karras, Samuli Laine, Timo Aila.
CVPR 2019. [PDF] Cited:8707
BigGAN
Large Scale GAN Training for High Fidelity Natural Image Synthesis.
Andrew Brock, Jeff Donahue, Karen Simonyan.
ICLR 2019. [PDF] Cited:4748
StyleGAN2
Analyzing and Improving the Image Quality of StyleGAN.
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila.
CVPR 2020. [PDF] Cited:4863
VQGAN
Taming Transformers for High-Resolution Image Synthesis
Patrick Esser, Robin Rombach, Björn Ommer.
CVPR 2021. [PDF] [Project] Cited:1969
TransGAN
TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up
Yifan Jiang, Shiyu Chang, Zhangyang Wang.
CVPR 2021. [PDF] [Pytorch] Cited:312
StyleGAN3
Alias-Free Generative Adversarial Networks.
Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, Timo Aila.
NeurIPS 2021. [PDF] [Project] Cited:1272
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Bowen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo.
CVPR 2022. [PDF] Cited:171
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Axel Sauer, Katja Schwarz, Andreas Geiger
SIGGRAPH 2022. [PDF] Cited:351
summary
1. Proposes architecture changes on top of Projected GANs: (1) Regularization: apply path-length regularization only after the model has been sufficiently trained, and blur all images with a Gaussian filter for the first 200k images (see the sketch below). (2) Reduce the latent code z to 64 dimensions while keeping the w code at 512 dimensions. (3) Use a pretrained class embedding as conditioning and make it learnable. 2. Designs a progressive-growing strategy for StyleGAN3. 3. Leverages classifier guidance.
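A minimal PyTorch sketch of the input blur from point (1); the linear fade schedule, the maximum sigma, and the function names are illustrative assumptions, not the official implementation:

```python
import torch
import torchvision.transforms.functional as TF

def blur_sigma(seen_images: int, fade_images: int = 200_000, max_sigma: float = 2.0) -> float:
    # Fade the blur strength to zero over the first `fade_images` training
    # images (the linear schedule and max_sigma are assumptions).
    t = max(0.0, 1.0 - seen_images / fade_images)
    return max_sigma * t

def blur_for_discriminator(x: torch.Tensor, seen_images: int) -> torch.Tensor:
    # Blur both real and generated images before they reach the discriminator.
    sigma = blur_sigma(seen_images)
    if sigma < 0.1:
        return x
    kernel = int(3 * sigma) * 2 + 1  # odd kernel size covering ~3 sigma
    return TF.gaussian_blur(x, kernel_size=kernel, sigma=sigma)
```

A Large-Scale Study on Regularization and Normalization in GANs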
Karol Kurach, Mario Lucic, Xiaohua Zhai, Marcin Michalski, Sylvain Gelly
ICML 2019. [PDF] Cited:147
EB-GAN
Energy-based Generative Adversarial Networks
Junbo Zhao, Michael Mathieu, Yann LeCun.
ICLR 2017. [PDF] Cited:1089
Towards Principled Methods for Training Generative Adversarial Networks
Martin Arjovsky, Léon Bottou
ICLR 2017. [PDF] Cited:1974
LSGAN
Least Squares Generative Adversarial Networks.
Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley.
ICCV 2017. [PDF] Cited:4221
WGAN
Wasserstein GAN
Martin Arjovsky, Soumith Chintala, Léon Bottou.
ICML 2017. [PDF] Cited:4582
GGAN
Geometric GAN
Jae Hyun Lim, Jong Chul Ye.
arxiv 2017. [PDF] Cited:477
AC-GAN
Conditional Image Synthesis With Auxiliary Classifier GANs
Augustus Odena, Christopher Olah, Jonathon Shlens.
ICML 2017. [PDF] Cited:2975
cGANs with Projection Discriminator
Takeru Miyato, Masanori Koyama.
ICLR 2018. [PDF] Cited:728
S³-GAN
High-Fidelity Image Generation With Fewer Labels
Mario Lucic*, Michael Tschannen*, Marvin Ritter*, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly.
ICML 2019. [PDF] [Tensorflow] Cited:149
VAE
Auto-Encoding Variational Bayes.
Diederik P. Kingma, Max Welling.
ICLR 2014. [PDF] Cited:17730
AAE
Adversarial Autoencoders.
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey.
arxiv 2015. [PDF] Cited:2122
VAE/GAN
Autoencoding beyond pixels using a learned similarity metric.
Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther.
ICML 2016. [PDF] Cited:1900
VampPrior
VAE with a VampPrior
Jakub M. Tomczak, Max Welling.
AISTATS 2018. [PDF] [Pytorch] Cited:578
BiGAN
Adversarial Feature Learning
Jeff Donahue, Philipp Krähenbühl, Trevor Darrell.
ICLR 2017. [PDF] Cited:1758
ALI
Adversarial Learned Inference
Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville.
ICLR 2017. [PDF] Cited:1286
VEEGAN
Veegan: Reducing mode collapse in gans using implicit variational learning.
Akash Srivastava, Lazar Valkov, Chris Russell, Michael U. Gutmann, Charles Sutton.
NeurIPS 2017. [PDF] [Github] Cited:626
AGE
Adversarial Generator-Encoder Networks.
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky.
AAAI 2018. [PDF] [Pytorch] Cited:129
IntroVAE
IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis.
Huaibo Huang, Zhihang Li, Ran He, Zhenan Sun, Tieniu Tan.
NeurIPS 2018. [PDF] Cited:232
Disentangled Inference for GANs with Latently Invertible Autoencoder
Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou
IJCV 2020. [PDF] Cited:29
ALAE
Adversarial Latent Autoencoders
Stanislav Pidhorskyi, Donald Adjeroh, Gianfranco Doretto.
CVPR 2020. [PDF] Cited:238
Variational Inference with Normalizing Flows
Danilo Jimenez Rezende, Shakir Mohamed
ICML 2015. [PDF] Cited:3619
Improved Variational Inference with Inverse Autoregressive Flow
Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling
NeurIPS 2016. [PDF] Cited:1681
NVAE: A Deep Hierarchical Variational Autoencoder
Arash Vahdat, Jan Kautz
NeurIPS 2020. [PDF] Cited:735
Improved techniques for training score-based generative models.
Yang Song, Stefano Ermon
NeurIPS 2020. [PDF] Cited:864
DDPM
Denoising Diffusion Probabilistic Models
Jonathan Ho, Ajay Jain, Pieter Abbeel
NeurIPS 2020. [PDF] Cited:9830
Score-based generative modeling through stochastic differential equations
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole
ICLR 2021. [PDF] Cited:3827
Improved-DDPM
Improved Denoising Diffusion Probabilistic Models
Alex Nichol, Prafulla Dhariwal
ICML 2021. [PDF] Cited:2343
Variational Diffusion Models.
Diederik P. Kingma, Tim Salimans, Ben Poole, Jonathan Ho
NeurIPS 2021. [PDF] Cited:767
Guided-Diffusion
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal, Alex Nichol
NeurIPS 2021. [PDF] Cited:4798
Classifier-Free Diffusion Guidance.
Jonathan Ho, Tim Salimans
NeurIPS 2021. [PDF] Cited:2201
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon
ICLR 2022. [PDF] Cited:904
DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
Gwanghyun Kim, Taesung Kwon, Jong Chul Ye
CVPR 2022. [PDF] Cited:443
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami, Dani Lischinski, Ohad Fried
CVPR 2022. [PDF] Cited:648
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, Mark Chen
ICML 2022. [PDF] Cited:2497
Palette: Image-to-Image diffusion models.
Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi
SIGGRAPH 2022. [PDF] Cited:1122
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc Van Gool
CVPR 2022 [PDF] Cited:918
DC-IGN
Deep Convolutional Inverse Graphics Network
Tejas D. Kulkarni, Will Whitney, Pushmeet Kohli, Joshua B. Tenenbaum.
NeurIPS 2015. [PDF]
InfoGAN
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel.
NeurIPS 2016. [PDF] Cited:4014
Beta-VAE
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
I. Higgins, L. Matthey, Arka Pal, Christopher P. Burgess, Xavier Glorot, M. Botvinick, S. Mohamed, Alexander Lerchner.
ICLR 2017. [PDF]
AnnealedVAE
Understanding disentangling in β-VAE
Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner.
NeurIPS 2017. [PDF] Cited:768
Factor-VAE
Disentangling by Factorising
Hyunjik Kim, Andriy Mnih.
NeurIPS 2017. [PDF] Cited:1219
DCI
A framework for the quantitative evaluation of disentangled representations.
Cian Eastwood, Christopher K. I. Williams
ICLR 2018. [PDF]
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations.
Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem.
ICML 2019 (best paper award). [PDF] Cited:1301
WGAN-GP
Improved training of wasserstein gans
Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville.
NeurIPS 2017. [PDF] Cited:8625
The Numerics of GANs
Lars Mescheder, Sebastian Nowozin, Andreas Geiger
NeurIPS 2017. [PDF] Cited:440
R1-regularization
Which Training Methods for GANs do actually Converge?
Lars Mescheder, Andreas Geiger, Sebastian Nowozin.
ICML 2018. [PDF] Cited:1348
SN-GAN
Spectral Normalization for Generative Adversarial Networks.
Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida.
ICLR 2018. [PDF] Cited:4071
CR-GAN
Consistency regularization for generative adversarial networks.
Han Zhang, Zizhao Zhang, Augustus Odena, Honglak Lee.
ICLR 2020. [PDF] Cited:255
Summary
Motivation: GAN training is unstable, and traditional regularization methods introduce non-trivial computational overhead. The discriminator tends to focus on local features instead of semantic information, so images of different semantic objects may lie close in the discriminator's feature space merely because they share a viewpoint.
Method: Restrict the discriminator's intermediate features to be consistent under data augmentations of the same image; the generator is unchanged (see the sketch below).
Experiment: (1) Augmentation details: randomly shifting the image by a few pixels and randomly flipping it horizontally. (2) Effect of CR: improves the FID of generated images. (3) Ablation study: training with data augmentation alone prevents the discriminator from overfitting the training data but does not improve FID; the authors attribute the gain to consistency regularization further enforcing a semantic representation in the discriminator.
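A minimal PyTorch sketch of the consistency-regularization term; the augmentation, the toy discriminator, and the weight `lambda_cr` are illustrative assumptions, not the paper's exact setup:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_shift_flip(x: torch.Tensor, max_shift: int = 4) -> torch.Tensor:
    # Semantics-preserving augmentation: random horizontal flip plus a small
    # random translation, matching the augmentations described above.
    if torch.rand(()) < 0.5:
        x = torch.flip(x, dims=[3])
    dy, dx = torch.randint(-max_shift, max_shift + 1, (2,))
    return torch.roll(x, shifts=(int(dy), int(dx)), dims=(2, 3))

def cr_loss(D: nn.Module, real: torch.Tensor, lambda_cr: float = 10.0) -> torch.Tensor:
    # Penalize the discriminator for changing its output under augmentation.
    return lambda_cr * F.mse_loss(D(random_shift_flip(real)), D(real))

# Toy usage (illustrative discriminator only):
D = nn.Sequential(nn.Conv2d(3, 8, 3, 2, 1), nn.Flatten(), nn.LazyLinear(1))
x = torch.randn(4, 3, 32, 32)
loss = cr_loss(D, x)  # added on top of the usual adversarial loss for D
```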
Differentiable Augmentation for Data-Efficient GAN Training.
Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han.
NeurIPS 2020. [PDF] [Project] Cited:528
ICR-GAN
Improved consistency regularization for GANs.
Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang.
AAAI 2021. [PDF] Cited:131
Summary
Motivation: Consistency regularization can leak augmentation artifacts into GAN samples.
Method: 1. (bCR) In addition to CR on real images, bCR also encourages the discriminator to output the same features for a generated image and its augmentation. 2. (zCR) zCR encourages the discriminator to be insensitive to generated images with perturbed latent codes, while encouraging the generator to be sensitive to those perturbations (see the sketch below).
Experiment: The image augmentations are the same as in CR-GAN; the latent-vector augmentation is Gaussian noise.
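A hedged PyTorch sketch of the two terms; the loss weights and noise scale are placeholders, not the paper's values:

```python
import torch
import torch.nn.functional as F

def bcr_loss(D, real, fake, augment, lam_real=10.0, lam_fake=10.0):
    # Balanced CR: consistency of D's outputs under augmentation of BOTH
    # real and generated images (weights are placeholders).
    return (lam_real * F.mse_loss(D(augment(real)), D(real))
            + lam_fake * F.mse_loss(D(augment(fake)), D(fake)))

def zcr_losses(D, G, z, sigma=0.05, lam_d=5.0, lam_g=0.5):
    # Latent CR: D should be insensitive to small latent perturbations,
    # while G is pushed to stay sensitive to them (sigma/weights assumed).
    x, x_pert = G(z), G(z + sigma * torch.randn_like(z))
    d_loss = lam_d * F.mse_loss(D(x_pert), D(x))
    g_loss = -lam_g * F.mse_loss(x_pert, x)  # maximize image difference
    return d_loss, g_loss
```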
StyleGAN-ADA
Training Generative Adversarial Networks with Limited Data.
Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila.
NeurIPS 2020. [PDF] [Tensorflow] [Pytorch] Cited:1568
Gradient Normalization for Generative Adversarial Networks.
Yi-Lun Wu, Hong-Han Shuai, Zhi-Rui Tam, Hong-Yu Chiu.
ICCV 2021. [PDF] Cited:52
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data.
Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy.
NeurIPS 2021. [PDF] Cited:80
Inception-Score/IS
Improved Techniques for Training GANs
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen.
NeurIPS 2016. [PDF] Cited:8114
FID, TTUR
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter.
NeurIPS 2017. [PDF] Cited:433
SWD
Sliced Wasserstein Generative Models
Jiqing Wu, Zhiwu Huang, Dinesh Acharya, Wen Li, Janine Thoma, Danda Pani Paudel, Luc Van Gool.
CVPR 2019. [PDF] Cited:0
FastGAN
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis
Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal.
ICLR 2021. [PDF] Cited:192
ProjectedGAN
Projected GANs Converge Faster
Axel Sauer, Kashyap Chitta, Jens Müller, Andreas Geiger
NeurIPS 2021. [PDF] [Project] [Pytorch] Cited:188
Transferring GANs: generating images from limited data.
Yaxing Wang, Chenshen Wu, Luis Herranz, Joost van de Weijer, Abel Gonzalez-Garcia, Bogdan Raducanu.
ECCV 2018. [PDF] Cited:257
Image Generation From Small Datasets via Batch Statistics Adaptation.
Atsuhiro Noguchi, Tatsuya Harada.
ICCV 2019 [PDF] Cited:183
Freeze Discriminator: A Simple Baseline for Fine-tuning GANs.
Sangwoo Mo, Minsu Cho, Jinwoo Shin.
CVPRW 2020 [PDF] [Pytorch] Cited:192
Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains.
Justin N. M. Pinkney, Doron Adler
NeurIPS Workshop 2020. [PDF] Cited:125
Few-shot image generation with elastic weight consolidation.
Yijun Li, Richard Zhang, Jingwan Lu, Eli Shechtman
NeurIPS 2020. [PDF] Cited:150
Minegan: effective knowledge transfer from gans to target domains with few images.
Yaxing Wang, Abel Gonzalez-Garcia, David Berga, Luis Herranz, Fahad Shahbaz Khan, Joost van de Weijer
CVPR 2020. [PDF] Cited:169
One-Shot Domain Adaptation For Face Generation
Chao Yang, Ser-Nam Lim
CVPR 2020. [PDF] Cited:35
Unsupervised image-to-image translation via pre-trained StyleGAN2 network
Jialu Huang, Jing Liao, Sam Kwong
TMM 2021. [PDF] Cited:58
Few-shot Adaptation of Generative Adversarial Networks
Esther Robb, Wen-Sheng Chu, Abhishek Kumar, Jia-Bin Huang.
arxiv 2020 [PDF] Cited:85
AgileGAN: stylizing portraits by inversion-consistent transfer learning.
Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chunpong Lai, Chuanxia Zheng, Tat-Jen Cham
TOG/SIGGRAPH 2021. [PDF] [Project]
Few-shot Image Generation via Cross-domain Correspondence
Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zhang.
CVPR 2021. [PDF] Cited:214
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, Daniel Cohen-Or.
arxiv 2021 [PDF] [Project] Cited:161
Stylealign: Analysis and Applications of Aligned StyleGAN Models
Zongze Wu, Yotam Nitzan, Eli Shechtman, Dani Lischinski
ICLR 2022. [PDF] Cited:47
One-Shot Generative Domain Adaptation
Ceyuan Yang, Yujun Shen*, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou
arXiv 2021. [PDF] Cited:41
Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks
Peihao Zhu, Rameen Abdal, John Femiani, Peter Wonka
ICLR 2022. [PDF] Cited:76
Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment
Jiayu Xiao, Liang Li, Chaofei Wang, Zheng-Jun Zha, Qingming Huang
CVPR 2022. [PDF] Cited:55
JoJoGAN: One Shot Face Stylization
Min Jin Chong, David Forsyth
arxiv 2022. [PDF] Cited:59
When, Why, and Which Pretrained GANs Are Useful?
Timofey Grigoryev, Andrey Voynov, Artem Babenko
ICLR 2022. [PDF]
CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning
Yue Wang, Ran Yi, Ying Tai, Chengjie Wang, and Lizhuang Ma
arxiv 2022. [PDF] Cited:12
One-Shot Adaptation of GAN in Just One CLIP
Gihyun Kwon, Jong Chul Ye
arxiv 2022. [PDF] Cited:33
A Closer Look at Few-shot Image Generation
Yunqing Zhao, Henghui Ding, Houjing Huang, Ngai-Man Cheung
CVPR 2022. [PDF] Cited:54
Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
arxiv 2022. [PDF] Cited:27
Domain Expansion of Image Generators
Yotam Nitzan, Michaël Gharbi, Richard Zhang, Taesung Park, Jun-Yan Zhu, Daniel Cohen-Or, Eli Shechtman
arxiv 2023. [PDF] Cited:12
Variational Inference with Normalizing Flows
Danilo Jimenez Rezende, Shakir Mohamed
ICML 2015. [PDF] Cited:3619
Improved Variational Inference with Inverse Autoregressive Flow
Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling
NeurIPS 2016. [PDF] Cited:1681
NVAE: A Deep Hierarchical Variational Autoencoder
Arash Vahdat, Jan Kautz
NeurIPS 2020. [PDF] Cited:735
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
Anh Nguyen, Jeff Clune, Yoshua Bengio, Alexey Dosovitskiy, Jason Yosinski
CVPR 2017. [PDF] Cited:622
GLO
Optimizing the Latent Space of Generative Networks
Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam
ICML 2018. [PDF] Cited:393
Non-Adversarial Image Synthesis with Generative Latent Nearest Neighbors
Yedid Hoshen, Jitendra Malik
CVPR 2019. [PDF] Cited:55
Sampling generative networks: Notes on a few effective techniques.
Tom White.
arxiv 2016 [PDF] Cited:71
Latent space oddity: on the curvature of deep generative models
Georgios Arvanitidis, Lars Kai Hansen, Søren Hauberg.
ICLR 2018. [PDF] Cited:233
Feature-Based Metrics for Exploring the Latent Space of Generative Models
Samuli Laine.
ICLR 2018 Workshop. [PDF]
VQ-VAE
Neural Discrete Representation Learning
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
NeurIPS 2017. [PDF] Cited:3523
VQ-VAE-2
Generating Diverse High-Fidelity Images with VQ-VAE-2
Ali Razavi, Aaron van den Oord, Oriol Vinyals
NeurIPS 2019. [PDF] Cited:1392
VQGAN
Taming Transformers for High-Resolution Image Synthesis
Patrick Esser, Robin Rombach, Björn Ommer.
CVPR 2021. [PDF] [Project] Cited:1969
DALLE
Zero-Shot Text-to-Image Generation
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever.
ICML 2021. [PDF] Cited:3626
The Image Local Autoregressive Transformer
Chenjie Cao, Yuxin Hong, Xiang Li, Chengrong Wang, Chengming Xu, XiangYang Xue, Yanwei Fu
NeurIPS 2021. [PDF] Cited:12
MaskGIT
MaskGIT: Masked Generative Image Transformer
Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman
arxiv 2022. [PDF] Cited:357
VQGAN-CLIP
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff
arxiv 2021. [PDF] Cited:304
ASSET
Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Zhang, Taesung Park, Evangelos Kalogerakis
SIGGRAPH 2022. [Pytorch]
CLIP-GEN
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao Wang, Wei Liu, Qian He, Xinglong Wu, Zili Yi
arxiv 2022. [PDF] Cited:59
PUT
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu
CVPR 2022. [Pytorch]
High-Quality Pluralistic Image Completion via Code Shared VQGAN
Chuanxia Zheng, Guoxian Song, Tat-Jen Cham, Jianfei Cai, Dinh Phung, Linjie Luo
arxiv 2022. [PDF] Cited:8
L-Verse: Bidirectional Generation Between Image and Text
Taehoon Kim, Gwangmo Song, Sihaeng Lee, Sangyun Kim, Yewon Seo, Soonyoung Lee, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae
CVPR 2022. [PDF] Cited:21
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes.
Sam Bond-Taylor, Peter Hessey, Hiroshi Sasaki, Toby P. Breckon, Chris G. Willcocks
arxiv 2021. [PDF] Cited:54
MaskGIT
MaskGIT: Masked Generative Image Transformer
Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman
arxiv 2022. [PDF] Cited:357
Imagebart: Bidirectional context with multinomial diffusion for autoregressive image synthesis.
Patrick Esser, Robin Rombach, Andreas Blattmann, Björn Ommer
NeurIPS 2021. [PDF] Cited:128
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo
CVPR 2022. [PDF] Cited:552
Improved Vector Quantized Diffusion Models
Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
arxiv 2022. [PDF] Cited:50
Text2Human
Text2Human: Text-Driven Controllable Human Image Generation
Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
SIGGRAPH 2022. [PDF] Cited:30
RQ-VAE
Autoregressive image generation using residual quantization
Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
CVPR 2022. [PDF] Cited:149
Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
iGAN
Generative Visual Manipulation on the Natural Image Manifold
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros.
ECCV 2016. [PDF] [github] Cited:1341
IcGAN
Invertible Conditional GANs for image editing
Guim Perarnau, Joost van de Weijer, Bogdan Raducanu, Jose M. Álvarez
NIPS 2016 Workshop. [PDF] Cited:626
Neural photo editing with introspective adversarial networks
Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston.
ICLR 2017. [PDF] Cited:442
Inverting The Generator of A Generative Adversarial Network.
Antonia Creswell, Anil Anthony Bharath.
NeurIPS 2016 Workshop. [PDF] Cited:309
GAN Paint
Semantic Photo Manipulation with a Generative Image Prior
David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba.
SIGGRAPH 2019. [PDF] Cited:321
GANSeeing
Seeing What a GAN Cannot Generate.
David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba.
ICCV 2019. [PDF] Cited:275
summary
Summary: To see what a GAN cannot generate (the mode-collapse problem), this paper first inspects the distribution of semantic classes in generated images compared with ground-truth images. Second, by inverting images, failure cases can be observed directly on individual instances.
Class distribution-level mode collapse: StyleGAN outperforms WGAN-GP.
Instance-level mode collapse with GAN inversion: (1) Use intermediate features instead of the initial latent code as the optimization target (see the sketch below). (2) Propose layer-wise inversion to learn an encoder for inversion; note that this inversion outputs a z code. (3) Use a restriction on the z code to regularize the inversion of intermediate features.
Experiment: (1) Direct optimization on z does not work. (2) Encoder + optimization works better. (3) Layer-wise inversion is clearly better.
Limitation: Layer-wise inversion is not performed on StyleGAN.
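A hypothetical PyTorch sketch of feature-level inversion in the spirit of point (1); `g_head`/`g_tail` denote an assumed split of the generator and are not the paper's API, and plain MSE stands in for the full objective:

```python
import torch

def invert_intermediate(g_head, g_tail, target, steps=500, lr=0.05, z_dim=512):
    # Optimize an intermediate feature map r (initialized from a random z)
    # so that the remaining layers g_tail reconstruct the target image.
    z = torch.randn(target.shape[0], z_dim, device=target.device)
    with torch.no_grad():
        r = g_head(z)
    r = r.clone().requires_grad_(True)
    opt = torch.optim.Adam([r], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((g_tail(r) - target) ** 2)
        loss.backward()
        opt.step()
    return r
```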
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
Rameen Abdal, Yipeng Qin, Peter Wonka.
ICCV 2019. [PDF] Cited:986
Image2StyleGAN++: How to Edit the Embedded Images?
Rameen Abdal, Yipeng Qin, Peter Wonka.
CVPR 2020. [PDF] Cited:502
IDInvert
In-Domain GAN Inversion for Real Image Editing
Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou.
ECCV 2020. [PDF] Cited:592
summary
Motivation: Traditional GAN inversion methods train the encoder in the latent space by minimizing |E(G(z)) - z|. However, the gradient to the encoder is agnostic to the semantic distribution of the generator's latent space (for example, latent codes far from the mean vector are less editable). This paper first trains a domain-guided encoder, then proposes domain-regularized optimization, using the encoder as a regularizer to fine-tune the code it produces and better recover the target image (see the sketch below).
Method: (1) Objective for training the encoder: MSE and perceptual loss on the reconstructed real image, plus an adversarial loss. (2) Objective for refining the embedded code: perceptual loss and MSE on the reconstructed image, plus the distance to the code produced by the encoder as regularization.
Experiment: (1) Semantic analysis of inverted codes: train attribute boundaries on the inverted codes with InterFaceGAN; compared with Image2StyleGAN, the precision-recall curve is better. (2) Inversion quality: compared by FID, SWD, MSE, and visual quality. (3) Applications: image interpolation, semantic manipulation, semantic diffusion (inversion of a composed image, then optimization against only the foreground image), and style mixing. (4) Ablation study: a larger weight on the encoder term biases the optimization toward the domain constraint, making the inverted codes more semantically meaningful; the cost is that the target image cannot be recovered exactly in per-pixel values.
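A hedged PyTorch sketch of the domain-regularized refinement step; `E`, `G`, `perceptual`, and all loss weights are placeholders, not the paper's values:

```python
import torch

def domain_regularized_refine(G, E, perceptual, x, steps=100, lr=0.01,
                              lam_percep=5e-5, lam_dom=2.0):
    # Start from the domain-guided encoder's code, then refine it while
    # keeping it close to what the encoder predicts for the reconstruction.
    w = E(x).detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_rec = G(w)
        loss = (torch.mean((x_rec - x) ** 2)                  # pixel term
                + lam_percep * perceptual(x_rec, x)           # perceptual term
                + lam_dom * torch.mean((w - E(x_rec)) ** 2))  # domain regularizer
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```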
Editing in Style: Uncovering the Local Semantics of GANs
Edo Collins, Raja Bala, Bob Price, Sabine Süsstrunk.
CVPR 2020. [PDF] [Pytorch] Cited:258
summary
StyleGAN's style code controls the global style of an image, so how can local manipulations be made from the style code? Recall that the style code modulates the variance of intermediate feature maps, and different channels control different local semantic elements such as the nose and eyes. We can therefore identify the channels most correlated with a region of interest and, for local manipulation, replace the source image's style-code values in those channels with the corresponding target values (see the sketch below).
Details: The correspondence between a region of interest and channels is measured by feature-map magnitude within each cluster, where the clusters come from spherical k-means on the features of the 32x32 layer.
Limitation: This paper performs local semantic swaps; interpolation is not available.
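A minimal sketch of the style-channel swap, assuming per-channel relevance scores for the region of interest have already been computed from the cluster magnitudes:

```python
import torch

def swap_region_channels(style_src, style_tgt, channel_scores, k=5):
    # Copy the k style channels most correlated with the region of interest
    # (e.g. the "eyes" cluster) from the target code into the source code.
    edited = style_src.clone()
    top = torch.topk(channel_scores, k).indices
    edited[..., top] = style_tgt[..., top]
    return edited
```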
Improving Inversion and Generation Diversity in StyleGAN using a Gaussianized Latent Space
Jonas Wulff, Antonio Torralba
arxiv 2020. [PDF] Cited:43
Improved StyleGAN Embedding: Where are the Good Latents?
Peihao Zhu, Rameen Abdal, Yipeng Qin, John Femiani, Peter Wonka
arxiv 2020. [PDF] Cited:104
pix2latent
Transforming and Projecting Images into Class-conditional Generative Networks
Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann
ECCV 2020. [PDF] Cited:103
pSp,pixel2style2pixel
Encoding in style: a stylegan encoder for image-to-image translation.
CVPR 2021. [PDF] [Pytorch] Cited:962
e4e, encode for editing
Designing an encoder for StyleGAN image manipulation.
Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or.
SIGGRAPH 2021. [PDF] Cited:653
ReStyle
Restyle: A residual-based stylegan encoder via iterative refinement.
Yuval Alaluf, Or Patashnik, Daniel Cohen-Or.
ICCV 2021. [PDF] [Project] Cited:310
Collaborative Learning for Faster StyleGAN Embedding.
Shanyan Guan, Ying Tai, Bingbing Ni, Feida Zhu, Feiyue Huang, Xiaokang Yang.
arxiv 2020. [PDF] Cited:98
Summary
1. Motivation: Traditional methods use either optimization-based or learning-based approaches to obtain the embedded latent code. Optimization-based methods suffer from a large time cost and are sensitive to initialization, while learning-based methods yield relatively worse image quality due to the lack of direct supervision on the latent code.
2. This paper introduces a collaborative training process consisting of a learnable embedding network and an optimization-based iterator. For each training batch, the embedding network first encodes the images as the iterator's initialization code; the iterator then runs 100 update steps optimizing the MSE and LPIPS losses between the generated and target images; finally, the updated embedding code is used as the target signal to train the embedding network with a latent-code distance plus image-level and feature-level losses (see the sketch below).
3. The embedding network consists of a pretrained ArcFace model as the identity encoder and an attribute encoder built from ResBlocks; the identity and attribute features are combined via linear modulation (the denormalization of SPADE). A TreeConnect layer (a sparse alternative to a fully connected layer) then outputs the final embedded code.
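A hedged PyTorch sketch of one collaborative training step from point 2; the module names, the frozen generator, and the omitted loss weights are assumptions:

```python
import torch

def collaborative_step(E, G, mse, lpips, x, enc_opt, inner_steps=100, inner_lr=0.01):
    # One batch: the embedding network E proposes an initial code, the
    # optimization-based iterator refines it against image losses, and the
    # refined code then supervises E directly (G is assumed frozen).
    w0 = E(x)
    w = w0.detach().clone().requires_grad_(True)
    inner_opt = torch.optim.Adam([w], lr=inner_lr)
    for _ in range(inner_steps):                 # the iterator
        rec = G(w)
        loss = mse(rec, x) + lpips(rec, x)
        inner_opt.zero_grad()
        loss.backward()
        inner_opt.step()
    rec0 = G(w0)                                 # train the embedding network
    e_loss = mse(w0, w.detach()) + mse(rec0, x) + lpips(rec0, x)
    enc_opt.zero_grad()
    e_loss.backward()
    enc_opt.step()
    return e_loss.item()
```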
Pivotal Tuning for Latent-based Editing of Real Images
Daniel Roich, Ron Mokady, Amit H. Bermano, Daniel Cohen-Or.
arxiv 2021. [PDF] Cited:440
HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing.
Yuval Alaluf, Omer Tov, Ron Mokady, Rinon Gal, Amit H. Bermano.
CVPR 2022 [PDF] [Project] Cited:213
High-Fidelity GAN Inversion for Image Attribute Editing
Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen.
CVPR 2022. [PDF] Cited:211
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks
David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba.
ICLR 2019. [PDF] [Project] Cited:0
On the "steerability" of generative adversarial networks.
Ali Jahanian, Lucy Chai, Phillip Isola.
ICLR 2020. [PDF] [Project] [Pytorch] Cited:370
Controlling generative models with continuous factors of variations.
Antoine Plumerault, Hervé Le Borgne, Céline Hudelot.
ICLR 2020. [PDF] Cited:114
InterFaceGAN
Interpreting the Latent Space of GANs for Semantic Face Editing
Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou.
CVPR 2020. [PDF] [Project] Cited:1007
Enjoy your editing: Controllable gans for image editing via latent space navigation
Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing
ICLR 2021. [PDF] Cited:67
Only a matter of style: Age transformation using a style-based regression model.
Yuval Alaluf, Or Patashnik, Daniel Cohen-Or
SIGGRAPH 2021. [PDF] Cited:116
Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes.
Huiting Yang, Liangyu Chai, Qiang Wen, Shuang Zhao, Zixun Sun, Shengfeng He.
CVPR 2021. [PDF]
StyleSpace
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
Zongze Wu, Dani Lischinski, Eli Shechtman.
CVPR 2021. [PDF] Cited:426
StyleFlow
StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows
Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka.
SIGGRAPH 2021. [PDF] Cited:462
A Latent Transformer for Disentangled Face Editing in Images and Videos.
Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier.
ICCV 2021. [PDF] [ArXiV] [Github] Cited:70
Controllable and Compositional Generation with Latent-Space Energy-Based Models.
Weili Nie, Arash Vahdat, Anima Anandkumar.
NeurIPS 2021. [PDF] Cited:65
EditGAN
EditGAN: High-Precision Semantic Image Editing
Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler.
NeurIPS 2021. [PDF] Cited:172
StyleFusion
StyleFusion: A Generative Model for Disentangling Spatial Segments
Omer Kafri, Or Patashnik, Yuval Alaluf, Daniel Cohen-Or
arxiv 2021. [PDF] Cited:34
```mermaid
flowchart TD
    root(Unsupervised GAN Manipulation) --> A(Mutual Information)
    root --> B[Generator Parameter]
    root --> C[Training Regularization]
    A --> E[Unsupervised Discovery. Voynov. ICML 2020]
    InfoGAN -- on pretrained network --> E
    E -- RBF path --> Warped[WarpedGANSpace. Tzelepis. ICCV 2021]
    E -- parameter space --> NaviGAN[NaviGAN. Cherepkov. CVPR 2021]
    E -- contrastive loss --> DisCo[DisCo. Ren. ICLR 2022]
    B -- PCA on intermediate/W space --> GANSpace[GANSpace. Härkönen. NIPS 2020]
    GANSpace -- closed-form factorization of weights --> SeFa[SeFa. Shen. CVPR 2021]
    GANSpace -- spatial transformation \n on intermediate features --> GANS[GAN Steerability. Eliezer. ICLR 2021]
    SeFa -- variation of intermediate features --> VisualConcept[Visual Concept Vocabulary. Schwettmann. ICCV 2021]
```
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space.
Andrey Voynov, Artem Babenko.
ICML 2020. [PDF] Cited:367
GANSpace
GANSpace: Discovering Interpretable GAN Controls
Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen, Sylvain Paris.
NeurIPS 2020 [PDF] [Pytorch] Cited:805
The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement
William Peebles, John Peebles, Jun-Yan Zhu, Alexei Efros, Antonio Torralba
ECCV 2020 [PDF] [Project] Cited:107
The Geometry of Deep Generative Image Models and its Applications
Binxu Wang, Carlos R. Ponce.
ICLR 2021. [PDF] Cited:39
GAN Steerability without optimization.
Nurit Spingarn-Eliezer, Ron Banner, Tomer Michaeli
ICLR 2021. [PDF] Cited:53
SeFa
Closed-Form Factorization of Latent Semantics in GANs
Yujun Shen, Bolei Zhou.
CVPR 2021 [PDF] [Project] Cited:526
NaviGAN
Navigating the GAN Parameter Space for Semantic Image Editing
Anton Cherepkov, Andrey Voynov, Artem Babenko.
CVPR 2021 [PDF] [Pytorch] Cited:60
EigenGAN: Layer-Wise Eigen-Learning for GANs.
Zhenliang He, Meina Kan, Shiguang Shan.
ICCV 2021. [PDF] [Github] Cited:43
Toward a Visual Concept Vocabulary for GAN Latent Space.
Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba.
ICCV 2021. [PDF] [Project]
WarpedGANSpace: Finding Non-linear RBF Paths in GAN Latent Space.
Christos Tzelepis, Georgios Tzimiropoulos, Ioannis Patras.
ICCV 2021. [PDF] [Github] Cited:53
OroJaR: Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation.
Yuxiang Wei, Yupeng Shi, Xiao Liu, Zhilong Ji, Yuan Gao, Zhongqin Wu, Wangmeng Zuo.
ICCV 2021. [PDF] [Github] Cited:45
Optimizing Latent Space Directions For GAN-based Local Image Editing.
Ehsan Pajouheshgar, Tong Zhang, Sabine Süsstrunk.
arxiv 2021. [PDF] [Pytorch] Cited:11
Discovering Density-Preserving Latent Space Walks in GANs for Semantic Image Transformations.
Guanyue Li, Yi Liu, Xiwen Wei, Yang Zhang, Si Wu, Yong Xu, Hau San Wong.
ACM MM 2021. [PDF]
Disentangled Representations from Non-Disentangled Models
Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko
arxiv 2021. [PDF] Cited:14
Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs.
Jaewoong Choi, Changyeon Yoon, Junho Lee, Jung Ho Park, Geonho Hwang, Myungjoo Kang.
ICLR 2022. [PDF] Cited:23
Disco
Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View
Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng
ICLR 2022. [PDF] Cited:26
Rayleigh EigenDirections (REDs): GAN latent space traversals for multidimensional features.
Guha Balakrishnan, Raghudeep Gadde, Aleix Martinez, Pietro Perona.
arxiv 2022. [PDF]
Low-Rank Subspaces in GANs
Jiapeng Zhu, Ruili Feng, Yujun Shen, Deli Zhao, Zhengjun Zha, Jingren Zhou, Qifeng Chen
NeurIPS 2021. [PDF] Cited:59
Region-Based Semantic Factorization in GANs
Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen.
arxiv 2022. [PDF] Cited:24
Fantastic Style Channels and Where to Find Them: A Submodular Framework for Discovering Diverse Directions in GANs
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, Dani Lischinski
ICCV 2021. [PDF] [Pytorch] Cited:1007
TargetCLIP
Image-Based CLIP-Guided Essence Transfer
Hila Chefer, Sagie Benaim, Roni Paiss, Lior Wolf
arxiv 2021. [PDF] Cited:45
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders.
Kevin Frans, L.B. Soros, Olaf Witkowski.
arxiv 2021. [PDF] Cited:160
CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions.
Omer Kafri, Or Patashnik, Yuval Alaluf, and Daniel Cohen-Or
arxiv 2021. [PDF] Cited:91
FEAT: Face Editing with Attention
Xianxu Hou, Linlin Shen, Or Patashnik, Daniel Cohen-Or, Hui Huang
arxiv 2021. [PDF] Cited:16
StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation
Peter Schaldenbrand, Zhixuan Liu, Jean Oh
NeurIPS 2021 Workshop. [PDF]
CLIPstyler: Image Style Transfer with a Single Text Condition
Gihyun Kwon, Jong Chul Ye
CVPR 2022. [PDF] Cited:190
HairCLIP: Design Your Hair by Text and Reference Image
Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Zhentao Tan, Lu Yuan, Weiming Zhang, Nenghai Yu
CVPR 2022. [PDF] Cited:81
CLIPasso: Semantically-Aware Object Sketching
Yael Vinker, Ehsan Pajouheshgar, Jessica Y. Bo, Roman Christian Bachmann, Amit Haim Bermano, Daniel Cohen-Or, Amir Zamir, Ariel Shamir
arxiv 2022. [PDF] Cited:42
A good image generator is what you need for high-resolution video synthesis
Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov.
ICLR 2021. [PDF] Cited:163
Latent Image Animator: Learning to animate image via latent space navigation.
Yaohui Wang, Di Yang, Francois Bremond, Antitza Dantcheva.
ICLR 2022. [PDF]
pix2pix
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros.
CVPR 2017. [PDF] Cited:17644
CRN
Photographic Image Synthesis with Cascaded Refinement Networks
Qifeng Chen, Vladlen Koltun.
ICCV 2017. [PDF] Cited:905
pix2pixHD
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro.
CVPR 2018. [PDF] Cited:3580
SPADE
Semantic Image Synthesis with Spatially-Adaptive Normalization
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu.
CVPR 2019. [PDF] Cited:2379
SEAN
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka.
CVPR 2020. [PDF] Cited:407
You Only Need Adversarial Supervision for Semantic Image Synthesis
Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva.
ICLR 2021. [PDF] Cited:156
Diverse Semantic Image Synthesis via Probability Distribution Modeling
Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu.
CVPR 2021. [PDF] Cited:58
Efficient Semantic Image Synthesis via Class-Adaptive Normalization
Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu.
TPAMI 2021. [PDF]
Spatially-adaptive pixelwise networks for fast image translation.
Tamar Rott Shaham, Michael Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli
CVPR 2021. [PDF] Cited:63
High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network
Jie Liang, Hui Zeng, Lei Zhang.
CVPR 2021. [PDF] Cited:80
Context encoders: Feature learning by inpainting.
Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, Alexei A. Efros
CVPR 2016. [PDF] Cited:4929
Globally and locally consistent image completion.
Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa
SIGGRAPH 2017. [PDF]
Semantic image inpainting with deep generative models.
Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do
CVPR 2017. [PDF] Cited:1130
High-resolution image inpainting using multiscale neural patch synthesis
Chao Yang, Xin Lu, Zhe Lin, Eli Shechtman, Oliver Wang, Hao Li
CVPR 2017. [PDF] Cited:747
Spg-net: Segmentation prediction and guidance network for image inpainting.
Yuhang Song, Chao Yang, Yeji Shen, Peng Wang, Qin Huang, C.-C. Jay Kuo
BMVC 2018. [PDF] Cited:165
Generative image inpainting with contextual attention
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang
CVPR 2018. [PDF] Cited:2069
Free-form image inpainting with gated convolution.
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas Huang
ICCV 2019. [PDF] Cited:1519
Edgeconnect: Generative image inpainting with adversarial edge learning.
Kamyar Nazeri, Eric Ng, Tony Joseph, Faisal Z. Qureshi, Mehran Ebrahimi
ICCV 2019. [PDF] Cited:630
Pluralistic Image Completion
Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
CVPR 2019. [PDF] Cited:418
Rethinking image inpainting via a mutual encoder-decoder with feature equalizations.
Hongyu Liu, Bin Jiang, Yibing Song, Wei Huang, Chao Yang
ECCV 2020. [PDF] Cited:242
High-Fidelity Pluralistic Image Completion with Transformers
Ziyu Wan, Jingbo Zhang, Dongdong Chen, Jing Liao
ICCV 2021. [PDF] Cited:180
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu
CVPR 2022. [PDF] Cited:55
Deep Identity-Aware Transfer of Facial Attributes
Mu Li, Wangmeng Zuo, David Zhang
arxiv 2016. [PDF] Cited:145
Sketch Your Own GAN
Sheng-Yu Wang, David Bau, Jun-Yan Zhu
ICCV 2021. [PDF] Cited:63
High-Resolution Daytime Translation Without Domain Labels
I. Anokhin, P. Solovev, D. Korzhenkov, A. Kharlamov, T. Khakhulin, A. Silvestrov, S. Nikolenko, V. Lempitsky, and G. Sterkin.
CVPR 2020. [PDF] Cited:65
Information Bottleneck Disentanglement for Identity Swapping
Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He
CVPR 2021. [PDF]
Swapping Autoencoder for Deep Image Manipulation
Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei A. Efros, Richard Zhang
NeurIPS 2020. [PDF] Cited:297
L2M-GAN: Learning to Manipulate Latent Space Semantics for Facial Attribute Editing
Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang
CVPR 2021. [PDF]
Coupled Generative Adversarial Networks
Ming-Yu Liu, Oncel Tuzel.
NeurIPS 2016 [PDF]
UNIT
Unsupervised Image-to-Image Translation Networks.
Ming-Yu Liu, Thomas Breuel, Jan Kautz
NeurIPS 2017. [PDF] Cited:2575
CycleGAN
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros.
ICCV 2017. [PDF] Cited:5530
DiscoGAN
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim.
ICML 2017. [PDF] Cited:1898
DualGAN
DualGAN: Unsupervised Dual Learning for Image-to-Image Translation
Zili Yi, Hao Zhang, Ping Tan, Minglun Gong.
ICCV 2017. [PDF] Cited:1858
BicycleGAN
Toward Multimodal Image-to-Image Translation
Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, Eli Shechtman.
NeurIPS 2017. [PDF] Cited:1276
MUNIT
Multimodal Unsupervised Image-to-Image Translation
Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz.
ECCV 2018. [PDF] Cited:2288
DRIT
Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee, Hung-Yu Tseng, Jia-Bin Huang, Maneesh Kumar Singh, Ming-Hsuan Yang.
ECCV 2018. [PDF] Cited:825
Augmented cyclegan: Learning many-to-many mappings from unpaired data.
Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville.
ICML 2018. [PDF] Cited:400
MISO: Mutual Information Loss with Stochastic Style Representations for Multimodal Image-to-Image Translation.
Sanghyeon Na, Seungjoo Yoo, Jaegul Choo.
BMVC 2020. [PDF] Cited:16
MSGAN
Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis
Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Siwei Ma, Ming-Hsuan Yang.
CVPR 2019. [PDF] Cited:371
U-GAT-IT
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
Junho Kim, Minjae Kim, Hyeonwoo Kang, Kwanghee Lee
ICLR 2020. [PDF] Cited:489
UVC-GAN
UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation
Dmitrii Torbunov, Yi Huang, Haiwang Yu, Jin Huang, Shinjae Yoo, Meifeng Lin, Brett Viren, Yihui Ren
arxiv 2022. [PDF] Cited:46
DistanceGAN
One-Sided Unsupervised Domain Mapping
Sagie Benaim, Lior Wolf
NIPS 2017. [PDF] Cited:279
Council-GAN
Breaking the Cycle - Colleagues are all you need
Ori Nizan , Ayellet Tal
CVPR 2020. [PDF]
ACL-GAN
Unpaired Image-to-Image Translation using Adversarial Consistency Loss
Yihao Zhao, Ruihai Wu, Hao Dong.
ECCV 2020. [PDF] Cited:97
CUT
Contrastive Learning for Unpaired Image-to-Image Translation
Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu.
ECCV 2020. [PDF] Cited:959
The spatially-correlative loss for various image translation tasks
Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai.
CVPR 2021. [PDF]
Unsupervised Image-to-Image Translation with Generative Prior
Shuai Yang, Liming Jiang, Ziwei Liu and Chen Change Loy.
CVPR 2022. [PDF] Cited:27
StarGAN
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, Jaegul Choo
CVPR 2018. [PDF] Cited:3277
DRIT++
DRIT++: Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee, Hung-Yu Tseng, Qi Mao, Jia-Bin Huang, Yu-Ding Lu, Maneesh Singh, Ming-Hsuan Yang.
IJCV 2019. [PDF] Cited:761
StarGANv2
StarGAN v2: Diverse Image Synthesis for Multiple Domains
Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha
CVPR 2020. [PDF] Cited:1464
Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation
Yahui Liu, Enver Sangineto, Yajing Chen, Linchao Bao, Haoxian Zhang, Nicu Sebe, Bruno Lepri, Wei Wang, Marco De Nadai
CVPR 2021. [PDF] Cited:38
A Style-aware Discriminator for Controllable Image Translation
Kunhee Kim, Sanghun Park, Eunyeong Jeon, Taehun Kim, Daijin Kim
CVPR 2022. [PDF] Cited:20
DualStyleGAN
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
Shuai Yang, Liming Jiang, Ziwei Liu and Chen Change Loy
CVPR 2022. [Pytorch]
Unsupervised Cross-Domain Image Generation
Yaniv Taigman, Adam Polyak, Lior Wolf
ICLR 2017. [PDF] Cited:964
FUNIT
Few-shot unsupervised image-to-image translation.
Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz
ICCV 2019. [PDF] Cited:556
COCO-FUNIT: Few-shot unsupervised image translation with a content conditioned style encoder.
Kuniaki Saito, Kate Saenko, Ming-Yu Liu
ECCV 2020. [PDF] Cited:77
Attribute Group Editing for Reliable Few-shot Image Generation.
Guanqi Ding, Xinzhe Han, Shuhui Wang, Shuzhe Wu, Xin Jin, Dandan Tu, Qingming Huang
CVPR 2022. [PDF] Cited:16
WCT
Universal Style Transfer via Feature Transforms
Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang.
NeurIPS 2017. [PDF] Cited:873
Style transfer by relaxed optimal transport and self-similarity.
Nicholas Kolkin, Jason Salavon, Greg Shakhnarovich.
CVPR 2019. [PDF] Cited:246
A Closed-Form Solution to Universal Style Transfer
Ming Lu, Hao Zhao, Anbang Yao, Yurong Chen, Feng Xu, Li Zhang
ICCV 2019. [PDF] Cited:70
Neural Neighbor Style Transfer
Nicholas Kolkin, Michal Kucera, Sylvain Paris, Daniel Sykora, Eli Shechtman, Greg Shakhnarovich
arxiv 2022. [PDF] Cited:21
GANgealing
GAN-Supervised Dense Visual Alignment
William Peebles, Jun-Yan Zhu, Richard Zhang, Antonio Torralba, Alexei Efros, Eli Shechtman.
arxiv 2021. [PDF] Cited:59
Generating images from captions with attention.
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov.
ICLR 2016. [PDF] Cited:412
Generative Adversarial Text to Image Synthesis
Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee.
ICML 2016. [PDF] Cited:2931
StackGAN
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.
ICCV 2017. [PDF] Cited:2550
StackGAN++
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas
TPAMI 2018. [PDF] Cited:964
MirrorGAN: Learning Text-to-image Generation by Redescription
Tingting Qiao, Jing Zhang, Duanqing Xu, Dacheng Tao
CVPR 2019. [PDF] Cited:491
AttnGAN
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He.
CVPR 2018. [PDF] Cited:1518
DM-GAN
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis
Minfeng Zhu, Pingbo Pan, Wei Chen, Yi Yang
CVPR 2019. [PDF] Cited:505
SD-GAN
Semantics Disentangling for Text-to-Image Generation
Guojun Yin, Bin Liu, Lu Sheng, Nenghai Yu, Xiaogang Wang, Jing Shao
CVPR 2019. [PDF] Cited:166
DF-GAN
A Simple and Effective Baseline for Text-to-Image Synthesis
Ming Tao, Hao Tang, Fei Wu, Xiaoyuan Jing, Bingkun Bao, Changsheng Xu.
CVPR 2022. [PDF] Cited:158
Text to Image Generation with Semantic-Spatial Aware GAN
Kai Hu, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn
CVPR 2022. [PDF] Cited:81
TextFace: Text-to-Style Mapping based Face Generation and Manipulation
Xianxu Hou, Xiaokang Zhang, Yudong Li, Linlin Shen
TMM 2022. [PDF]
FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization
Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu
arxiv 2021. [PDF] Cited:71
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Axel Sauer, Tero Karras, Samuli Laine, Andreas Geiger, Timo Aila
arxiv 2023. [PDF] Cited:141
GigaGAN
Scaling up GANs for Text-to-Image Synthesis
Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park
CVPR 2023. [PDF] Cited:289
DALLE
Zero-Shot Text-to-Image Generation
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever.
ICML 2021. [PDF] Cited:3626
GLIDE
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, Mark Chen
arxiv 2021. [PDF] [Pytorch]
DALLE2
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen
OpenAI 2022. [PDF]
L-Verse: Bidirectional Generation Between Image and Text
Taehoon Kim, Gwangmo Song, Sihaeng Lee, Sangyun Kim, Yewon Seo, Soonyoung Lee, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae
CVPR 2022. [PDF] Cited:21
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao Wang, Wei Liu, Qian He, Xinglong Wu, Zili Yi
arxiv 2022. [PDF] Cited:59
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan
arxiv 2023. [PDF] Cited:360
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen
arxiv 2022. [PDF] Cited:142
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan
ECCV 2022. [PDF] Cited:246
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu, Jian Liang, Xiaowei Hu, Zhe Gan, Jianfeng Wang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan
NIPS 2022. [PDF] Cited:52
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon
ICLR 2022. [PDF] Cited:904
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami, Dani Lischinski, Ohad Fried
CVPR 2022. [PDF] Cited:648
DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
Gwanghyun Kim, Taesung Kwon, Jong Chul Ye
CVPR 2022. [PDF] Cited:443
Text2LIVE: text-driven layered image and video editing.
Omer Bar-Tal, Dolev Ofri-Amar, Rafail Fridman, Yoni Kasten, Tali Dekel
arxiv 2022. [PDF] Cited:244
Textual Inversion
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
arxiv 2022. [PDF] Cited:1166
DreamBooth
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
arxiv 2022. [PDF] Cited:1706
Prompt-to-Prompt Image Editing with Cross-Attention Control
Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, and Daniel Cohen-Or
ICLR 2023. [PDF]
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani
arxiv 2022. [PDF] Cited:718
UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image
Dani Valevski, Matan Kalman, Yossi Matias, Yaniv Leviathan
arxiv 2022. [PDF] Cited:19
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks, Aleksander Holynski, Alexei A. Efros
arxiv 2022. [PDF] Cited:988
Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
arxiv 2022. [PDF] Cited:62
Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
arxiv 2022. [PDF] Cited:498
Zero-shot Image-to-Image Translation
[Project]
Null-text Inversion for Editing Real Images using Guided Diffusion Models
[PDF] [Project] Cited:476
Imagen video: High definition video generation with diffusion models
Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans
arxiv 2022. [PDF] [Project] Cited:979
Video diffusion models.
Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet
arxiv 2022. [PDF] Cited:888
Make-a-video: Text-to-video generation without text-video data
Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
arxiv 2022. [PDF] Cited:854
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
arxiv 2022. [PDF] Cited:431
DIP
Deep Image Prior
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky.
CVPR 2018 [PDF] [Project] Cited:2732
SinGAN
SinGAN: Learning a Generative Model from a Single Natural Image
Tamar Rott Shaham, Tali Dekel, Tomer Michaeli.
ICCV 2019 Best Paper. [PDF] [Project] Cited:754
TuiGAN
TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images
Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo.
ECCV 2020. [PDF] Cited:54
DeepSIM
Image Shape Manipulation from a Single Augmented Training Sample
Yael Vinker, Eliahu Horwitz, Nir Zabari, Yedid Hoshen.
ICCV 2021. [PDF] [Project] [Pytorch] Cited:16
SemanticGAN
Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
Daiqing Li, Junlin Yang, Karsten Kreis, Antonio Torralba, Sanja Fidler.
CVPR 2021. [PDF] Cited:151
DatasetGAN
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort
Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler.
CVPR 2021. [PDF] Cited:279
SemanticStyleGAN
SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing
Yichun Shi, Xiao Yang, Yangyue Wan, Xiaohui Shen.
arxiv 2021. [PDF] Cited:67
Learning to generate line drawings that convey geometry and semantics
Caroline Chan, Fredo Durand, Phillip Isola.
arxiv 2022. [PDF] Cited:58
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks.
Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune.
NIPS 2016. [PDF] Cited:651
Generating Images with Perceptual Similarity Metrics based on Deep Networks.
Alexey Dosovitskiy, Thomas Brox
NIPS 2016. [PDF] Cited:1082
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Ajay Jain, Amber Xie, Pieter Abbeel
arxiv 2022. [PDF] Cited:56