Neural Style Transfer

Neural style transfer is a deep-learning computer-vision technique that transfers the artistic style of a source image onto a target image while preserving the target image's content, producing new artworks. The technique was pioneered by Leon A. Gatys et al. in the paper A Neural Algorithm of Artistic Style.

The key finding of the paper is that the representations of content and style in a Convolutional Neural Network are well separable: we can manipulate the two representations independently to produce new, perceptually meaningful images. This rests on the fact that CNNs extract features hierarchically. The initial layers extract simple features such as edges and corners (i.e., detailed pixel-level information), while the deeper layers extract more complex features such as shapes and larger regions of the image. Accordingly, the style of an image can be extracted from the earlier layers, while its content can be extracted from the deeper layers. In this context, style essentially means the textures, colors, and visual patterns in the image at various spatial scales, and content is the higher-level macrostructure of the image.
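As a quick illustration of this hierarchy, here is a minimal sketch (not part of this repository's code) that inspects the activations of a shallow and a deep layer of torchvision's pretrained VGG-19 using forward hooks; the layer indices and the random input are placeholders chosen for the example:

```python
import torch
from torchvision import models

# VGG-19 pretrained on ImageNet; keep only the convolutional feature extractor.
# (Newer torchvision versions prefer the `weights=` argument over `pretrained=`.)
vgg = models.vgg19(pretrained=True).features.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Example layer choices (assumptions, not the repository's exact layers):
# index 0 is conv1_1 (edges, colors), index 28 is conv5_1 (larger-scale structure).
vgg[0].register_forward_hook(save_activation("shallow"))
vgg[28].register_forward_hook(save_activation("deep"))

with torch.no_grad():
    vgg(torch.randn(1, 3, 224, 224))  # random tensor standing in for a preprocessed image

print(activations["shallow"].shape)  # torch.Size([1, 64, 224, 224])
print(activations["deep"].shape)     # torch.Size([1, 512, 14, 14])
```

The shallow activations keep the full spatial resolution with few channels, while the deep activations are spatially coarse but channel-rich, which is exactly the behaviour the separation of style and content relies on.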

The key idea behind all deep-learning algorithms is to define a loss function that specifies what you want to achieve and then minimize that loss. In style transfer, the goal is to conserve the content of the original image while adopting the style of the reference image, and we do this by combining two loss functions: the content loss and the style loss.
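Formally, the paper combines the two terms into a single objective, where p, a, and x denote the content image, the style image, and the generated image:

$$\mathcal{L}_{total}(\vec{p},\vec{a},\vec{x}) = \alpha\,\mathcal{L}_{content}(\vec{p},\vec{x}) + \beta\,\mathcal{L}_{style}(\vec{a},\vec{x})$$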

In the equation above, α and β are the weighting factors for content and style reconstruction, respectively.

The content loss is the L2 norm between the activations of a deep layer of a pre-trained convnet computed over the target image and the activations of the same layer computed over the generated image, as given below. Here p and x are the original image and the generated image, P^l and F^l their respective feature representations in layer l, and F^l_ij is the activation of the i-th filter at position j in layer l.
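From the paper:

$$\mathcal{L}_{content}(\vec{p},\vec{x},l) = \frac{1}{2}\sum_{i,j}\left(F^{l}_{ij} - P^{l}_{ij}\right)^{2}$$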

The content loss uses only a single deep layer, but the style loss as defined by Gatys uses multiple layers of the convnet. It is based on the Gram matrix of a layer's activations, i.e. the inner product of the feature maps of that layer. This inner product can be understood as a map of the correlations between the layer's features. By including the feature correlations of multiple layers, we obtain a stationary, multi-scale representation of the input image that captures its texture information but not the global arrangement. The style loss is then computed as the L2 norm between the Gram matrices of the original image and the Gram matrices of the generated image. In the equations below, a and x are the original image and the generated image, and A^l and G^l their respective style representations in layer l. The contribution of layer l to the total loss is denoted E_l and the total style loss L_style. N_l is the number of feature maps, each of size M_l, in layer l, and w_l are the weighting factors for the contribution of each layer to the total loss.
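The corresponding definitions from the paper:

$$G^{l}_{ij} = \sum_{k} F^{l}_{ik}\,F^{l}_{jk}$$

$$E_{l} = \frac{1}{4\,N_{l}^{2}\,M_{l}^{2}}\sum_{i,j}\left(G^{l}_{ij} - A^{l}_{ij}\right)^{2}$$

$$\mathcal{L}_{style}(\vec{a},\vec{x}) = \sum_{l=0}^{L} w_{l}\,E_{l}$$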

The original paper used VGG-19 pretrained on ImageNet; here I implement the same approach using PyTorch.
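As an overview of how the pieces fit together, here is a minimal PyTorch sketch (not the exact code in this repository): it assumes preprocessed image tensors, uses torchvision's pretrained VGG-19, the layer choices from the paper (conv4_2 for content, conv1_1 through conv5_1 for style), and plain Adam on the pixels instead of the L-BFGS optimiser used by Gatys. The loss weights are example values only.

```python
import torch
import torch.nn.functional as F
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained VGG-19 feature extractor with frozen weights.
# (Newer torchvision versions prefer the `weights=` argument over `pretrained=`.)
vgg = models.vgg19(pretrained=True).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Indices in torchvision's vgg19().features for the layers used in the paper:
# conv1_1, conv2_1, conv3_1, conv4_1, conv5_1 for style and conv4_2 for content.
STYLE_LAYERS = [0, 5, 10, 19, 28]
CONTENT_LAYER = 21

def get_features(x):
    """Run x through VGG-19 and collect the activations of the chosen layers."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS or i == CONTENT_LAYER:
            feats[i] = x
    return feats

def gram_matrix(feat):
    """Inner product of the vectorised feature maps of one layer
    (normalised by the layer size, a common variant of the paper's scaling)."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

# content_img and style_img are assumed to be preprocessed (1, 3, H, W) tensors;
# random tensors stand in for them in this sketch.
content_img = torch.rand(1, 3, 256, 256, device=device)
style_img = torch.rand(1, 3, 256, 256, device=device)

content_feats = get_features(content_img)
style_grams = {i: gram_matrix(f)
               for i, f in get_features(style_img).items() if i in STYLE_LAYERS}

# Start the generated image from the content image and optimise its pixels.
generated = content_img.clone().requires_grad_(True)
optimizer = torch.optim.Adam([generated], lr=0.01)
alpha, beta = 1.0, 1e6  # content / style weights (example values)

for step in range(500):
    optimizer.zero_grad()
    feats = get_features(generated)
    content_loss = F.mse_loss(feats[CONTENT_LAYER], content_feats[CONTENT_LAYER])
    style_loss = sum(F.mse_loss(gram_matrix(feats[i]), style_grams[i])
                     for i in STYLE_LAYERS)
    loss = alpha * content_loss + beta * style_loss
    loss.backward()
    optimizer.step()
```

In practice the images are normalised with the ImageNet mean and standard deviation before being fed to VGG-19, and the generated image is clamped back to a valid pixel range before being saved.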

Credits
