Replicating the results of this paper: https://arxiv.org/pdf/1608.00367.pdf
Authors: Chao Dong, Chen Change Loy, and Xiaoou Tang
Institution: Department of Information Engineering, The Chinese University of Hong Kong
1. With Docker
cd app
docker build -t fsrcnn .
docker run -p 8000:8000 fsrcnn
Paste this in your browser: http://localhost:8000/
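If you'd rather check from the command line, here is a minimal sketch using only the Python standard library; it assumes the root endpoint simply returns the app's page on port 8000:

```python
import urllib.request

# Request the root page of the locally running app (default port 8000 assumed).
with urllib.request.urlopen("http://localhost:8000/") as resp:
    print(resp.status)      # 200 means the server is up
    print(resp.read(200))   # first bytes of the returned page
```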
2. Without Docker
cd app
python3 api/app.py
Requirements:
- Python==3.11.5
- NumPy==1.26.3
- PyTorch==2.1.2 (with CUDA)
- Matplotlib
- Pillow (PIL)
- Pathlib
- glob
- zipfile
Note: I haven't had time to train the x2 or x4 scales yet, as each run takes all day, but they are coming soon.
Eval. Metric | Scale | Paper | Mine |
---|---|---|---|
PSNR (dB) | 2 | 36.94 | 34.77 |
PSNR (dB) | 3 | 33.16 | 32.05 |
PSNR (dB) | 4 | 30.55 | 30.82 |
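For reference, PSNR on images scaled to [0, 1] can be computed as below. This is a minimal PyTorch sketch, not necessarily the exact evaluation protocol used here or in the paper (which, for example, evaluates on the luminance channel):

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB, for images with pixel values in [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return (10.0 * torch.log10(max_val ** 2 / mse)).item()
```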
Example result grids (images omitted here): comparisons of Original, Original Cropped, BICUBIC x3, and FSRCNN x3, plus a grid comparing FSRCNN x3 trained with MSE against FSRCNN x3 trained with MAE.
Structure: Conv(5, d, 1) -> PReLU -> Conv(1, s, d) -> PReLU -> m×Conv(3, s, s) -> PReLU -> Conv(1, d, s) -> PReLU -> DeConv(9, 1, d), using the paper's Conv(filter size, number of filters, number of input channels) notation.
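A minimal PyTorch sketch of that structure (the paper's defaults are d=56, s=12, m=4; the padding and output_padding choices here are my assumptions, and the actual implementation lives in models.py):

```python
import torch
import torch.nn as nn

class FSRCNN(nn.Module):
    """FSRCNN: feature extraction -> shrinking -> m mapping layers -> expanding -> deconvolution."""

    def __init__(self, scale: int = 3, d: int = 56, s: int = 12, m: int = 4, channels: int = 1):
        super().__init__()
        layers = [
            nn.Conv2d(channels, d, kernel_size=5, padding=2), nn.PReLU(d),   # feature extraction
            nn.Conv2d(d, s, kernel_size=1), nn.PReLU(s),                     # shrinking
        ]
        for _ in range(m):                                                   # non-linear mapping
            layers += [nn.Conv2d(s, s, kernel_size=3, padding=1), nn.PReLU(s)]
        layers += [nn.Conv2d(s, d, kernel_size=1), nn.PReLU(d)]              # expanding
        self.body = nn.Sequential(*layers)
        # Deconvolution upsamples by the scale factor back to `channels`.
        self.deconv = nn.ConvTranspose2d(d, channels, kernel_size=9, stride=scale,
                                         padding=4, output_padding=scale - 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.deconv(self.body(x))
```

With scale=3, a 1×1×32×32 luminance input maps to a 1×1×96×96 output.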
Differences:
- Instead of the L2 (MSE) loss used in the paper, I trained with L1 (MAE) loss, as "using MSE or a metric based on MSE is likely to result in training finding a deep learning based blur filter, as that is likely to have the lowest loss and be the easiest solution to converge to. A loss function that minimises MSE encourages finding pixel averages of plausible solutions that are typically overly smoothed; although this minimises the loss, the generated images will have poor perceptual quality from the perspective of a human viewer."
- I opted for L1 loss because "with L1 loss, the goal is least absolute deviations (LAD): minimising the sum of the absolute differences between the ground truth and the predicted/generated image. MAE reduces the average error, whereas MSE is very prone to being affected by outliers. For image enhancement, MAE will likely result in an image that appears higher quality from a human viewer's perspective."
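In code, the only change from the paper's objective is the loss module (a sketch; `model`, `lr_batch`, and `hr_batch` are illustrative names):

```python
import torch.nn as nn

mse_loss = nn.MSELoss()  # L2 loss, as used in the paper
mae_loss = nn.L1Loss()   # L1 loss, used in this repo: penalises absolute deviations,
                         # so outlier pixels dominate the objective less

# loss = mae_loss(model(lr_batch), hr_batch)   # in place of mse_loss(model(lr_batch), hr_batch)
```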
notebooks
- 02_sandbox.ipynb: Jupyter notebook that contains everything in one place, from ingestion to predictions. This is what I used as a rough draft before restructuring into .py files.
utils
- helpers.py: Python file containing helper functions I either created or found to assist with this project.
- datasets.py: Python file containing the custom datasets needed to train this model. It includes separate Train and Evaluation datasets, since each requires different behaviour (see the dataset sketch after this list).
- models.py: Python file that contains the model, consisting of layers for feature extraction, shrinking, non-linear mapping, expanding, and deconvolution. It uses PReLU instead of ReLU, as PReLU is more stable and avoids 'dead features' caused by zero gradients.
- train.py: Python file that trains the model using the train_step, test_step, and train methods, and evaluates it using Peak Signal-to-Noise Ratio (PSNR), measured in dB.
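As an illustration of what the training dataset might look like, here is a minimal sketch under my own assumptions (HR images on disk, a random HR patch crop, and bicubic downscaling to produce the LR input); the actual logic lives in datasets.py and may differ:

```python
import random
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


def to_tensor(img: Image.Image) -> torch.Tensor:
    """HxW PIL image -> 1xHxW float tensor in [0, 1]."""
    return torch.from_numpy(np.asarray(img, dtype=np.float32) / 255.0).unsqueeze(0)


class TrainDataset(Dataset):
    """Yields (LR patch, HR patch) pairs: crop an HR patch, then bicubic-downscale it for the LR input."""

    def __init__(self, hr_dir: str, scale: int = 3, patch_size: int = 96):
        self.paths = sorted(Path(hr_dir).glob("*.png"))
        self.scale, self.patch = scale, patch_size

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int):
        hr = Image.open(self.paths[idx]).convert("L")  # luminance only, matching a single-channel model
        # Random HR crop (assumes each image is at least patch_size in both dimensions).
        x = random.randint(0, hr.width - self.patch)
        y = random.randint(0, hr.height - self.patch)
        hr = hr.crop((x, y, x + self.patch, y + self.patch))
        lr = hr.resize((self.patch // self.scale, self.patch // self.scale), Image.BICUBIC)
        return to_tensor(lr), to_tensor(hr)
```

And a rough sketch of a training/evaluation step consistent with the descriptions above (function and variable names here are illustrative, not necessarily those used in train.py):

```python
import torch

def train_step(model, loader, loss_fn, optimizer, device="cuda"):
    """One pass over the training loader; returns the mean loss."""
    model.train()
    total = 0.0
    for lr_batch, hr_batch in loader:
        loss = loss_fn(model(lr_batch.to(device)), hr_batch.to(device))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / len(loader)

@torch.no_grad()
def test_step(model, loader, device="cuda"):
    """Mean PSNR in dB over the evaluation loader (pixel values assumed in [0, 1])."""
    model.eval()
    total = 0.0
    for lr_batch, hr_batch in loader:
        sr = model(lr_batch.to(device)).clamp(0.0, 1.0)
        mse = torch.mean((sr - hr_batch.to(device)) ** 2)
        total += (10.0 * torch.log10(1.0 / mse)).item()
    return total / len(loader)
```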