This project tackles the challenge of automatically colorizing grayscale astronomical images using advanced deep learning techniques. Specifically, we employ a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) architecture to transform grayscale images into vibrant, full-color representations.
Our approach leverages the LAB color space, using the L channel (luminance) as input and predicting the a and b channels (color information). This method allows the model to focus on color prediction while preserving the original image structure.
The best-performing model, featuring an EfficientNetB4 backbone in the generator and trained for 60 epochs, achieved a Peak Signal-to-Noise Ratio (PSNR) of 27.589277 on the validation set. This result was obtained using the WGAN-GP architecture with L1 loss incorporated in the generator.
Our project utilizes a combination of three astronomical image datasets:
- Top 100 Hubble Telescope Images
- ESA Hubble Images
- SpaceNet: A Comprehensive Astronomical Dataset
These datasets offer a diverse range of cosmic imagery, including galaxies, nebulae, star clusters, and various celestial phenomena.
Preprocessing Steps (a code sketch follows the list):
- Images are converted from RGB to the LAB color space.
- The L channel is extracted as the grayscale input.
- The a and b channels serve as color targets for the model to predict.
- All channels are normalized to the range [-1, 1] for optimal model performance.
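As a concrete illustration, here is a minimal sketch of this pipeline using OpenCV and NumPy. The function name `preprocess`, the 224×224 target size, and the reliance on OpenCV's 8-bit LAB encoding are assumptions for illustration, not the project's exact code:

```python
import cv2
import numpy as np

def preprocess(path, size=(224, 224)):
    """Load an image and split it into a normalized L input and ab targets."""
    bgr = cv2.resize(cv2.imread(path), size)        # OpenCV loads images as BGR
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    L, a, b = cv2.split(lab)
    # OpenCV's 8-bit LAB encoding puts all three channels in [0, 255],
    # so dividing by 127.5 and subtracting 1 maps each to [-1, 1].
    L = L / 127.5 - 1.0                             # grayscale model input
    ab = np.stack([a, b], axis=-1) / 127.5 - 1.0    # color targets
    return L[..., np.newaxis], ab
```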
Generator (sketch below):
- Based on a U-Net architecture with an EfficientNetB4 backbone
- Incorporates skip connections to preserve spatial information
- Uses upsampling layers in the decoder to produce the final colorized output
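For illustration, a sketch of such a generator using the `segmentation_models_pytorch` library; that this library is used, along with the pretrained weights and `tanh` output activation, are assumptions, and the original implementation may wire the U-Net differently:

```python
import segmentation_models_pytorch as smp
import torch

# U-Net generator: EfficientNetB4 encoder with skip connections into the
# upsampling decoder. Takes the 1-channel L input, predicts 2-channel ab.
generator = smp.Unet(
    encoder_name="efficientnet-b4",
    encoder_weights="imagenet",   # pretrained encoder weights
    in_channels=1,                # L channel only
    classes=2,                    # predicted a and b channels
    activation="tanh",            # keep outputs in [-1, 1], matching the targets
)

L = torch.randn(4, 1, 224, 224)   # a batch of normalized L-channel inputs
ab_pred = generator(L)            # -> shape (4, 2, 224, 224)
```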
Discriminator (sketch below):
- PatchGAN architecture for local consistency in generated colors
- Convolutional layers for feature extraction
- No batch normalization (as per WGAN-GP guidelines)
- Outputs a scalar value for each image patch
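A minimal PyTorch sketch of a PatchGAN critic along these lines; the exact layer widths and the 3-channel (L stacked with ab) input are assumptions:

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN critic: convolutional feature extraction, no batch norm
    (per WGAN-GP guidelines), one unbounded score per image patch."""

    def __init__(self, in_channels=3):           # L + a + b stacked together
        super().__init__()

        def block(cin, cout, stride):
            return [nn.Conv2d(cin, cout, kernel_size=4, stride=stride, padding=1),
                    nn.LeakyReLU(0.2, inplace=True)]

        self.net = nn.Sequential(
            *block(in_channels, 64, 2),
            *block(64, 128, 2),
            *block(128, 256, 2),
            *block(256, 512, 1),
            nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # patch scores
        )

    def forward(self, x):
        return self.net(x)   # (B, 1, H', W'): one scalar per patch
```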
The model was trained using the WGAN-GP framework, known for improved stability compared to traditional GANs. Key aspects include (a simplified training step is sketched after this list):
- Alternating training of generator and discriminator
- Gradient penalty to enforce the Lipschitz constraint
- L1 loss in the generator to encourage color fidelity
- Adam optimizer with β1 = 0 and β2 = 0.9
- Generator trained once every 3 steps
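For reference, here is the standard WGAN-GP gradient penalty in PyTorch; this is the generic formulation from the WGAN-GP paper, not necessarily the project's exact code, and the commented loss terms simply mirror the bullets above:

```python
import torch

def gradient_penalty(critic, real, fake, device):
    """WGAN-GP term: penalize (||grad D(x_hat)||_2 - 1)^2 on random
    interpolations x_hat between real and fake samples."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    grads = torch.autograd.grad(
        outputs=scores, inputs=x_hat,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grads = grads.reshape(grads.size(0), -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Per the bullets above (lambda_gp = 10 is the standard WGAN-GP coefficient):
#   critic step (run 3x per generator step):
#       loss_D = critic(fake).mean() - critic(real).mean() + lambda_gp * gp
#   generator step:
#       loss_G = -critic(fake).mean() + lambda_l1 * L1(fake_ab, real_ab)
#   both optimized with Adam using betas = (0.0, 0.9).
```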
The model's performance improved through several iterations:
Changes Made | Steps/Epochs | Train PSNR | Validation PSNR
---|---|---|---
ResNet34 backbone for U-Net generator, complex discriminator with dropout | 15k steps | 24.2 | 21.4
ResNet34 backbone for U-Net generator, basic discriminator | 15k steps | 26.65 | 23.06
WGAN-GP with L1 loss in generator (λ = 10) | 5 epochs | 28 | 25 (fluctuating)
Generator trained once every 3 steps, EfficientNetB2 backbone, 224×224 resolution | 62 epochs | 30 | 27.453756
EfficientNetB4 backbone | 60 epochs | 30 | 27.589277
The final model, built on the EfficientNetB4 backbone, demonstrated the best performance, achieving a validation PSNR of 27.589277.
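For reference, PSNR is the standard metric here; below is the textbook definition assuming 8-bit images, not necessarily the project's exact evaluation code:

```python
import numpy as np

def psnr(img_true, img_pred, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-shape images."""
    mse = np.mean((img_true.astype(np.float64) - img_pred.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val**2 / mse)
```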
While the current results are promising, several avenues for potential improvement exist:
- Extended Training: Increase training to 150 epochs or more to potentially enhance colorization quality and PSNR scores.
- Larger Backbones: Experiment with ResNet50 or larger EfficientNet variants to capture more complex features.
- Hyperparameter Tuning: Fine-tune learning rates, batch sizes, and loss balancing for better results.
- Data Augmentation: Implement more aggressive augmentation techniques for better generalization.
- Ensemble Methods: Combine predictions from multiple models with different architectures or initializations.
- Attention Mechanisms: Incorporate attention in the generator to focus on relevant colorization features.
- Perceptual Loss: Add a perceptual loss term to improve visual quality beyond PSNR measurements.
- Dataset Curation: Filter out low-quality images and classes from the SpaceNet dataset.
By implementing these improvements, we anticipate pushing the PSNR beyond 28 and achieving more vivid and accurate colorizations, particularly for complex astronomical phenomena.