This project demonstrates how to generate high-quality, photorealistic images from textual descriptions using Stable Diffusion models. It is implemented in Python with Hugging Face's diffusers library and PyTorch, showcasing a variety of prompts and configurations for image generation.
Example output for "A futuristic cityscape with neon lights" (generated via dreamlike-diffusion-1.0)
This project leverages pre-trained Stable Diffusion models from Hugging Face's diffusers library to generate high-quality images from text prompts. It demonstrates customization through parameter tuning (e.g., resolution, inference steps) and compares outputs from different model variants.
- Generate images from text prompts using multiple Stable Diffusion models.
- Adjustable parameters for fine-tuning outputs:
- Image resolution (
height,width) - Inference steps (
num_inference_steps) - Batch generation (
num_images_per_prompt) - Negative prompting to exclude unwanted elements
- Image resolution (
- Built-in visualization with
matplotlib.
- Python 3.10+
- GPU-enabled environment (e.g., Google Colab, local CUDA setup)
- Libraries:
torch==2.5.1+cu121 # CUDA 12.1 compatible diffusers>=0.31.0 transformers>=4.46.2 accelerate>=1.1.1 matplotlib>=3.7.0 Pillow>=11.0.0
git clone https://github.com/Nazmul0005/Text2Image_Generation_HuggingFace.git- Install PyTorch with CUDA (adjust based on your CUDA version):
pip install torch==2.5.1+cu121 --extra-index-url https://download.pytorch.org/whl/cu121
- Install required libraries:
pip install diffusers transformers accelerate matplotlib Pillow
from diffusers import StableDiffusionPipeline
import torch
# Choose from supported models:
model_id = "dreamlike-art/dreamlike-diffusion-1.0" # Example model
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda") # Ensure GPU is usedprompt = "A futuristic cityscape with towering skyscrapers and neon lights"
image = pipe(prompt).images[0]
image.save("generated_image.png")def generate_image(pipe, prompt, params):
images = pipe(prompt, **params).images
# Visualization code (requires matplotlib)
# ... (see full function in the notebook)
params = {
"num_inference_steps": 50,
"height": 640,
"width": 512,
"num_images_per_prompt": 2,
"negative_prompt": "blurry, distorted, low quality"
}
generate_image(pipe, prompt, params)| Category | Prompt Example |
|---|---|
| Fantasy | "A majestic unicorn galloping through wildflowers under a golden sunset." |
| Cyberpunk | "A neon-lit cyberpunk city with flying cars and rain-soaked streets." |
| Cinematic | "A girl sitting with her tiger, golden lighting, cinematic atmosphere." |
| Historical | "A medieval marketplace with merchants, cobblestone streets, and horses." |
Example 1: Generate a Grungy Woman Traveling Between Dimensions
prompt = """dreamlikeart, a grungy young woman with rainbow hair, traveling between dimensions, dynamic pose, happy, soft eyes, and narrow chin, extreme bokeh, dainty figure, long hair straight down, torn kawali shirt, and baggy jeans"""
image = pipe(prompt).images[0]
plt.imshow(image)
plt.axis("off")Example 2: Customize Parameters
params = {
"num_inference_steps": 50,
"height": 640,
"width": 512,
"num_images_per_prompt": 2,
"negative_prompt": "ugly, distorted, low quality"
}
generate_image(pipe, prompt, params)Tested models include:
dreamlike-art/dreamlike-diffusion-1.0stabilityai/stable-diffusion-x1-base-1.0stabilityai/stable-diffusion-2-1-base
| Model ID | Description |
|---|---|
dreamlike-art/dreamlike-diffusion-1.0 |
Artistic, dreamlike imagery generation |
stabilityai/stable-diffusion-x1-base-1.0 |
Base Stable Diffusion model for high-quality outputs |
stabilityai/stable-diffusion-2-1-base |
Advanced model for diverse use cases |
stabilityai/stable-diffusion-2-1 |
Enhanced version with better output precision |
| Parameter | Description |
|---|---|
num_inference_steps |
Number of denoising steps (higher = more detailed, slower). Default: 50 |
height & width |
Output resolution. Recommended: Multiples of 64 (e.g., 512x768). |
num_images_per_prompt |
Number of images to generate in one batch. |
negative_prompt |
Exclude undesired elements (e.g., "ugly, distorted, text"). |
- Negative Prompting: Specify a negative prompt to exclude undesired elements from the image.
- Resolution Scaling: Customize image dimensions with height and width parameters.
- Iterative Improvements: Experiment with different models and parameters to achieve the best results.
- Code: This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
- Models: Check individual model licenses on Hugging Face Hub.
- Built with Hugging Face Diffusers.
- Models provided by Stability AI and Dreamlike Art.
Contributions are welcome! Here's how you can contribute:
-
Fork the repository.
-
2.Create a feature branch:
git checkout -b feature-name
-
Commit your changes:
git commit -m "Add some feature" -
Push to the branch:
git push origin feature-name
-
Open a pull request.
Note: For optimal performance, use a GPU environment like Google Colab. CPU inference is not recommended.
This README.md includes proper markdown formatting, code blocks, tables, and placeholders for images. Replace `generated_image_example.png` with your own output samples.


