
🧩 GLM-Image Studio (ROCm)

GLM-Image Studio is a professional, high-performance AI creative suite designed specifically for AMD GPUs (ROCm). It features a modern, responsive Web UI that orchestrates Text-to-Image generation, Image-to-Image transformation, and advanced Visual Reasoning (Image-to-Text) using the latest GLM models.

Optimized for: AMD Radeon RX 7900 series / Instinct accelerators running ROCm 6.x via Docker.


✨ Key Features

🎨 Creative Suites

  • Text-to-Image (T2I): Generate high-fidelity images using Flux-based diffusion pipelines (zai-org/GLM-Image).
  • Image-to-Image (I2I): Transform existing images with natural language prompts.
  • Image-to-Text (I2T) with Thinking: Analyze images using zai-org/GLM-4.1V-9B-Thinking.
    • Visual Thinking Process: View the model's internal step-by-step reasoning (collapsible view).
    • Native Resolution: Supports up to 4K inputs for analyzing fine details.
    • Structured Output: Separates the "Thinking Process" from the "Final Answer" for clarity.
  • Upscale & Refine: Tiled upscaling with stable-diffusion-x4-upscaler at a variable scale (1x-4x), bfloat16 precision for artifact-free results, and post-process alignment (a diffusers sketch follows this list).
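
As a rough illustration of the upscaler described above, a minimal diffusers sketch in bfloat16. This is not the studio's exact worker (the tiling and post-process alignment steps are omitted), and the input path is a placeholder:

# Minimal sketch of x4 upscaling with diffusers in bfloat16; the studio's
# worker adds tiling and alignment on top of this.
import torch
from diffusers import StableDiffusionUpscalePipeline
from PIL import Image

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.bfloat16,   # bfloat16 precision, per the notes above
)
pipe = pipe.to("cuda")            # ROCm PyTorch exposes AMD GPUs via the "cuda" device

low_res = Image.open("outputs/sample.png").convert("RGB")  # placeholder path
upscaled = pipe(prompt="high quality, detailed", image=low_res).images[0]
upscaled.save("outputs/sample_x4.png")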

🚀 Advanced-Grade UI

  • Smart History Gallery:
    • Auto-Sorting: Newest generations always appear at the top.
    • Universal Loaders: Load any history image into any input slot ([➜ 1], [➜ 2]) regardless of origin.
    • Bulk Restore: One-click [All] button instantly reloads dual-source inputs.
    • Compact Layout: Optimized 128 px view with high-contrast timestamps (about 20% more space-efficient).
    • Persistence: Automatically saves all generations to disk.
  • Model Management: Dynamic model loading from models/ directory with UI-based Trash/Delete operations.
  • Advanced Control: Selectable Samplers (Euler a, DPM++ 2M Karras, etc.) and native GGUF model support for low-VRAM environments.
  • LoRA Management: Hot-swappable LoRA adapters with strength control (see the sketch after this list).
  • Real-time Monitoring: Integrated system status, timer, and console logs directly in the dashboard.
  • State Isolation: Independent prompt and result buffers for T2I, I2I, and I2T modes prevent accidental data loss.
  • Cross-Flow: Send generated images instantly from T2I -> I2I or analysis text from I2T -> T2I prompt.
  • Robust Lifecycle: Enhanced process management ensures clean shutdowns for all background workers.
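
The LoRA management above maps naturally onto the standard diffusers adapter API. A minimal sketch, using an SDXL pipeline purely for illustration (the studio targets GLM-Image); the adapter file name is hypothetical:

# Hedged sketch of hot-swappable LoRA loading with strength control.
# "my_style.safetensors" is a hypothetical adapter placed in loras/.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
).to("cuda")

pipe.load_lora_weights("loras/my_style.safetensors", adapter_name="my_style")
pipe.set_adapters(["my_style"], adapter_weights=[0.8])  # strength control

image = pipe(prompt="a watercolor landscape").images[0]

pipe.unload_lora_weights()  # hot-swap: drop the adapter before loading another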

⚙️ Backend Engineering

  • Modular Architecture: Isolated subprocesses for T2I, I2I, and I2T ensure stability and clean VRAM management (a sketch follows this list).
  • Unified Storage: All uploads and generations are centrally managed in outputs/ with automatic collision handling (auto-renaming).
  • Zero-Config Deploy: Docker-based setup handles all ROCm dependencies and library conflicts.
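
As a sketch of the isolation idea, each mode can run in its own Python process so the driver reclaims all VRAM when the worker exits. The --job flag below is hypothetical, not the real workers' interface:

# Hedged sketch of per-mode worker isolation: each pipeline runs in its own
# subprocess, so its GPU memory is fully released when the process exits.
import subprocess

WORKERS = {
    "t2i": "process_t2i.py",
    "i2i": "process_i2i.py",
    "i2t": "process_i2t.py",
}

def run_worker(mode: str, payload_path: str) -> int:
    # "--job" is a hypothetical flag; the real workers define their own interface.
    proc = subprocess.Popen(["python", WORKERS[mode], "--job", payload_path])
    return proc.wait()  # on exit, the worker's VRAM is reclaimed by the driver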

⚡ High-Performance C++ Backend

  • Extreme Speed: 3x to 10x faster generation compared to standard Python pipelines.
  • Zero-Overhead: Uses stable-diffusion.cpp directly for maximum efficiency on ROCm.
  • Availability: Currently supported for Text-to-Image (T2I) and Image-to-Image (I2I) modes.
  • Usage: Click the blue "⚡ CPP Generate" button instead of the standard Generate button (a sketch of the underlying invocation follows).
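
For context, stable-diffusion.cpp ships a command-line binary (sd); a hedged sketch of how such a fast path could be invoked, with placeholder model path and settings rather than the studio's actual configuration:

# Hedged sketch: shelling out to a stable-diffusion.cpp binary from Python.
# The model path and flag values are placeholders, not the studio's real config.
import subprocess

subprocess.run(
    [
        "sd",                               # stable-diffusion.cpp CLI binary
        "-m", "models/model.Q4_0.gguf",     # GGUF checkpoint (placeholder name)
        "-p", "a lighthouse at sunset",
        "-o", "outputs/cpp_out.png",
        "--steps", "20",
    ],
    check=True,
)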

🛠️ Prerequisites

  1. Linux OS: Ubuntu 22.04 or compatible.
  2. AMD Hardware: GPU with ROCm support (e.g., RX 7900 XTX, MI300).
  3. Docker & ROCm: Docker Engine plus the ROCm 6.x driver stack installed on the host (the container bundles the ROCm userspace libraries).
  4. HuggingFace Token: Required to download the models.
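
Once the stack is installed, a quick way to confirm the GPU is visible to a ROCm build of PyTorch (AMD devices are reported through the torch.cuda API):

# Sanity check that ROCm PyTorch can see the AMD GPU.
import torch

print(torch.cuda.is_available())        # True if the ROCm device is usable
print(torch.cuda.get_device_name(0))    # e.g. "AMD Radeon RX 7900 XTX"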

🚀 Quick Start

1. Clone & Prepare

git clone https://github.com/abassign/GLM-Image-Studio.git
cd GLM-Image-Studio

# Create directories
mkdir -p outputs loras

2. Configure Environment

The default launch script mounts your host's HuggingFace cache, so no .env file is required; create one (or pass variables on the command line) only if you modify the script. Make sure you are logged in to HuggingFace (e.g., via huggingface-cli login) or have your access token ready.
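
If you prefer to authenticate programmatically instead of huggingface-cli login, a minimal sketch with huggingface_hub (the token string is a placeholder):

# Store a HuggingFace token in the local cache that run_glm.sh mounts.
from huggingface_hub import login

login(token="hf_xxxxxxxxxxxxxxxx")  # placeholder; use your own token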

3. Build Container

docker build -t glm-image-rocm .

4. Run Studio

Use the provided script to mount volumes and map GPUs correctly:

chmod +x run_glm.sh
./run_glm.sh

The Web UI will be available at: http://localhost:7860


📁 Project Structure

/app
├── server.py           # FastAPI Backend & Orchestrator
├── process_t2i.py      # Independent T2I Worker
├── process_i2i.py      # Independent I2I Worker
├── process_i2t.py      # Independent I2T Worker
├── process_upscale.py  # Independent Upscale Worker
├── process_zimage.py   # Z-Image Turbo / GGUF Worker
├── shared_utils.py     # Shared logging & config logic
├── lora_manager.py     # LoRA scanning & config generation
├── run_glm.sh          # Docker launch script
├── Dockerfile          # ROCm Environment Definition
├── static/             # Frontend Assets (HTML/CSS/JS)
├── outputs/            # Stores ALL Generations & Uploads
└── loras/              # Place .safetensors adapters here

🧠 Supported Models

  • Generation: zai-org/GLM-Image (Flux.1 / SDXL-style pipelines)
  • Vision/Reasoning: zai-org/GLM-4.1V-9B-Thinking
  • Turbo: zai-org/Z-Image-Turbo (GGUF Quantized, fast inference)

🔧 Troubleshooting

  • VRAM OOM: The system auto-clears VRAM when switching tabs; wait 2-3 seconds between mode switches.
  • Model load failure: Verify your HuggingFace token and internet connectivity.
  • Permission denied: Ensure run_glm.sh is executable (chmod +x run_glm.sh).
  • Upload error: Check that the outputs/ directory is writable by the container user.

📜 License

This project is open-source and licensed under the MIT License.
