# VulkanIlm
VulkanIlm (from "Vulkan" 🔥 + "Ilm", meaning "knowledge" in Urdu/Arabic) is a Python library that brings GPU-accelerated local LLM inference to AMD and Intel GPU users, no CUDA required.
Built specifically for developers with legacy GPUs, VulkanIlm enables blazing-fast local inference using llama.cpp's Vulkan backend.
Not everyone has an NVIDIA GPU. Everyone deserves fast local AI.
**The Problem:** Most GPU-acceleration libraries focus on CUDA, leaving AMD and Intel users stuck with slow CPU-only inference.
**The Solution:** VulkanIlm democratizes GPU-accelerated AI across GPU brands by building on Vulkan.
- 4-6x faster than CPU inference on legacy GPUs
- Universal GPU support: AMD, Intel, and NVIDIA
- Python-first with simple CLI tools
- Auto-detection and optimization for your GPU
- Zero-config installation with auto-building
- Real-time streaming token generation
| Hardware | CPU Performance | Vulkan Performance | Speedup |
|---|---|---|---|
| AMD RX 580 8GB | 188.47s | 44.74s | 4.21x |
| Intel Arc A770 | ~120s | ~25s | 4.8x |
| AMD RX 6600 | ~90s | ~18s | 5.0x |
Benchmarked with Gemma-3n-E4B-it model (6.9B parameters)
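The speedup column is simply the CPU wall-clock time divided by the Vulkan wall-clock time; the RX 580 row, for example, works out as:

```python
# Speedup = CPU time / Vulkan time, using the RX 580 row from the table above
cpu_seconds = 188.47
vulkan_seconds = 44.74

speedup = cpu_seconds / vulkan_seconds
print(f"{speedup:.2f}x")  # → 4.21x
```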
```bash
git clone https://github.com/Talnz007/VulkanIlm.git
cd VulkanIlm
pip install -e .
```

Requirements:

- Python 3.9+
- Vulkan-capable GPU (AMD RX 400+, Intel Arc/Xe, NVIDIA GTX 900+)
- Vulkan drivers installed
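Before installing, you can sanity-check that Vulkan drivers are present by probing for the `vulkaninfo` tool. This helper is an illustrative sketch, not part of VulkanIlm's API:

```python
import shutil
import subprocess

def has_vulkan() -> bool:
    """Return True if vulkaninfo is on PATH and exits cleanly."""
    if shutil.which("vulkaninfo") is None:
        return False
    try:
        # --summary keeps output short on recent vulkan-tools releases
        result = subprocess.run(
            ["vulkaninfo", "--summary"],
            capture_output=True,
            timeout=10,
        )
        return result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False

print("Vulkan available:", has_vulkan())
```

If this prints `False`, install the driver packages below before going further.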
```bash
# Ubuntu/Debian
sudo apt install vulkan-tools libvulkan-dev

# Fedora/RHEL
sudo dnf install vulkan-tools vulkan-devel

# Test installation
vulkaninfo
```

```bash
# Auto-install llama.cpp with Vulkan support
vulkanilm install

# Check your GPU setup
vulkanilm vulkan-info

# Download and use models
vulkanilm search "llama"
vulkanilm download microsoft/DialoGPT-medium

# Generate text
vulkanilm ask model.gguf --prompt "Explain quantum computing"

# Stream in real-time
vulkanilm stream model.gguf "Tell me a story about AI"

# Benchmark CPU vs GPU performance
vulkanilm benchmark "Test prompt"
```

```python
from vulkan_ilm import Llama

# Load model with automatic GPU optimization
llm = Llama("path/to/model.gguf", gpu_layers=16)

# Generate text
response = llm.ask("Explain the term 'ilm' in AI context.")
print(response)

# Stream tokens in real-time
for token in llm.stream_ask_real("Tell me about Vulkan API"):
    print(token, end='', flush=True)
```

| Brand | Models | Status | Performance |
|---|---|---|---|
| AMD | RX 580, RX 590, RX 6600, RX 6700 | ✅ Excellent | 4-5x speedup |
| Intel | Arc A770, Arc A750, Xe Graphics | ✅ Great | 4-5x speedup |
| NVIDIA | GTX 1060+, RTX series | ✅ Excellent | 4-6x speedup |
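How many layers you can offload depends mostly on VRAM. A rough, illustrative heuristic (not VulkanIlm's actual auto-detection logic, and the per-layer size is only a ballpark for a ~7B model at Q4 quantization) might look like:

```python
def suggest_gpu_layers(vram_gb: float, model_layers: int = 32,
                       gb_per_layer: float = 0.35) -> int:
    """Estimate how many transformer layers fit in VRAM.

    Leaves ~1 GB of headroom for the Vulkan context and KV cache;
    gb_per_layer is a rough figure for a ~7B model at Q4 quantization.
    """
    usable = max(vram_gb - 1.0, 0.0)
    return min(model_layers, int(usable / gb_per_layer))

# An 8 GB card (e.g. RX 580) can offload roughly:
print(suggest_gpu_layers(8.0))  # → 20
# A 4 GB card gets fewer layers:
print(suggest_gpu_layers(4.0))  # → 8
```

Pass the result as `gpu_layers=` to `Llama(...)` or `--gpu-layers` on the CLI, and step the number down if you hit VRAM errors.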
**❌ `vulkanilm: command not found`**

```bash
# Ensure you installed the package correctly
pip install -e .
poetry install  # if using Poetry

# Check if CLI is available
which vulkanilm
```

**❌ No Vulkan support detected**

```bash
# Install Vulkan tools and test
sudo apt install vulkan-tools libvulkan-dev
vulkaninfo

# Update GPU drivers
```

**❌ Model hangs or uses too much VRAM**

```bash
# Use fewer GPU layers
vulkanilm ask model.gguf --gpu-layers 8 -p "test"

# Or force CPU mode for testing
vulkanilm ask model.gguf --cpu -p "test"
```

```
VulkanIlm/
├── vulkan_ilm/
│   ├── cli.py          # Command-line interface
│   ├── llama.py        # Main Python API
│   ├── vulkan/
│   │   └── detector.py # GPU detection & optimization
│   ├── benchmark.py    # Performance testing
│   ├── installer.py    # Auto-build llama.cpp
│   └── streaming.py    # Real-time token streaming
├── pyproject.toml      # Poetry configuration
└── README.md
```
We welcome contributions! Areas where you can help:
- GPU Testing: Test on different AMD/Intel/NVIDIA cards
- Model Support: Add support for new model formats
- Performance: Optimize memory usage and speed
- Documentation: Improve guides and examples
See CONTRIBUTING.md for details.
In South Asian and Islamic culture, "Ilm" (علم) represents knowledge, wisdom, and enlightenment.
Combined with Vulkan, a high-performance GPU API, this project embodies "knowledge on fire" 🔥: making advanced AI accessible to everyone, regardless of their GPU brand or budget.
Our mission: Democratize local AI inference for the global developer community.
MIT License - see LICENSE for details.
- GitHub: https://github.com/Talnz007/VulkanIlm
- Issues: Report bugs or request features
- Discussions: Community Q&A
🔥 Built with passion by @Talnz007, bringing fast, local AI to legacy GPUs everywhere.