Description
You understand that if you don't provide clear information on your "Problem", "Software and Hardware-Setup" and the "Steps to reproduce", your issue might be closed without any help, because you're not willing to do your part?
- Yes, I do!
Current behavior
I have a Framework Desktop with 128 GB of RAM and followed the setup with the latest Proxmox.
Every step is marked as successful and I can see the GPU in the container.
However, when using Docker to run Ollama, even with the provided Ollama example, the models always get loaded on the CPU. Do you have any advice or a proper follow-up for such an issue?
ollama-main-gpu | time=2026-01-20T16:02:24.468Z level=INFO source=routes.go:1614 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:11.5.1 HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
ollama-main-gpu | time=2026-01-20T16:02:24.468Z level=INFO source=images.go:499 msg="total blobs: 5"
ollama-main-gpu | time=2026-01-20T16:02:24.468Z level=INFO source=images.go:506 msg="total unused blobs removed: 0"
ollama-main-gpu | time=2026-01-20T16:02:24.469Z level=INFO source=routes.go:1667 msg="Listening on [::]:11434 (version 0.14.2)"
ollama-main-gpu | time=2026-01-20T16:02:24.469Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
ollama-main-gpu | time=2026-01-20T16:02:24.469Z level=WARN source=runner.go:485 msg="user overrode visible devices" HSA_OVERRIDE_GFX_VERSION=11.5.1
ollama-main-gpu | time=2026-01-20T16:02:24.469Z level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"
ollama-main-gpu | time=2026-01-20T16:02:24.469Z level=INFO source=server.go:429 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 33091"
ollama-main-gpu | time=2026-01-20T16:02:24.548Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.1 GiB" available="31.1 GiB"
ollama-main-gpu | time=2026-01-20T16:02:24.548Z level=INFO source=routes.go:1708 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"
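For comparison, here is a minimal sketch of how the Ollama container can be started with the AMD devices passed through, following the upstream Ollama Docker instructions for ROCm (container name and volume name are illustrative). If /dev/kfd and /dev/dri are not mapped into the Docker container, Ollama can only discover the CPU, which matches the log output above:
  # Sketch: Ollama with ROCm in Docker. Requires /dev/kfd and /dev/dri
  # to exist inside the LXC container so Docker can map them through.
  docker run -d \
    --device /dev/kfd \
    --device /dev/dri \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama-main-gpu \
    ollama/ollama:rocm
As the WARN line above suggests, unsetting HSA_OVERRIDE_GFX_VERSION and retrying is also a cheap first test in case the override interferes with device discovery.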
Desired behavior
Running Ollama with the GPU (models loaded on the GPU instead of the CPU).
Links to screenshots
No response
Steps to reproduce
Steps to reproduce the behavior:
- Install the latest Proxmox on a Framework Desktop and follow the GPU setup; every step is marked as successful
- Start Ollama via Docker (or the provided Ollama example) inside the container; the GPU is visible in the container (see the device check below)
- Load any model
- Observe in the Ollama log that only the CPU is discovered and the model is loaded on the CPU
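To narrow down where the GPU gets lost, a quick check of whether the devices actually reach the Docker container (and not just the LXC container) can help. The rocminfo step only works if the ROCm userland tools are installed in the image, so treat that part as optional:
  # Inside the Docker container: the compute/render device nodes must exist
  docker exec ollama-main-gpu ls -l /dev/kfd /dev/dri
  # If ROCm tools are present, the iGPU should show up as an agent
  docker exec ollama-main-gpu rocminfo | grep -i gfx
If /dev/kfd is missing here, the problem is the device mapping between the LXC container and Docker rather than Ollama itself.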
Software setup
- OS:
- Kernel:
Hardware setup
- CPU: AMD Ryzen AI Max+ 395
- RAM: 128 GB
- Disk:
Additional context
No response