Draft: Ollama API with Podman Compose #93

Open: wants to merge 3 commits into `main`
80 changes: 78 additions & 2 deletions docs/ai.md
@@ -15,9 +15,84 @@

### Ollama API

Since Alpaca doesn't expose an API, if you need applications other than Alpaca (for example, an IDE) to interact with your Ollama instance, you should consider installing Ollama in a [container](https://hub.docker.com/r/ollama/ollama).

#### Quadlet (recommended)

Create the following Quadlet unit at `~/.config/containers/systemd/ollama.container`:
```
[Unit]
Description=Ollama Service
After=network.target local-fs.target

[Container]
Image=ollama/ollama:latest
ContainerName=ollama
AutoUpdate=registry
PublishPort=11434:11434
Volume=./ollama_v:/root/.ollama:z
# Note: not sure what the right volume path should be in the quadlet

# GPU passthrough via CDI (requires the NVIDIA Container Toolkit CDI spec)
AddDevice=nvidia.com/gpu=all

[Service]
Restart=always
TimeoutStartSec=60s

[Install]
WantedBy=default.target
```
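
Before reloading systemd it can be worth checking that the Quadlet generator accepts the unit; a dry run such as the one below should print the generated service. The generator path shown is the usual Fedora location and may differ on other systems.

```sh
# dry-run the quadlet generator for user units; errors here mean the
# .container file will be ignored (path may vary by distribution)
❯ /usr/libexec/podman/quadlet -dryrun -user
```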

```sh
❯ systemctl --user daemon-reload

# start the Ollama service for the current session
❯ systemctl --user start ollama
❯ systemctl --user status ollama

# with WantedBy=default.target in the [Install] section, the unit also starts
# automatically at login; quadlet-generated units cannot be enabled with
# `systemctl --user enable`

# connect to ollama
❯ ollama list

# download and run model https://ollama.com/search
❯ ollama run <model>
```
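
With the service running, any other application (an IDE, a script, etc.) can use the standard Ollama HTTP API on the published port; a quick smoke test from the shell might look like this (the model name is only an example):

```sh
# list the models the server knows about
❯ curl http://localhost:11434/api/tags

# request a completion; replace llama3.2 with a model you have pulled
❯ curl http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "prompt": "Why is the sky blue?"}'
```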

#### Podman Compose

> **NOTE:** Podman needs to be run with sudo for nvidia gpu passthrough until [this](https://github.com/containers/podman/issues/19338) issue is fixed.
>
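
The `nvidia.com/gpu=all` device used below is a CDI device provided by the NVIDIA Container Toolkit; if no CDI specification has been generated yet, something along these lines (the default output path is assumed) typically creates it:

```sh
# generate the CDI spec describing the installed NVIDIA GPUs
❯ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```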
Create this `podman-compose.yaml` file:

```yaml
---
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ./ollama_v:/root/.ollama:z
    devices:
      # CDI device exposed by the NVIDIA Container Toolkit
      - nvidia.com/gpu=all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu
```

`❯ sudo podman-compose -f podman-compose.yaml up -d`
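
To check that the stack came up and the API is reachable, a couple of quick sanity checks (nothing Bluefin-specific is assumed here):

```sh
# the container should show up as running
❯ sudo podman ps --filter name=ollama

# the API should answer on the published port
❯ curl http://localhost:11434/api/version
```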

#### Docker Compose

To run Ollama with Docker Compose, first configure Docker to use the NVIDIA drivers (which come preinstalled with Bluefin):

```bash
sudo nvidia-ctk runtime configure --runtime=docker
```
@@ -45,6 +120,7 @@


Finally, open a terminal in the folder containing the file you just created and start the container with:

```bash
docker compose up -d
```
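
Whichever setup you choose, other applications only need the base URL `http://localhost:11434`; for example, the `ollama` CLI can be pointed at the containerized instance via the `OLLAMA_HOST` environment variable (shown here for a local setup):

```bash
# talk to the containerized API instead of a locally installed daemon
OLLAMA_HOST=127.0.0.1:11434 ollama list
```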