Note
- AWS Lambda uses CPUs, so running generate/chat is a little slow.
- The first deployment takes ~5m while the container is built and models are cached; subsequent deployments take ~1m.
- The first request, while the model loads, takes ~20s; subsequent requests take ~5-20s.
- While this is not production grade, it is a cost-effective way to serve models.
curl https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/generate -d '{
"model": "llama3.2:1b",
"prompt":"Why is the sky blue?"
}'
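The /api/chat endpoint on the same Function URL accepts Ollama's standard chat request format, for example:
curl https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/chat -d '{
"model": "llama3.2:1b",
"messages": [
{ "role": "user", "content": "Why is the sky blue?" }
]
}'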
- Please, please, please don't abuse this endpoint! Scaffoldly is Open Source (a.k.a. cash-strapped) and we're hosting it for demonstration purposes only!
- Please consider donating if you like what Scaffoldly is doing!
- Check out our other examples
- Give our Tooling and Examples repositories a ⭐️ if you like what you see!
Tip
To use a different model than llama3.2:1b, update scaffoldly.json with the desired model(s).
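If you're not sure where the model name lives in your copy, a quick (hypothetical) way to find and swap it from the shell:
# Show every place the default model is referenced
grep -n "llama3.2:1b" scaffoldly.json
# Swap in a different model tag (example choice; GNU sed shown, use `sed -i ''` on macOS)
sed -i 's/llama3.2:1b/llama3.2:3b/g' scaffoldly.json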
- Run the following command to create your own copy of this application:
npx scaffoldly create app --template ollama
- Create an EFS Filesystem in AWS and give it a Name of .cache (to match scaffoldly.json).
- Finally, deploy:
cd my-app
npx scaffoldly deploy
You will see output that looks like:
App framework not detected. Using `scaffoldly.json` for configuration.
✅ Updated Identity: arn:aws:sts::123456789012:assumed-role/aws-examples@scaffold.ly/cnuss
✅ Updated ECR Repository: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama
✅ Updated Local Image Digest: sha256:f7ee27705d66c64a250982d6ee8282d5338a4989ae95c5ac4453a15c264efc97
✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp
✅ Updated EFS Access Point: arn:aws:elasticfilesystem:us-east-1:123456789012:access-point/fsap-0b0e5506324efd541
✅ Updated IAM Role: ollama-0447aaae
✅ Updated IAM Role Policy: ollama
✅ Updated Lambda Function: ollama
✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
✅ Updated Schedule Group: ollama-0447aaae
✅ Updated Local Image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama:0.0.0-0-0447aaae
✅ Updated Local Image Digest: sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba
✅ Updated Secret: arn:aws:secretsmanager:us-east-1:123456789012:secret:ollama@ollama-yaVNCp
✅ Updated Lambda Function: ollama
✅ Updated Function Code: ollama@sha256:320447c49d08d109c4fc1702acc24768657a9a09e4e0eb90f8b32051500664ba
✅ Updated Function Alias: ollama (version: 4)
✅ Updated Function Policies: InvokeFunctionUrl
✅ Updated Function URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
✅ Updated Network Interface: eni-0dc0e11444fa19715
✅ Created Invocation of `( HOME=$XDG_CACHE_HOME OLLAMA_HOST=$URL ollama pull llama3.2:1b )`:
pulling manifest
==> pulling 74701a8c35f6... 100% ▕████████████████▏ 1.3 GB
==> pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
==> pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
==> pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
==> pulling 4f659a1e86d7... 100% ▕████████████████▏ 485 B
==> verifying sha256 digest
==> writing manifest
==> success
✅ Updated HTTP GET on https://wm4s6cx...s-east-1.on.aws: 200 OK
Deployment Complete!
App Identity: arn:aws:iam::123456789012:role/ollama-0447aaae
Env Files: .env.ollama, .env.main, .env
Image Size: 4.81 GB
URL: https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
- The scaffoldly.json is converted into a Multi-Stage Docker Build
- A docker build is pushed to Amazon ECR
- A Lambda Function is created to serve the image
- Models are cached to Amazon EFS
- Requests are proxied to the underlying Ollama server
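Because requests are proxied straight to Ollama, the standard Ollama HTTP API is available on the Function URL. For example, listing the models cached on the EFS volume (demo URL shown; substitute your own):
curl https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/tags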
Tip
This repository also comes with a GitHub Action so that deployments can occur from GitHub instead of being executed manually!
After the project has been created, run npx scaffoldly show dockerfile to see the resultant Dockerfile:
FROM ollama/ollama:0.4.7 AS install-base
WORKDIR /var/task
FROM install-base AS build-base
WORKDIR /var/task
ENV PATH="/var/task:$PATH"
COPY . /var/task/
FROM install-base AS package-base
WORKDIR /var/task
ENV PATH="/var/task:$PATH"
FROM install-base AS runtime
WORKDIR /var/task
ENV PATH="/var/task:$PATH"
COPY --from=scaffoldly/scaffoldly:1 /linux/arm64/awslambda-entrypoint /var/task/.entrypoint
CMD [ "( HOME=$XDG_CACHE_HOME ollama serve )" ]
Running npx scaffoldly deploy will:
- Infer scaffoldly.json into a Multi-Stage Docker Build
- Run the equivalent of docker build (sketched below)
- Set up Amazon ECR
- Create a Lambda Function
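Roughly speaking, the build step behaves like a local build of the generated Dockerfile. A sketch of the equivalent commands (the tag and platform here are illustrative; Scaffoldly drives this for you during deploy):
# Render the Dockerfile that Scaffoldly infers from scaffoldly.json
npx scaffoldly show dockerfile > Dockerfile
# Build it for the architecture used by this example (arm64, per the entrypoint copied in the Dockerfile above)
docker build --platform linux/arm64 -t ollama:local .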
AWS Lambda requires that Docker Images come from Amazon ECR Private Registries, and it can't run public images either.
Running npx scaffoldly deploy will:
- Pull ollama/ollama:0.4.7, re-tag it, and push it to Amazon ECR as a private image
- Create an ECR Repository if it doesn't already exist
- Run the equivalent of docker push (sketched below)
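Under the hood this is the familiar ECR workflow. A minimal sketch with standard Docker and AWS CLI commands (the account ID, region, and tag are illustrative, taken from the sample output above):
# Authenticate Docker to the private ECR registry
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
# Re-tag the public Ollama image as a private ECR image and push it
docker pull ollama/ollama:0.4.7
docker tag ollama/ollama:0.4.7 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama:0.4.7
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/ollama:0.4.7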
An AWS Lambda Function is created with the configuration in the scaffoldly.json file.
Running npx scaffoldly deploy will:
- Set up Function Environment Variables from .env
- Deploy the Function with a VPC Configuration and EFS Mounts inferred from Amazon EFS
- Create Lambda Versions and Aliases
- Set an ENTRYPOINT which routes AWS Lambda HTTP Requests to Ollama
- Create a Lambda Function URL and expose it to the Function as the URL environment variable
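To confirm what was configured, the standard AWS CLI can read the deployed settings back (a sketch; the function name ollama comes from the sample output above):
# Inspect environment variables, VPC configuration, and EFS mounts on the function
aws lambda get-function-configuration --function-name ollama \
  --query '{Env: Environment.Variables, Vpc: VpcConfig, Efs: FileSystemConfigs}'
# Show the Function URL that was created
aws lambda get-function-url-config --function-name ollama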
Model files are large, so they are cached in Amazon EFS. Using the @immediately option in the schedules directive of scaffoldly.json, the model is pre-downloaded after the deployment.
Running npx scaffoldly deploy will:
- Set the XDG_CACHE_HOME environment variable to the EFS Mount on the Lambda Function
- Use the OLLAMA_HOST=$URL environment variable to trigger a remote download (on itself)
- Use HOME=$XDG_CACHE_HOME to direct Ollama where to store files
- Invoke ollama pull once the AWS Lambda Function has finished deploying (the same pull can be run from your own machine, as shown below)
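Because the pull is driven through the Function URL, the same download can be triggered from any machine with the ollama CLI installed, pointed at the deployed server (a sketch; substitute your own Function URL and model):
# Point the local ollama client at the Lambda-hosted server and pull remotely
export OLLAMA_HOST=https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws
ollama pull llama3.2:1b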
Finally, Scaffoldly uses the start option in the scripts directive of scaffoldly.json to run ollama serve.
Running npx scaffoldly deploy will:
- Copy the awslambda-entrypoint into the image (as /var/task/.entrypoint in the Dockerfile above)
- The awslambda-entrypoint reads the SLY_ROUTES and SLY_SERVE environment variables to start and route requests
- Requests are converted from the AWS Lambda HTTP Request format back into an HTTP Request forwarded to the Ollama Server
- The Ollama Server response is streamed back to the requestor (see the example below)
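You can observe the streaming pass-through directly: Ollama emits newline-delimited JSON chunks by default, and curl's -N flag disables output buffering so the chunks appear as they arrive (demo URL shown; substitute your own):
curl -N https://wm4s6cxkwua4ncx3skpdtdx27a0qzbnd.lambda-url.us-east-1.on.aws/api/generate -d '{
"model": "llama3.2:1b",
"prompt": "Write a haiku about clouds",
"stream": true
}'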
Join our Discussions on GitHub. Join our Community on Discord.
This code is licensed under the Apache-2.0 license.
The scaffoldly toolchain is licensed under the FSL-1.1-Apache-2.0 license.
Copyright 2024 Scaffoldly LLC