This project demonstrates the deployment and management of DeepSeek AI models on Amazon Elastic Kubernetes Service (EKS). It showcases several distilled DeepSeek-R1 models, including DeepSeek-R1-Distill-Qwen-14B, DeepSeek-R1-Distill-Qwen-32B, and DeepSeek-R1-Distill-Llama-8B, served with vLLM for efficient inference. The project also includes node pool management with Karpenter, performance-testing tooling, and a user-friendly web interface for interacting with the models.
## Features

- **Qwen-14B Deployment** (`qwen-14b-deployment.yaml`)
  - Deploys the `DeepSeek-R1-Distill-Qwen-14B` model
  - Uses vLLM for serving
  - Configured with GPU support
- **Qwen-32B Deployment** (`qwen-32b-deployment.yaml`)
  - Deploys the `DeepSeek-R1-Distill-Qwen-32B` model
  - Similar configuration to Qwen-14B, but with adjusted resource requirements
- **DeepSeek-R1-Distill-Llama-8B Deployment** (`deployment.yaml`)
  - Deploys the `DeepSeek-R1-Distill-Llama-8B` model
  - Uses vLLM for serving
  - Configured with GPU support
- **GPU Node Pool** (`nodepool.yaml`)
  - Configures Karpenter to manage GPU-enabled nodes
  - Supports various GPU instance types (g5, g6, g6e, p5, p4)
  - Includes both Spot and On-Demand instances
- **ML Accelerator Node Pool** (`nodepool.yaml`)
  - Configures Karpenter to manage nodes with AWS Inferentia and Trainium accelerators
  - Supports the inf1, inf2, trn1, and trn1n instance families
- **GenAI Performance Tool** (`genai-perf.yaml`, `prompts.sh`)
  - Deploys a Triton Inference Server for performance testing
  - Includes scripts for running performance profiles against the deployed models
- **Open-WebUI** (`open-webui.yaml`)
  - Deploys a web-based user interface for interacting with the DeepSeek models
  - Connects to the deployed vLLM services
## Prerequisites

- An Amazon EKS cluster
- `kubectl` configured to access your cluster
- Karpenter installed and configured
## Node Pool Configuration

Apply the node pool configurations:

```bash
kubectl apply -f nodepool.yaml
```
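For orientation, a Karpenter `NodePool` for GPU capacity typically looks something like the sketch below. This is a minimal illustration rather than this project's exact manifest: the `EC2NodeClass` name, taint, and limits are assumptions, so treat `nodepool.yaml` as authoritative.

```yaml
# Minimal sketch of a Karpenter GPU NodePool (illustrative values only;
# see nodepool.yaml for the configuration actually used here).
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5", "g6", "g6e", "p5", "p4"]  # families from the feature list; verify exact label values
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]            # mix Spot and On-Demand capacity
      nodeClassRef:                                # assumes an EC2NodeClass named "default"
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      taints:
        - key: nvidia.com/gpu                      # keep non-GPU workloads off these nodes
          effect: NoSchedule
  limits:
    nvidia.com/gpu: 8                              # illustrative cap on provisioned GPUs
```

The ML accelerator pool follows the same pattern, with the `karpenter.k8s.aws/instance-family` requirement restricted to inf1, inf2, trn1, and trn1n.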
## Deployment

Apply the deployment YAML files:

```bash
kubectl apply -f qwen-14b-deployment.yaml
kubectl apply -f qwen-32b-deployment.yaml
kubectl apply -f deployment.yaml
```
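Each of these manifests follows the same basic shape: a Kubernetes `Deployment` that runs vLLM's OpenAI-compatible server with the Hugging Face model ID as an argument and requests a GPU. The sketch below is a hedged illustration, not the project's exact manifest; the image tag, labels, and resource values are assumptions.

```yaml
# Minimal sketch of a vLLM model deployment (illustrative; see
# qwen-14b-deployment.yaml for the actual manifest).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qwen-14b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: qwen-14b
  template:
    metadata:
      labels:
        app: qwen-14b
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest   # assumed image; pin a real tag in practice
          args:
            - "--model"
            - "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
          ports:
            - containerPort: 8000          # vLLM's default API port
          resources:
            limits:
              nvidia.com/gpu: 1            # larger models may need more GPUs
```

Once the pod is ready, the model is reachable through vLLM's OpenAI-compatible endpoints (for example `/v1/chat/completions`), which is what both the performance tool and the web UI consume.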
## Performance Testing

1. Deploy the Triton Inference Server:

   ```bash
   kubectl apply -f genai-perf.yaml
   ```

2. Use the `prompts.sh` script to run performance tests.
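For context, `genai-perf` ships in NVIDIA's Triton Inference Server SDK container, so `genai-perf.yaml` plausibly runs a pod from that image which `prompts.sh` then drives. The sketch below is an assumption about the shape of that manifest; the image tag and names are illustrative, and the actual file is authoritative.

```yaml
# Illustrative pod for running genai-perf profiles in-cluster
# (assumed image tag; see genai-perf.yaml for the real manifest).
apiVersion: v1
kind: Pod
metadata:
  name: genai-perf
spec:
  containers:
    - name: genai-perf
      image: nvcr.io/nvidia/tritonserver:24.12-py3-sdk  # SDK image bundles genai-perf
      command: ["sleep", "infinity"]                    # keep the pod alive for exec'd runs
  restartPolicy: Never
```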
## User Interface

1. Deploy the Open-WebUI:

   ```bash
   kubectl apply -f open-webui.yaml
   ```

2. Access the UI through the exposed service.
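Open-WebUI connects to the models through vLLM's OpenAI-compatible API, so the manifest essentially points the UI at an in-cluster service URL. A minimal sketch, assuming a vLLM service named `qwen-14b` on port 8000 (both hypothetical here):

```yaml
# Illustrative excerpt: wiring Open-WebUI to a vLLM service
# (service name and port are assumptions; see open-webui.yaml).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: open-webui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: open-webui
  template:
    metadata:
      labels:
        app: open-webui
    spec:
      containers:
        - name: open-webui
          image: ghcr.io/open-webui/open-webui:main
          env:
            - name: OPENAI_API_BASE_URL          # Open-WebUI's OpenAI backend setting
              value: "http://qwen-14b:8000/v1"   # hypothetical vLLM service URL
          ports:
            - containerPort: 8080                # Open-WebUI listens on 8080 by default
```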
## Project Structure

```
.
├── deepseek-using-vllm-on-eks
│   ├── chatbot-ui
│   │   ├── application
│   │   │   ├── app.py
│   │   │   ├── Dockerfile
│   │   │   └── requirements.txt
│   │   └── manifests
│   │       ├── deployment.yaml
│   │       └── ingress-class.yaml
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── main.tf
│   ├── manifests
│   │   ├── deepseek-deployment-gpu.yaml
│   │   └── gpu-nodepool.yaml
│   └── README.md
├── ec2nodepool.yaml
├── genai-perf.yaml
├── k8s-manifest
│   ├── genai-perf
│   │   ├── genai-perf-2409.yaml
│   │   └── genai-perf-2412.yaml
│   ├── priority-class.yaml
│   ├── sglang
│   │   ├── llama-8b-sglang.yaml
│   │   └── qwen-32b-sglang.yaml
│   └── vllm
│       ├── llama-8b-vllm.yaml
│       ├── qwen-14b-deployment.yaml
│       └── qwen-32b-deployment.yaml
├── nodepool.yaml
├── open-webui.yaml
├── prompts.sh
└── README.md
```
## Contributing

Contributions to this project are welcome. Please refer to the CONTRIBUTING.md file for guidelines.
## License

This project is licensed under [LICENSE_NAME]. Please see the LICENSE file for details.