Commit 5bab6d0

Update Intel GPU HelmRelease configuration and kustomization to improve resource management and plugin functionality.
1 parent ada15e5 commit 5bab6d0

3 files changed, 34 additions and 15 deletions

clusters/talos-ottawa/apps/cilium/app/helmrelease-intel-gpu.yaml

Lines changed: 19 additions & 15 deletions
@@ -2,13 +2,13 @@
 apiVersion: helm.toolkit.fluxcd.io/v2
 kind: HelmRelease
 metadata:
-  name: intel-device-plugins-gpu
+  name: intel-gpu
   namespace: kube-system
 spec:
   interval: 30m
   chart:
     spec:
-      chart: intel-device-plugins-gpu
+      chart: intel-device-plugins-operator
       version: 0.34.0
       sourceRef:
         kind: HelmRepository
@@ -22,20 +22,24 @@ spec:
       remediation:
         retries: 3
   values:
-    name: intel-gpu-plugin
+    manager:
+      image:
+        hub: intel
+        pullPolicy: IfNotPresent
 
-    # GPU Plugin Configuration
-    sharedDevNum: 100 # Allow up to 100 containers to share each GPU
-    logLevel: 2 # Verbose logging
-    enableMonitoring: true # Enable monitoring resources
-    allocationPolicy: "balanced" # Distribute workloads evenly across GPUs
+    # Enable GPU device plugin
+    devices:
+      gpu: true
 
-    # Node selector - only run on nodes with Intel GPUs
+    # Node selector for operator
     nodeSelector:
-      intel.feature.node.kubernetes.io/gpu: 'true'
+      kubernetes.io/arch: amd64
 
-    # Enable NodeFeatureRule for automatic node labeling
-    nodeFeatureRule: true
-
-    # Tolerations (if needed)
-    tolerations: []
+    # Resource limits for operator
+    resources:
+      limits:
+        cpu: 100m
+        memory: 120Mi
+      requests:
+        cpu: 100m
+        memory: 100Mi
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+---
+apiVersion: deviceplugin.intel.com/v1
+kind: GpuDevicePlugin
+metadata:
+  name: intel-gpu-plugin
+  namespace: kube-system
+spec:
+  image: intel/intel-gpu-plugin:0.34.0
+  sharedDevNum: 100
+  logLevel: 2
+  enableMonitoring: true
+  allocationPolicy: balanced
+  nodeSelector:
+    intel.feature.node.kubernetes.io/gpu: "true"

clusters/talos-ottawa/apps/cilium/config/kustomization.yaml

Lines changed: 1 addition & 0 deletions
@@ -6,4 +6,5 @@ resources:
   - localredirects.yaml
   - nodelocaldns.yaml
   - coredns.yaml
+  - intel-gpu-plugin.yaml
   - bgp.yaml
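
Note: the commit does not include a workload that uses the plugin. As a rough sketch of how a pod might consume what the GpuDevicePlugin resource advertises, assuming the nodes use the i915 driver (so the resource is named gpu.intel.com/i915) and that sharedDevNum: 100 makes each GPU schedulable as 100 such units. The pod name, namespace, image, and command are hypothetical and only meant for a quick check:

```yaml
# Hypothetical smoke-test pod; not part of this commit.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test        # illustrative name
  namespace: default          # illustrative namespace
spec:
  restartPolicy: Never
  containers:
    - name: probe
      image: busybox:1.36     # any small image with a shell would do
      # List the DRI device nodes the plugin is expected to expose to the container.
      command: ["sh", "-c", "ls -l /dev/dri"]
      resources:
        limits:
          # Extended resources are requested under limits; one unit is one share of a GPU.
          gpu.intel.com/i915: 1
```

If the pod stays Pending, checking that the target node carries the intel.feature.node.kubernetes.io/gpu: "true" label used in the nodeSelector above is a reasonable first step.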
