fix old wrong defaults in README
ppalucki committed Apr 26, 2024
1 parent 9249243 commit 8ef472d
Showing 1 changed file with 14 additions and 19 deletions.
33 changes: 14 additions & 19 deletions deployment/pcm/README.md
@@ -5,27 +5,22 @@ Helm chart instructions
### Features:

- privileged and non-privileged container (value: `privileged`),
- node-feature-discovery based nodeSelector and nodeAffinity (values: nfd, nfdBaremetalAffinity, nfdRDTAffinity)
- bare-metal and VM host configurations (files: values-metal.yaml, values-vm.yaml),
- Ability to deploy multiple, differently configured releases side by side to handle different kinds of machines (bare-metal, VM) at the same time,
- Examples for non-privileged mode using device plugin ("smarter-devices-manager") or using NRI device-injector plugin (TODO) (file: values-smarter-devices-cpu-mem.yaml),
- Deploy Prometheus Operator's PodMonitor (value: `podMonitor`)
- Integration with NRI balloons policy plugin (value: `nriBalloonsPolicyIntegration`),
- Controllable set of metrics and method of collection (RDT, uncore), support direct (msr) and indirect (Linux abstractions perf/resctrl) counter accesses (file: values-indirect.yaml)
- Linux Watchdog handling (controlled with PCM_KEEP_NMI_WATCHDOG, PCM_NO_AWS_WORKAROUND, nmiWatchdogMount values)
- Deploy to own namespace with "helm install ... **-n pcm --create-namespace**"
- Local image registry for development (file: values-local-image.yaml),

TODO/Ideas:
- [ ] Refactor extra features: node-feature-discovery, NRI integration only as extra values for generic fields (annotations, nodeSelector/nodeAffinity)
- [ ] Check if energy metrics can be accessible through perf subsystem
- [ ] GitHub actions for linter/security scanners,
- [ ] Idea: Change metrics names (follow Prometheus best practices)
- [ ] Idea: init container to check permission for all required components (devices/CPU)
- [ ] Implement Helm chart test pods + NOTES
- [ ] Test liveness/readiness probes
- [ ] Testing in cluster management systems (e.g. Ranger, Gardener) on different node types: VM (1 socket, all sockets), bare-metal
- [ ] Test in different clouds (GCP/Azure/AWS)
#### Integration features:

- node-feature-discovery based nodeSelector and nodeAffinity (values: nfd, nfdBaremetalAffinity, nfdRDTAffinity),
- Examples for non-privileged mode using device plugin ("smarter-devices-manager") or using NRI device-injector plugin (TODO) (file: values-smarter-devices-cpu-mem.yaml),
- Integration with NRI balloons policy plugin (value: `nriBalloonsPolicyIntegration`),

#### Debugging features:

- Local image registry for development (file: values-local-image.yaml),
- Deploy Prometheus Operator's PodMonitor (value: `podMonitor`)

### Getting started

@@ -63,11 +58,11 @@ helm upgrade --install pcm . --set privileged=false --set nfd=true --set podMoni

### Requirements

- Full set of metrics requires metal instance (uncore metrics, RDT, energy, UPI),
- Full set of metrics requires bare-metal or .metal instance (uncore metrics, RDT, energy, UPI),
- Core metrics (instructions, cycles) are also available on VM instances,
- In both cases, the "msr" kernel module has to be loaded in the host OS,
- the pod must be allowed to run with privileged capabilities (SYS_ADMIN, SYS_RAWIO) in the given namespace,
- Pod Security Standards must allow running at the privileged level,
- /sys/fs/resctrl has to be mounted on host OS,
- the pod must be allowed to run with privileged capabilities (SYS_ADMIN, SYS_RAWIO) in the given namespace; in other words, Pod Security Standards must allow running at the privileged level,
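
The two host-side requirements in the list above (the "msr" module and the resctrl mount) can be checked before installing the chart. The following is only a minimal sketch: `check_pcm_host_prereqs` is a hypothetical helper that is not part of the chart, and its two optional file arguments exist only so that it can be exercised against sample files. It also assumes "msr" is built as a loadable module; a kernel with the driver built in would not list it in `/proc/modules`.

```shell
# Hypothetical helper (not shipped with the chart): checks two host-side requirements.
# Arguments default to the real procfs files; overriding them is meant for testing only.
check_pcm_host_prereqs() {
    modules_file="${1:-/proc/modules}"
    mounts_file="${2:-/proc/mounts}"
    status=0

    # /proc/modules lists one loaded module per line; the first field is the module name.
    if awk '$1 == "msr" { found = 1 } END { exit !found }' "$modules_file"; then
        echo "msr: loaded"
    else
        echo "msr: not loaded (try: modprobe msr)"
        status=1
    fi

    # /proc/mounts lists one mount per line; the second field is the mount point.
    if awk '$2 == "/sys/fs/resctrl" { found = 1 } END { exit !found }' "$mounts_file"; then
        echo "resctrl: mounted"
    else
        echo "resctrl: not mounted (try: mount -t resctrl resctrl /sys/fs/resctrl)"
        status=1
    fi

    return "$status"
}
```

Running `check_pcm_host_prereqs` with no arguments on a node prints one line per requirement and returns non-zero if either is missing. The remaining requirement, the Pod Security Standards level, is set through namespace labels such as: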

```
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/enforce-version: latest