Add Dockerfile with ROCm support #27

Closed

Conversation

@jakki-amd jakki-amd commented Nov 21, 2024

Description

Add Dockerfile with ROCm support

Fixes #4

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

  • Regression test
    FAILED test_handler.py::test_huggingface_bert_model_parallel_inference

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@jakki-amd jakki-amd marked this pull request as draft November 21, 2024 12:12
@smedegaard

Tested on an AMD node by building the image, starting a container, and running the regression tests inside it:

DOCKER_BUILDKIT=1 docker build --build-arg ROCM_VERSION=rocm62 --build-arg MACHINE_TYPE=gpu -t rocm-testing -f ./docker/Dockerfile.rocm .

docker run --device=/dev/kfd --device=/dev/dri -e JAVA_TOOL_OPTIONS="-Dlogging.level.root=DEBUG" -e HIP_VISIBLE_DEVICES=6 -e PYTHONLOG=DEBUG -it rocm-testing /bin/bash

python3 /serve/test/regression_tests.py

Result

┌─────────────────────────┬─────────────────────┬────────────────────┐
│                         │            executed │             failed │
├─────────────────────────┼─────────────────────┼────────────────────┤
│              iterations │                  11 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│                requests │                  11 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│            test-scripts │                  11 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│      prerequest-scripts │                   0 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│              assertions │                  11 │                  0 │
├─────────────────────────┴─────────────────────┴────────────────────┤
│ total run duration: 14s                                            │
├────────────────────────────────────────────────────────────────────┤
│ total data received: 1.7kB (approx)                                │
├────────────────────────────────────────────────────────────────────┤
│ average response time: 1252ms [min: 6ms, max: 10.6s, s.d.: 3s]     │
├────────────────────────────────────────────────────────────────────┤
│ average DNS lookup time: 852µs [min: 156µs, max: 1ms, s.d.: 616µs] │
├────────────────────────────────────────────────────────────────────┤
│ average first byte time: 1247ms [min: 3ms, max: 10.6s, s.d.: 3s]   │
└────────────────────────────────────────────────────────────────────┘
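
The build and test steps in the comment above could be wrapped in a small script along the following lines. This is a sketch only, not part of the PR: the variable names IMAGE_TAG, GPU_INDEX, and DRY_RUN and the function names are illustrative assumptions. It also assumes the regression suite can be started directly as the container command instead of from the interactive shell used above, and it leaves out the debug logging environment variables.

```shell
#!/bin/sh
# Sketch only: wraps the build and test commands from the comment above.
# IMAGE_TAG, GPU_INDEX and DRY_RUN are illustrative names, not part of the PR.
set -eu

ROCM_VERSION="${ROCM_VERSION:-rocm62}"
IMAGE_TAG="${IMAGE_TAG:-rocm-testing}"
GPU_INDEX="${GPU_INDEX:-6}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the ROCm image from docker/Dockerfile.rocm with BuildKit enabled.
build_image() {
  run env DOCKER_BUILDKIT=1 docker build \
    --build-arg "ROCM_VERSION=${ROCM_VERSION}" \
    --build-arg MACHINE_TYPE=gpu \
    -t "${IMAGE_TAG}" -f ./docker/Dockerfile.rocm .
}

# Assumption: run the regression suite as the container command rather than
# from an interactive shell (-it ... /bin/bash) as in the original comment.
run_regression() {
  run docker run --rm \
    --device=/dev/kfd --device=/dev/dri \
    -e "HIP_VISIBLE_DEVICES=${GPU_INDEX}" \
    "${IMAGE_TAG}" python3 /serve/test/regression_tests.py
}

build_image
run_regression
```

With the default DRY_RUN=1 the script only prints the two commands, which makes it easy to review them before running with DRY_RUN=0 on a node that exposes /dev/kfd and /dev/dri.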

@smedegaard smedegaard left a comment

@jakki-amd jakki-amd marked this pull request as ready for review November 21, 2024 13:13
@smedegaard smedegaard left a comment

update docs ❤️

docker/Dockerfile.rocm: four review threads, all outdated and resolved
@jakki-amd jakki-amd force-pushed the 4-docker-support-rocm branch from 0853fc1 to d9e142d Compare November 22, 2024 13:45
@jakki-amd jakki-amd self-assigned this Nov 22, 2024
@jakki-amd jakki-amd force-pushed the 4-docker-support-rocm branch from b592621 to 7f9df40 Compare November 26, 2024 12:04
@jakki-amd jakki-amd requested a review from smedegaard November 27, 2024 12:52
@jakki-amd jakki-amd force-pushed the 4-docker-support-rocm branch from fe27a74 to f9afc30 Compare November 27, 2024 13:29
ravi9 and others added 2 commits December 5, 2024 20:24
* Add AMD backend support

* Add AMD frontend support

* Add Dockerfile.rocm

Co-authored-by: Samu Tamminen <stammine@amd.com>

* Add AMD documentation

* Fix null pointer bug with populateAccelerators trying to get null AppleUtil GPU env value

* Fix formatting

---------

Co-authored-by: Rony Leppänen <rleppane@amd.com>
Co-authored-by: Anders Smedegaard Pedersen <asmedega@amd.com>
Co-authored-by: Samu Tamminen <stammine@amd.com>
@jakki-amd jakki-amd force-pushed the 4-docker-support-rocm branch 2 times, most recently from 251a6df to 8d78419 Compare December 20, 2024 13:27
jakki-amd and others added 3 commits December 20, 2024 19:13
* Add Apple system metrics support

Co-authored-by: Bipradip Chowdhury <bichowdh@amd.com>
Co-authored-by: Rony Leppänen <rleppane@amd.com>
Co-authored-by: Anders Smedegaard Pedersen <asmedega@amd.com>

* Fix ModelServerTest.testMetricManager for other HW vendors

* Add GPUUtilization as expect metric

---------

Co-authored-by: Bipradip Chowdhury <bichowdh@amd.com>
Co-authored-by: Rony Leppänen <rleppane@amd.com>
Co-authored-by: Anders Smedegaard Pedersen <asmedega@amd.com>
@jakki-amd jakki-amd force-pushed the 4-docker-support-rocm branch from d37ed5c to a580bca Compare January 2, 2025 15:24
@jakki-amd jakki-amd closed this Jan 7, 2025
5 participants