Figure out funding source/model for GPUs for Storage Optimization experiments #1387
Comments
Can we apply this to existing workloads, e.g. InstructLab or other similar projects? This would connect the experiments more tightly to real data as opposed to synthetic data, and would also decrease costs. For a more synthetic set of data, how much GPU time and how many GPUs are necessary for the experiments?
@knikolla ^^
Talked with Michael yesterday. Will come back to this issue with a clearer number within the next two weeks.
@msdisme 1 GPU for around 80 hours.
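To help turn the estimate above into a budget figure for the funding discussion, here is a minimal sketch of the arithmetic. The function names and the hourly rate are hypothetical placeholders, not actual prices or anything agreed in this thread; only the "1 GPU for around 80 hours" figure comes from the comment above.

```python
def gpu_hours(num_gpus: int, hours_per_gpu: float) -> float:
    """Total GPU-hours consumed when each GPU runs for hours_per_gpu."""
    return num_gpus * hours_per_gpu


def estimated_cost(num_gpus: int, hours_per_gpu: float, rate_per_gpu_hour: float) -> float:
    """Estimated cost: GPU-hours multiplied by a per-GPU-hour rate."""
    return gpu_hours(num_gpus, hours_per_gpu) * rate_per_gpu_hour


# Figures from the thread: 1 GPU for around 80 hours.
NUM_GPUS = 1
HOURS_PER_GPU = 80

# ASSUMPTION: placeholder rate only; substitute the real per-GPU-hour charge.
HYPOTHETICAL_RATE_PER_GPU_HOUR = 1.0

if __name__ == "__main__":
    total_hours = gpu_hours(NUM_GPUS, HOURS_PER_GPU)
    cost = estimated_cost(NUM_GPUS, HOURS_PER_GPU, HYPOTHETICAL_RATE_PER_GPU_HOUR)
    print(f"GPU-hours: {total_hours}, estimated cost at placeholder rate: {cost}")
```

Replacing the placeholder rate with the real charge for the chosen GPU type would give the number the funding source needs to cover.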
@msdisme responses inline
If they are available via OpenShift's production cluster, sure.
Actually necessary for them to be in OpenShift production cluster.
The V100s are in the production OpenShift cluster; I'm not sure how many are actually free to be used. We can check that later. I think the only A100s in the OpenShift cluster are the Lenovo kind.
@naved001 do you know what kind of drives these nodes have (that can be accessed through something like
@hakasapl do you know the drive types (or can you check from the iDRAC)? From within the OS it says the drive is behind the RAID controller, and I can't seem to install smartctl on the debug pod to get more information. These are the V100 nodes:
@naved001 in the spreadsheet I have them as 4 x 446 GiB SSDs. I don't think I have access to these since they are in OpenStack; I probably do through the VPN, but I'm not sure what their addresses are. I think Augestine did the install on these.
@hakasapl I can reach the iDRAC of wrk-88 (wrk-88-obm.nerc-ocp-prod.nerc.mghpcc.org); the other ones don't return an IP. But I don't know the password for this iDRAC.
Motivation
GPU resources used for storage optimization experiments aren't free and need someone to foot the bill.
Completion Criteria
Have clear agreement on who is paying for the GPU resources.
Description
Completion dates
Desired - 2024-09-25
Required - TBD