Open
Description
If we don't add code objects on demand and instead let KernelDB crawl the process, we end up with the following error:
Traceback (most recent call last):
  File "/work1/amd/muhaawad/git/amd/audacious/maestro/examples/python/add.py", line 13, in <module>
    A = torch.randn(N, device=device)
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: std::bad_alloc
Code
#!/usr/bin/env python3
import torch
# Ensure we're using ROCm and the GPU
assert torch.version.hip is not None, "This script requires ROCm."
device = torch.device("cuda")
# Define vector size
N = 1024
# Initialize vectors A and B
A = torch.randn(N, device=device)
B = torch.randn(N, device=device)
# Perform vector addition: C = A + B
C = A + B
# Optional: verify on CPU
A_cpu = A.cpu()
B_cpu = B.cpu()
C_ref = A_cpu + B_cpu
assert torch.allclose(C.cpu(), C_ref, atol=1e-5)
print("Vector addition completed successfully on ROCm GPU.")
Log
[INFO]: [src/nexus.cpp:81] NEXUS_PIPE_NAME is not set. Set it to communicate with driver script.
Adding /usr/bin/python3.10
Adding linux-vdso.so.1
Adding /lib/x86_64-linux-gnu/libm.so.6
Adding /lib/x86_64-linux-gnu/libexpat.so.1
Adding /lib/x86_64-linux-gnu/libz.so.1
Adding /lib/x86_64-linux-gnu/libc.so.6
Adding /lib64/ld-linux-x86-64.so.2
Adding /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so
Adding /lib/x86_64-linux-gnu/libffi.so.8
Adding /usr/lib/python3.10/lib-dynload/_opcode.cpython-310-x86_64-linux-gnu.so
Adding /usr/lib/python3.10/lib-dynload/_bz2.cpython-310-x86_64-linux-gnu.so
Adding /lib/x86_64-linux-gnu/libbz2.so.1.0
Adding /usr/lib/python3.10/lib-dynload/_lzma.cpython-310-x86_64-linux-gnu.so
Adding /lib/x86_64-linux-gnu/liblzma.so.5
Adding /usr/lib/python3.10/lib-dynload/_json.cpython-310-x86_64-linux-gnu.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_global_deps.so
Adding /lib/x86_64-linux-gnu/libpthread.so.0
Adding /lib/x86_64-linux-gnu/libdl.so.2
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libgomp.so
Adding /usr/local/lib/python3.10/dist-packages/torch/_C.cpython-310-x86_64-linux-gnu.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libshm.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libroctx64.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_cpu.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_hip.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libc10_hip.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libMIOpen.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipblaslt.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libc10.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libamdhip64.so
Adding /lib/x86_64-linux-gnu/libstdc++.so.6
Adding /lib/x86_64-linux-gnu/libgcc_s.so.1
Adding /lib/x86_64-linux-gnu/librt.so.1
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libroctracer64.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhiprtc.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipblas.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipfft.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhiprand.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipsparse.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipsolver.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libaotriton_v2.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librccl.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libmagma.so
Adding /lib/x86_64-linux-gnu/libzstd.so.1
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libamd_comgr.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocm-core.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocblas.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libnuma.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocprofiler-register.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhsa-runtime64.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocsolver.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocfft.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocrand.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocsparse.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
You're adding kernel "init_kernel()" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libsuitesparseconfig.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libcholmod.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocm_smi64.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libtinfo.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libelf.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libdrm.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libdrm_amdgpu.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libsatlas.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libamd.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libcamd.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libcolamd.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libccolamd.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libgfortran.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libquadmath.so
Adding /usr/local/lib/python3.10/dist-packages/numpy/_core/_multiarray_umath.cpython-310-x86_64-linux-gnu.so
Adding /usr/local/lib/python3.10/dist-packages/numpy/_core/../../numpy.libs/libscipy_openblas64_-6bb31eeb.so
Adding /usr/local/lib/python3.10/dist-packages/numpy/_core/../../numpy.libs/libgfortran-040039e1-0352e75f.so.5.0.0
Adding /usr/local/lib/python3.10/dist-packages/numpy/_core/../../numpy.libs/libquadmath-96973f99-934c22de.so.0.0.0
Adding /usr/lib/python3.10/lib-dynload/_contextvars.cpython-310-x86_64-linux-gnu.so
Adding /usr/local/lib/python3.10/dist-packages/numpy/linalg/_umath_linalg.cpython-310-x86_64-linux-gnu.so
Adding /usr/lib/python3.10/lib-dynload/mmap.cpython-310-x86_64-linux-gnu.so
Adding /usr/local/lib/python3.10/dist-packages/amdsmi/libamd_smi.so
Adding /usr/lib/python3.10/lib-dynload/_ssl.cpython-310-x86_64-linux-gnu.so
Adding /lib/x86_64-linux-gnu/libssl.so.3
Adding /lib/x86_64-linux-gnu/libcrypto.so.3
Adding /usr/lib/python3.10/lib-dynload/_asyncio.cpython-310-x86_64-linux-gnu.so
Adding /usr/lib/python3.10/lib-dynload/_queue.cpython-310-x86_64-linux-gnu.so
Adding /usr/lib/python3.10/lib-dynload/_hashlib.cpython-310-x86_64-linux-gnu.so
Adding /usr/lib/python3.10/lib-dynload/_uuid.cpython-310-x86_64-linux-gnu.so
Adding /lib/x86_64-linux-gnu/libuuid.so.1
Adding /usr/lib/python3.10/lib-dynload/_multiprocessing.cpython-310-x86_64-linux-gnu.so
Adding /work1/amd/muhaawad/git/amd/audacious/maestro/external/nexus/build/lib/libnexus.so
Adding /work1/amd/muhaawad/git/amd/audacious/maestro/external/nexus/build/_deps/kerneldb-build/libkernelDB64.so.1
[INFO]: [src/nexus.cpp:101] Found 2599 kernels
[INFO]: [src/nexus.cpp:107] Kernel: .text
[INFO]: [src/nexus.cpp:108] Number of lines: 0
[INFO]: [src/nexus.cpp:107] Kernel: BytePack<4> Apply_Reduce<FuncMinMax<rccl_bfloat8>, 4>::reduce<4>(FuncMinMax<rccl_bfloat8>, BytePack<4>, BytePack<4>)
[INFO]: [src/nexus.cpp:108] Number of lines: 2
[INFO]: [src/nexus.cpp:113] hipify/src/device/reduce_kernel.h:149 -> buffer_store_dword v11, off, s[0:3], s32 // 0000002758C4: E0700000 20000B00
[INFO]: [src/nexus.cpp:113] hipify/src/device/reduce_kernel.h:152 -> buffer_load_dword v11, off, s[0:3], s32 // 000000277118: E0500000 20000B00
Ending kernelDB
Found 2599 kernels.
Traceback (most recent call last):
  File "/work1/amd/muhaawad/git/amd/audacious/maestro/examples/python/add.py", line 13, in <module>
    A = torch.randn(N, device=device)
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: std::bad_alloc
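To make a log like the one above easier to triage, here is a minimal sketch of a helper that counts the loaded objects and groups the "seen before" warnings. It is not part of nexus or KernelDB; the names `summarize_log` and `SAMPLE_LOG` are hypothetical. It assumes that each duplicate-kernel warning belongs to the most recent `Adding` line printed before it, which the log suggests but does not confirm.

```python
import re
from collections import Counter

# Line shapes copied from the log in this issue.
ADDING = re.compile(r"^Adding (?P<path>\S+)$")
DUPLICATE = re.compile(
    r'''^You're adding kernel "(?P<kernel>.+)" which we've seen before\.'''
)

# A short excerpt of the log above, used only as sample input.
SAMPLE_LOG = """\
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libMIOpen.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/libhipblaslt.so
Adding /usr/local/lib/python3.10/dist-packages/torch/lib/librocsparse.so
You're adding kernel ".text" which we've seen before. Something may be wrong.
You're adding kernel "init_kernel()" which we've seen before. Something may be wrong.
"""


def summarize_log(lines):
    """Return (libraries, duplicate counts, duplicates grouped by library).

    Assumption: a warning is attributed to the last 'Adding' line seen
    before it. Warnings that appear before any 'Adding' line are grouped
    under the key None.
    """
    libraries = []            # every path from an 'Adding' line, in order
    duplicates = Counter()    # kernel name -> number of warnings
    by_library = {}           # library path -> list of duplicated kernel names
    current = None
    for raw in lines:
        line = raw.strip()
        match = ADDING.match(line)
        if match:
            current = match.group("path")
            libraries.append(current)
            continue
        match = DUPLICATE.match(line)
        if match:
            kernel = match.group("kernel")
            duplicates[kernel] += 1
            by_library.setdefault(current, []).append(kernel)
    return libraries, duplicates, by_library


if __name__ == "__main__":
    libs, dups, grouped = summarize_log(SAMPLE_LOG.splitlines())
    print(f"{len(libs)} objects added, {sum(dups.values())} duplicate warnings")
    for library, kernels in grouped.items():
        print(f"{library}: {', '.join(kernels)}")
```

Running it over the full log would show which libraries repeat the `.text` symbol, which may help decide whether the crawl should skip or deduplicate those entries before `torch._C._cuda_init()` is reached.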