Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][E2E] Fix Plugin/level_zero_usm_residency.cpp on Multi-card PVC #12844

Merged
merged 1 commit into from
Mar 4, 2024

Conversation

uditagarwal97
Copy link
Contributor

Problem:
Plugin/level_zero_usm_residency.cpp fails on Multi-card PVC due to unexpected number of calls to zeContextMakeMemoryResident. The test was expecting just one call to zeContextMakeMemoryResident upon piextUSMDeviceAlloc. However,
// RUN: env SYCL_PI_TRACE=-1 UR_L0_DEBUG=-1 %{l0_leak_check} %{run} %t.out 2>&1
produces two calls to zeContextMakeMemoryResident.

---> piextUSMDeviceAlloc(
<unknown> : 0x7ffc41c1d888
<unknown> : 0x3086680
<unknown> : 0x3084230
<unknown> : 0x7ffc41c1d8b0
<unknown> : 4
<unknown> : 4
ZE ---> zeMemAllocDevice(Context->ZeContext, &ZeDesc, Size, Alignment, Device->ZeDevice, ResultPtr)
ZE ---> zeDeviceCanAccessPeer(D->ZeDevice, Device->ZeDevice, &P2P)
ZE ---> zeContextMakeMemoryResident(Context->ZeContext, D->ZeDevice, Ptr, Size)
ZE ---> zeContextMakeMemoryResident(Context->ZeContext, D->ZeDevice, Ptr, Size)
) --->  pi_result : PI_SUCCESS
        [out]<unknown> ** : 0x7ffc41c1d888[ 0xff00fffffffe0000 ... ]

Solution:
The default value of SYCL_PI_LEVEL_ZERO_USM_RESIDENT is 0x002, which makes memory residence on all devices, which IIUC, means on all tiles of a multi-card PVC. That's why we see two calls to zeContextMakeMemoryResident, one for each tile of a multi-tile PVC.

This doesn't look like a bug in the code, but in the test instead. This PR updates the test case to also accept more than one calls to zeContextMakeMemoryResident for device allocations.

@uditagarwal97
Copy link
Contributor Author

@nrspruit a gentle reminder.

Copy link
Contributor

@nrspruit nrspruit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, LGTM, thank you for the change!

@uditagarwal97
Copy link
Contributor Author

The following failure in pre-commit CI is unrelated and reported in #12896

********************
Failed Tests (1):
  SYCL :: ESIMD/dword_local_accessor_atomic_smoke.cpp

@uditagarwal97
Copy link
Contributor Author

@intel/llvm-gatekeepers the PR is ready!

@againull againull merged commit ed5ff8b into sycl Mar 4, 2024
12 of 13 checks passed
@bader bader deleted the uditagarwal97-patch-2 branch March 5, 2024 00:18
againull pushed a commit that referenced this pull request Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants