Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Graph] Support for Graph Node Profiling #12592

Closed
wants to merge 32 commits into from
Closed
Show file tree
Hide file tree
Changes from 29 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
c032cf9
initial commit (does not work)
mfrancepillois Jan 22, 2024
ae32544
Merge branch 'sycl' into maxime/profiling_v2
mfrancepillois Jan 22, 2024
f63acca
Add sycl event entry point for node profiling.
mfrancepillois Jan 25, 2024
c189890
Merge branch 'sycl' into maxime/profiling_v2
mfrancepillois Jan 25, 2024
53b76bb
fixes format
mfrancepillois Jan 25, 2024
421f303
Renames query function to `ext_oneapi_get_profiling_info()`
mfrancepillois Jan 26, 2024
99c7bf9
Code format improvement
mfrancepillois Jan 26, 2024
a7b839b
Updates comments
mfrancepillois Jan 29, 2024
ed0bc28
Updates comments + corrects typo
mfrancepillois Jan 29, 2024
fbbc822
Moves tests to Explicit directory
mfrancepillois Jan 31, 2024
7ff2327
Merge branch 'sycl' into maxime/profiling_v2
mfrancepillois Jan 31, 2024
96812b9
Updates UR branch
mfrancepillois Feb 1, 2024
f589d9b
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 1, 2024
262b44a
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 2, 2024
46bce9c
Updates function name
mfrancepillois Feb 2, 2024
30ab2fe
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 2, 2024
df99396
Corrects pi function name
mfrancepillois Feb 2, 2024
7d86fb2
Adds missing symbols
mfrancepillois Feb 2, 2024
190136b
Adds windows symbols + corrects a typo
mfrancepillois Feb 2, 2024
7d21aba
Updates spec + format
mfrancepillois Feb 5, 2024
19102ef
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 13, 2024
81efcc4
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 13, 2024
1edcd01
Add cpu native symbol
mfrancepillois Feb 14, 2024
f52745e
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Feb 14, 2024
78d9d22
Merge branch 'sycl' into maxime/profiling_v2
mfrancepillois Feb 21, 2024
69d8dbb
Improve getSyncPointFromNode function to search for original and dupl…
mfrancepillois Feb 21, 2024
74c5295
Merge branch 'sycl-upstream' into maxime/profiling_v2
mfrancepillois Mar 6, 2024
2af6f28
Update new Tests
mfrancepillois Mar 6, 2024
7b8f9bb
Typos
mfrancepillois Mar 11, 2024
d58e1a5
more typos
mfrancepillois Mar 11, 2024
f453a09
Merge branch 'sycl' into maxime/profiling_v2
mfrancepillois Mar 11, 2024
1c6aa95
move tests to Profiling directory
mfrancepillois Mar 11, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 42 additions & 2 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -1200,6 +1200,44 @@ Exceptions:

|===

==== New Event Member Functions

Table {counter: tableNumber}. Additional member functions of the `sycl::event` class.
[cols="2a,a"]
|===
|Member function|Description

|
[source,c++]
----
template <typename Param>
typename Param::return_type
event::ext_oneapi_get_profiling_info(node Node) const;
----

| Queries the profiling information of a SYCL Graph node for the graph
execution associated with this SYCL event. If the requested info is
not available when this member function is called due to incompletion of
graph execution associated with the event, then the call to this member
function will block until the requested info is available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the following aspect as a different thread:
We also need to consider kernel/node fusion as an optimization strategy. In this case we might not have the exact profiling info.
@sommerlukas what are your thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two possible ways to handle this:

  • Return the same profiling info for all nodes that were fused. So if you use this new API with two different nodes, you would get the same information and that information would correspond to the profiling information of the fused kernel.
  • Disallow to query for profiling information if fusion was applied.

I believe that profiling info for the fused kernel would be valuable for the user, so I would prefer the first option here.

On some backends, it might even be possible to get profiling information for individual parts of the fused kernel corresponding to the original nodes, but I don't think this would be portable and implementation could quickly get very tricky, so I would prefer to not include that in the extension proposal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback.
I agree that it might be interesting for users to get node profiling information when using the kernel fusion feature.
But the proposed profiling API is based on the level-zero function zeCommandListAppendQueryKernelTimestamps (https://spec.oneapi.io/level-zero/latest/core/api.html#zecommandlistappendquerykerneltimestamps), which in turn relies on the ZeEvents that were linked to the kernels when they were enqueued to their command-list. So I don't know what happens to this ZeEvents when the kernels are fused.
Could you tell me where I can find more information on the implementation of the kernel fusion feature?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kernel fusion happens in the SYCL RT, even before commands are submitted to the underlying backend.

So, in this case, fusion would happen before anything is passed to the LevelZero backend. Instead, a single command for the execution of the fused kernel would be passed to the LevelZero backend and inserted into the command-list.

Obtaining profiling information for the execution of the fused kernel could use the same API and the zeEvent returned for the command corresponding to the fused kernel execution.

There is a design document with more information here: https://github.com/intel/llvm/blob/sycl/sycl/doc/design/KernelFusionJIT.md

In the current implementation, is the backend (ZE) command list only created upon call to command_graph<...>::finalize or already before that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command-buffers that instantiate the command-list for LevelZero backend are only created during the command_graph<...>::finalize process.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in that case it would be a viable implementation option to first perform fusion and only then create the command buffer inside command_graph<...>::finalize. The command-buffer would only contain the fused kernel command (and memory commands if necessary) and it would be possible to return a zeEvent that could be used for profiling of the fused kernel execution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that profiling info for the fused kernel would be valuable for the user, so I would prefer the first option here.

I agree that profiling information is useful for a fused graph as a whole, but I don't see how it's useful to attempt to provide this profiling information on a per-node basis. How can the application know what the profiling information means? Won't it depend on the implementation's ability to fuse each node?

Let's consider that an application calls ext_oneapi_get_profiling_info on a node in a fused graph. The returned information might tell the time it took to execute that one node (if the node could not be fused), it might tell the time it took to execute several nodes in the graph (if the node was fused with some of its neighbors), or it might tell the time it took to execute the entire graph (if all the nodes in the graph were fused). With so much uncertainty, I don't see why this is a useful feature.

I think it does make sense to let the application get profiling information for the entire graph, but the application can do this already using the event that is returned from queue::ext_oneapi_graph.

I think ext_oneapi_get_profiling_info should simply fail if the graph was created with fusion enabled. Per-node profiling information is not available for a fused graph.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may also be worth noting that the SYCL 2020 wording is phrased in terms of "command groups".

Getting the profiling information for the command group in which the graph was submitted, where graph execution is the "action", is consistent with the specification wording, and is always well defined. Getting the profiling information for nodes in the graph seems less consistent to me: the mapping of nodes to command groups and "actions" is unclear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first read this, I was confused about which event the application would use when calling this API because there are two relevant events. If the graph is created by recording a queue, each recorded node has an event. You also get an event when you submit the graph to a queue via queue::ext_oneapi_graph.

It took me a while to figure out that applications are expected to use ext_oneapi_get_profiling_info on the second event (the one returned from queue::ext_oneapi_graph). I think it would be helpful to make this more clear somehow in the description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EwanC asked in #12838 how ext_oneapi_get_profiling_info should relate to the new submit_profiling_tag API.

I think the graph extension could add a new member function to command_graph:

node add_profiling_tag();

This function would fail unless the graph's device had the aspect ext_oneapi_queue_profiling_tag. Once the application uses this API to add a profiling tag to the graph, it can get the profiling information later via ext_oneapi_get_profiling_info.


Parameters:

* `Node` - Node object for which the profiling information is being queried.

Exceptions:

* Throws synchronously with error code `invalid` if this SYCL event is not
associated with a graph execution.
* Throws synchronously with error code `invalid` if the queue on which
the graph was submitted was not constructed with
the `property::queue::enable_profiling` property.
* Throws synchronously with error code `invalid` if `Node` is not associated
with the graph exectution represented by this event.

|===


=== Thread Safety

The new functions in this extension are thread-safe, the same as member
Expand Down Expand Up @@ -1899,8 +1937,10 @@ if used in application code.

. Using reductions in a graph node.
. Using sycl streams in a graph node.
. Profiling an event returned from graph submission with
`event::get_profiling_info()`.
. Profiling information is not available for graphs that contain host-task nodes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This restriction seems like another good reason not to expose per-node profiling information. I don't understand why the presence of a host task would impact the runtime's ability to reason about a different device kernel, and I think many users would struggle to understand this too.

It's also unclear to me whether the presence of a host task disables all profiling. What is the intent here? Is it still possible to use the event returned by the graph submission to reason about how long the entire graph took to execute?

. Profiling a node from an event returned from graph submission with
`event::ext_oneapi_get_profiling_info(ext::node)` is only available for
the level-zero backend.
Comment on lines +1923 to +1925
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think this is surprising. Removing the per-node profiling completely would fix this.

. Synchronization between multiple executions of the same command-buffer
must be handled in the host for level-zero backend, which may involve
extra latency for subsequent submissions.
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/detail/pi.def
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,7 @@ _PI_API(piextCommandBufferFillUSM)
_PI_API(piextCommandBufferPrefetchUSM)
_PI_API(piextCommandBufferAdviseUSM)
_PI_API(piextEnqueueCommandBuffer)
_PI_API(piextSyncPointGetProfilingInfo)

_PI_API(piextUSMPitchedAlloc)

Expand Down
21 changes: 20 additions & 1 deletion sycl/include/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -154,9 +154,10 @@
// 15.44 Add coarse-grain memory advice flag for HIP.
// 15.45 Added piextKernelSuggestMaxCooperativeGroupCount and
// piextEnqueueCooperativeKernelLaunch.
// 15.46 Added piextSyncPointGetProfilingInfo

#define _PI_H_VERSION_MAJOR 15
#define _PI_H_VERSION_MINOR 45
#define _PI_H_VERSION_MINOR 46

#define _PI_STRING_HELPER(a) #a
#define _PI_CONCAT(a, b) _PI_STRING_HELPER(a.b)
Expand Down Expand Up @@ -2601,6 +2602,24 @@ piextEnqueueCommandBuffer(pi_ext_command_buffer command_buffer, pi_queue queue,
pi_uint32 num_events_in_wait_list,
const pi_event *event_wait_list, pi_event *event);

/// API to get the profiling information of a graph node.
/// A Node is identified by a sync-point in a command-buffer.
/// The sync-point passed in parameter corresponds therefore to the node from
/// which we want to get the profiling information. returns an error if the node
/// is found.
/// \param event PI event that has been returned from the command-buffer
/// submission.
/// \param sync_point The sync-point corresponding to the node from which
/// we want to get the profiling information.
/// \param param_name The name of the profiling property to query depends on.
/// \param param_value_size Size in bytes of the profiling property value.
/// \param param_value Value of the profiling property.
/// \param param_value_size_ret Pointer to the actual size in bytes returned
/// in param_value of the profiling property.
__SYCL_EXPORT pi_result piextSyncPointGetProfilingInfo(
pi_event event, pi_ext_sync_point sync_point, pi_profiling_info param_name,
size_t param_value_size, void *param_value, size_t *param_value_size_ret);

/// API to destroy bindless unsampled image handles.
///
/// \param context is the pi_context
Expand Down
20 changes: 20 additions & 0 deletions sycl/include/sycl/event.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
#include <sycl/detail/info_desc_helpers.hpp> // for is_event_info_desc, is_...
#include <sycl/detail/owner_less_base.hpp> // for OwnerLessBase
#include <sycl/detail/pi.h> // for pi_native_handle
#include <sycl/ext/oneapi/experimental/graph.hpp>

#ifdef __SYCL_INTERNAL_API
#include <sycl/detail/cl.h>
Expand Down Expand Up @@ -130,6 +131,25 @@ class __SYCL_EXPORT event : public detail::OwnerLessBase<event> {
typename detail::is_event_profiling_info_desc<Param>::return_type
get_profiling_info() const;

/// Queries the profiling information of a SYCL Graph node for the graph
/// execution associated with this SYCL event.
///
/// If this SYCL event is not associated with a graph execution, an
/// invalid_object_error SYCL exception is thrown. If the requested info is
/// not available when this member function is called due to incompletion of
/// command groups associated with the event, then the call to this member
/// function will block until the requested info is available. If the queue
/// which submitted the command group this event is associated with was not
/// constructed with the property::queue::enable_profiling property, an
/// invalid_object_error SYCL exception is thrown.
///
/// \param Node Node object for which the profiling information
/// is being queried.
/// \return depends on template parameter.
template <typename Param>
typename detail::is_event_profiling_info_desc<Param>::return_type
ext_oneapi_get_profiling_info(ext::oneapi::experimental::node Node) const;

/// Returns the backend associated with this platform.
///
/// \return the backend associated with this platform
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
__SYCL_PARAM_TRAITS_SPEC_PARAMT(event_profiling, command_submit, ext::oneapi::experimental::node, uint64_t, PI_PROFILING_INFO_COMMAND_SUBMIT)
__SYCL_PARAM_TRAITS_SPEC_PARAMT(event_profiling, command_start, ext::oneapi::experimental::node, uint64_t, PI_PROFILING_INFO_COMMAND_START)
__SYCL_PARAM_TRAITS_SPEC_PARAMT(event_profiling, command_end, ext::oneapi::experimental::node, uint64_t, PI_PROFILING_INFO_COMMAND_END)
9 changes: 9 additions & 0 deletions sycl/include/sycl/info/info_desc.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,11 @@ namespace info {
struct Desc { \
using return_type = ReturnT; \
};
#define __SYCL_PARAM_TRAITS_SPEC_PARAMT(DescType, Desc, ParamType, ReturnT, \
PiCode) \
struct Desc { \
using return_type = ReturnT; \
};
// A.1 Platform information desctiptors
namespace platform {
// TODO Despite giving this deprecation warning, we're still yet to implement
Expand Down Expand Up @@ -155,7 +160,11 @@ namespace event {
namespace event_profiling {
#include <sycl/info/event_profiling_traits.def>
} // namespace event_profiling
namespace ext_oneapi_event_profiling {
#include <sycl/info/ext_oneapi_graph_node_profiling_traits.def>
} // namespace ext_oneapi_event_profiling
#undef __SYCL_PARAM_TRAITS_SPEC
#undef __SYCL_PARAM_TRAITS_SPEC_PARAMT

// Provide an alias to the return type for each of the info parameters
template <typename T, T param> class param_traits {};
Expand Down
8 changes: 8 additions & 0 deletions sycl/plugins/cuda/pi_cuda.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1208,6 +1208,14 @@ pi_result piextEnqueueCommandBuffer(pi_ext_command_buffer CommandBuffer,
CommandBuffer, Queue, NumEventsInWaitList, EventWaitList, Event);
}

pi_result piextSyncPointGetProfilingInfo(
pi_event Event, pi_ext_sync_point SyncPoint, pi_profiling_info ParamName,
size_t ParamValueSize, void *ParamValue, size_t *ParamValueSizeRet) {
return pi2ur::piextSyncPointGetProfilingInfo(Event, SyncPoint, ParamName,
ParamValueSize, ParamValue,
ParamValueSizeRet);
}

pi_result piextPluginGetOpaqueData(void *opaque_data_param,
void **opaque_data_return) {
return pi2ur::piextPluginGetOpaqueData(opaque_data_param, opaque_data_return);
Expand Down
8 changes: 8 additions & 0 deletions sycl/plugins/hip/pi_hip.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1211,6 +1211,14 @@ pi_result piextEnqueueCommandBuffer(pi_ext_command_buffer CommandBuffer,
CommandBuffer, Queue, NumEventsInWaitList, EventWaitList, Event);
}

pi_result piextSyncPointGetProfilingInfo(
pi_event event, pi_ext_sync_point sync_point, pi_profiling_info param_name,
size_t param_value_size, void *param_value, size_t *param_value_size_ret) {
return pi2ur::piextSyncPointGetProfilingInfo(event, sync_point, param_name,
param_value_size, param_value,
param_value_size_ret);
}

pi_result piextPluginGetOpaqueData(void *opaque_data_param,
void **opaque_data_return) {
return pi2ur::piextPluginGetOpaqueData(opaque_data_param, opaque_data_return);
Expand Down
8 changes: 8 additions & 0 deletions sycl/plugins/level_zero/pi_level_zero.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1369,6 +1369,14 @@ pi_result piextEnqueueCommandBuffer(pi_ext_command_buffer CommandBuffer,
CommandBuffer, Queue, NumEventsInWaitList, EventWaitList, Event);
}

pi_result piextSyncPointGetProfilingInfo(
pi_event Event, pi_ext_sync_point SyncPoint, pi_profiling_info ParamName,
size_t ParamValueSize, void *ParamValue, size_t *ParamValueSizeRet) {
return pi2ur::piextSyncPointGetProfilingInfo(Event, SyncPoint, ParamName,
ParamValueSize, ParamValue,
ParamValueSizeRet);
}

const char SupportedVersion[] = _PI_LEVEL_ZERO_PLUGIN_VERSION_STRING;

pi_result piPluginInit(pi_plugin *PluginInit) { // missing
Expand Down
8 changes: 8 additions & 0 deletions sycl/plugins/native_cpu/pi_native_cpu.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1177,6 +1177,14 @@ pi_result piextEnqueueCommandBuffer(pi_ext_command_buffer CommandBuffer,
CommandBuffer, Queue, NumEventsInWaitList, EventWaitList, Event);
}

pi_result piextSyncPointGetProfilingInfo(
pi_event Event, pi_ext_sync_point SyncPoint, pi_profiling_info ParamName,
size_t ParamValueSize, void *ParamValue, size_t *ParamValueSizeRet) {
return pi2ur::piextSyncPointGetProfilingInfo(Event, SyncPoint, ParamName,
ParamValueSize, ParamValue,
ParamValueSizeRet);
}

pi_result piextPluginGetOpaqueData(void *opaque_data_param,
void **opaque_data_return) {
return pi2ur::piextPluginGetOpaqueData(opaque_data_param, opaque_data_return);
Expand Down
8 changes: 8 additions & 0 deletions sycl/plugins/opencl/pi_opencl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1147,6 +1147,14 @@ pi_result piextEnqueueCommandBuffer(pi_ext_command_buffer CommandBuffer,
CommandBuffer, Queue, NumEventsInWaitList, EventWaitList, Event);
}

pi_result piextSyncPointGetProfilingInfo(
pi_event Event, pi_ext_sync_point SyncPoint, pi_profiling_info ParamName,
size_t ParamValueSize, void *ParamValue, size_t *ParamValueSizeRet) {
return pi2ur::piextSyncPointGetProfilingInfo(Event, SyncPoint, ParamName,
ParamValueSize, ParamValue,
ParamValueSizeRet);
}

pi_result piextPluginGetOpaqueData(void *opaque_data_param,
void **opaque_data_return) {
return pi2ur::piextPluginGetOpaqueData(opaque_data_param, opaque_data_return);
Expand Down
11 changes: 2 additions & 9 deletions sycl/plugins/unified_runtime/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -56,15 +56,8 @@ endif()
if(SYCL_PI_UR_USE_FETCH_CONTENT)
include(FetchContent)

set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")

# commit a2757b2931daa2f8d7c9dd51b0fc846be1fd49a7
# Merge: 9b936b5 + f78d369
# Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
# Date: Tue Feb 27 11:34:58 2024 +0000
# Merge pull request #1254 from Bensuo/cmdbuf-support-hip
# [EXP][CMDBUF] HIP adapter support for command buffers
set(UNIFIED_RUNTIME_TAG a2757b2931daa2f8d7c9dd51b0fc846be1fd49a7 )
set(UNIFIED_RUNTIME_REPO "https://github.com/bensuo/unified-runtime.git")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to set this back to the original before merging.

set(UNIFIED_RUNTIME_TAG cmdbuf-profiling-sync-point)

if(SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO)
set(UNIFIED_RUNTIME_REPO "${SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO}")
Expand Down
37 changes: 37 additions & 0 deletions sycl/plugins/unified_runtime/pi2ur.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -4235,6 +4235,43 @@ inline pi_result piEventGetProfilingInfo(pi_event Event,
return PI_SUCCESS;
}

inline pi_result piextSyncPointGetProfilingInfo(
pi_event Event, pi_ext_sync_point SyncPoint, pi_profiling_info ParamName,
size_t ParamValueSize, void *ParamValue, size_t *ParamValueSizeRet) {

PI_ASSERT(Event, PI_ERROR_INVALID_EVENT);

ur_event_handle_t UREvent = reinterpret_cast<ur_event_handle_t>(Event);

ur_profiling_info_t PropName{};
switch (ParamName) {
case PI_PROFILING_INFO_COMMAND_QUEUED: {
PropName = UR_PROFILING_INFO_COMMAND_QUEUED;
break;
}
case PI_PROFILING_INFO_COMMAND_SUBMIT: {
PropName = UR_PROFILING_INFO_COMMAND_SUBMIT;
break;
}
case PI_PROFILING_INFO_COMMAND_START: {
PropName = UR_PROFILING_INFO_COMMAND_START;
break;
}
case PI_PROFILING_INFO_COMMAND_END: {
PropName = UR_PROFILING_INFO_COMMAND_END;
break;
}
default:
return PI_ERROR_INVALID_PROPERTY;
}

HANDLE_ERRORS(urEventGetSyncPointProfilingInfoExp(
UREvent, SyncPoint, PropName, ParamValueSize, ParamValue,
ParamValueSizeRet));

return PI_SUCCESS;
}

inline pi_result piEventCreate(pi_context Context, pi_event *RetEvent) {

ur_context_handle_t UrContext =
Expand Down
Loading