Add iota kernel #1946
Reuse that function call in sorting code-base where argsort is used.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_336 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_337 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully.
Force-pushed from 29d7198 to f1b2045.
Force-pushed from f1b2045 to 51ead2b.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully.
Until it is passed over to the host function, and unique_ptr's ownership is released.

Also reduced allocation sizes, where too much was being allocated.

Introduce smart_malloc_device, etc.

The smart_malloc_device<T>(count, q) makes a USM allocation and returns a unique_ptr<T, USMDeleter> which owns the allocation. The function throws an exception (std::runtime_error) if the USM allocation is not successful.

Introduce async_smart_free.

This function is intended to replace the use of host_task submissions to manage deallocation of USM temporaries.

The usage is as follows:

```
// returns unique_ptr
auto alloc_owner = smart_malloc_device<T>(count, q);

// get raw pointer for use in kernels
T *data = alloc_owner.get();

[..SNIP..]

// submit host_task that releases the unique_ptr
// after the host task was successfully submitted
// and ownership of USM allocation is transferred to
// the said host task
sycl::event ht_ev = async_smart_free(q, dependent_events, alloc_owner);

[...SNIP...]
```
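The ownership hand-off described in this commit message can be illustrated with a host-only analogy. This is a sketch under stated assumptions, not the dpctl implementation: `HostDeleter`, `smart_malloc_host_sketch`, `TaskQueue` and `deferred_free_sketch` are hypothetical names, `std::malloc`/`std::free` stand in for USM allocation, and a vector of callables stands in for a `host_task` submitted to a SYCL queue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for USMDeleter: frees host memory instead of USM.
struct HostDeleter {
    void operator()(void *p) const { std::free(p); }
};

// Hypothetical analog of smart_malloc_device<T>(count, q):
// returns an owning unique_ptr, throws std::runtime_error on failure.
template <typename T>
std::unique_ptr<T, HostDeleter> smart_malloc_host_sketch(std::size_t count)
{
    T *p = static_cast<T *>(std::malloc(count * sizeof(T)));
    if (p == nullptr) {
        throw std::runtime_error("allocation failed");
    }
    return std::unique_ptr<T, HostDeleter>(p);
}

// Deferred callables stand in for host_task submissions (assumption).
using TaskQueue = std::vector<std::function<void()>>;

// Hypothetical analog of async_smart_free: the owner gives up the
// allocation only after the deferred task has been queued successfully.
template <typename T>
void deferred_free_sketch(TaskQueue &tasks,
                          std::unique_ptr<T, HostDeleter> &owner)
{
    T *raw = owner.get();
    // If push_back throws, `owner` still owns the memory, so nothing leaks.
    tasks.push_back([raw]() { HostDeleter{}(raw); });
    // Submission succeeded: ownership now belongs to the queued task.
    owner.release();
}
```

If queuing the task throws, the `unique_ptr` still owns the allocation and frees it during stack unwinding, which is the leak-avoidance property the commit message is after.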
Force-pushed from bbb55f1 to da3fbcc.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully.
Replaced three duplicates of the same kernel with calls to this function.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully.
Factored out map_back_impl, which projects indexing from a flat index to a row-wise index. Removed dead code excluded by a preprocessor conditional.
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully.
Ping @AlexanderKalistratov
Replaced it with a hand-written implementation of ceil_log2(n), such that n <= (decltype(n){1} << ceil_log2(n)) is true for all positive values of `n` in the range.
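A minimal sketch of a function with the stated property is shown below. It is an illustration only and may differ from the code in the PR; it assumes an unsigned integer type and a positive `n` small enough that the shift does not overflow the type.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Sketch (not the PR's code): smallest k with n <= (T{1} << k).
// Assumes n > 0 and that the result is less than the bit width of T.
template <typename T>
constexpr int ceil_log2(T n)
{
    static_assert(std::is_unsigned<T>::value,
                  "this sketch assumes an unsigned integer type");
    int k = 0;
    T m = n - 1; // the largest value that has to fit into k bits
    while (m > 0) {
        m >>= 1;
        ++k;
    }
    return k;
}
```

For example, `ceil_log2(std::size_t{5})` evaluates to 3, since 5 <= 8 but 5 > 4.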
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_342 ran successfully.
Add check of computed against expected indices
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_344 ran successfully.
One asserts that at least one unique pointer is specified. Another that specified arguments are unique pointers with USMDeleter.
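The two checks could look roughly like the following, assuming they are compile-time `static_assert`s over a variadic argument pack. The `USMDeleter` here is an empty placeholder, and `is_usm_unique_ptr` and `check_usm_unique_ptrs` are hypothetical names introduced only for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Placeholder for the deleter type named in the PR (assumption: the real
// one frees USM memory; this one intentionally does nothing).
struct USMDeleter {
    template <typename T> void operator()(T *) const {}
};

// Hypothetical trait: true only for std::unique_ptr<T, USMDeleter>.
template <typename T> struct is_usm_unique_ptr : std::false_type {};
template <typename T>
struct is_usm_unique_ptr<std::unique_ptr<T, USMDeleter>> : std::true_type {};

// Hypothetical checker mirroring the two assertions described above.
template <typename... UniquePtrTs>
constexpr bool check_usm_unique_ptrs()
{
    // First assertion: at least one unique pointer is specified.
    static_assert(sizeof...(UniquePtrTs) > 0,
                  "at least one unique_ptr must be specified");
    // Second assertion: every argument is a unique_ptr with USMDeleter.
    static_assert(
        (is_usm_unique_ptr<std::decay_t<UniquePtrTs>>::value && ...),
        "arguments must be unique_ptr with USMDeleter");
    return true;
}
```

With this sketch, instantiating `check_usm_unique_ptrs<>()` or `check_usm_unique_ptrs<std::unique_ptr<int>>()` fails to compile, so misuse is reported at the call site instead of at run time.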
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_346 ran successfully.
I suggest we exclude these failing
This LGTM, we can merge this into the topk branch and drop the test file PR, then remove the commit that adds test_top_k_largest_1d_radix_i1
This PR builds on top of the feature/topk branch.

It adds `iota_impl` in a new `sort_utils.hpp` file, and uses it in `merge_sort.hpp`, `radix_sort.hpp` and `topk.hpp`.

It also fixes a possible USM allocation leak in exception handling.
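For readers unfamiliar with the term, an iota kernel fills an array with consecutive indices, which argsort-style routines then permute. The host-side reference below is a sketch of that idea only: `iota_reference` and `argsort_reference` are hypothetical names, and the PR's `iota_impl` is a device kernel whose signature is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side reference: returns {0, 1, ..., n - 1}.
template <typename IndexT>
std::vector<IndexT> iota_reference(std::size_t n)
{
    std::vector<IndexT> idx(n);
    std::iota(idx.begin(), idx.end(), IndexT{0});
    return idx;
}

// Hypothetical argsort built on top of it: sorts indices by the values
// they refer to, keeping equal values in their original order.
template <typename T>
std::vector<std::size_t> argsort_reference(const std::vector<T> &values)
{
    std::vector<std::size_t> idx = iota_reference<std::size_t>(values.size());
    std::stable_sort(idx.begin(), idx.end(),
                     [&values](std::size_t a, std::size_t b) {
                         return values[a] < values[b];
                     });
    return idx;
}
```

Sharing one index-initialization routine across the merge sort, radix sort and top-k paths avoids maintaining several copies of the same small kernel.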