[DO NOT MERGE] Sasha triage topk test failure amd #1947
oleksandr-pavlyk
commented
Dec 22, 2024
- Have you provided a meaningful PR description?
- Have you added a test, reproducer or referred to an issue with a reproducer?
- Have you tested your changes locally for CPU and GPU devices?
- Have you made sure that new changes do not introduce compiler warnings?
- Have you checked performance impact of proposed changes?
- Have you added documentation for your changes, if necessary?
- Have you added your changes to the changelog?
- If this PR is a work in progress, are you opening the PR as a draft?
The implementation leverages existing merge-sort code, and partially sorts the array in cases where a partial sort reduces the size of the temporary memory allocation.
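To illustrate the idea of a partial sort for top-k, here is a minimal host-side sketch. It uses `std::partial_sort` rather than the merge-sort-based device code this PR refers to, and the name `topk_indices_sketch` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of dpctl: returns the indices of the k
// largest elements of `data`, ordered from largest to smallest value.
std::vector<std::size_t> topk_indices_sketch(const std::vector<double> &data,
                                             std::size_t k)
{
    k = std::min(k, data.size());

    std::vector<std::size_t> idx(data.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    // Only the first k positions end up sorted; the tail is left unordered.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&data](std::size_t a, std::size_t b) {
                          if (data[a] != data[b]) {
                              return data[a] > data[b];
                          }
                          return a < b; // break ties by smaller index
                      });

    idx.resize(k);
    return idx;
}
```

Only the leading `k` entries are ordered, which is the property that lets a partial sort get away with less work than a full sort.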
Reduces amount of casting. `k` will need to fit in `py::ssize_t` regardless.
Instead of using an overload to handle the `axis=None` case, use std::optional and check for trailing_dims_to_search in validation logic
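A rough sketch of how a single entry point can validate an optional argument instead of relying on an overload. The function name and the exact range check are assumptions made for illustration; only `std::optional` and `trailing_dims_to_search` come from the commit message.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical validation helper. An empty optional models axis=None,
// which is assumed here to mean "search over all dimensions".
int resolve_trailing_dims_sketch(int ndim,
                                 std::optional<int> trailing_dims_to_search)
{
    if (!trailing_dims_to_search.has_value()) {
        return ndim;
    }
    const int t = trailing_dims_to_search.value();
    if (t <= 0 || t > ndim) {
        throw std::invalid_argument(
            "trailing_dims_to_search is out of range");
    }
    return t;
}
```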
rounded value of k must be divisible by the merge sort chunk size
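One common way to obtain a value divisible by a chunk size is to round up to the next multiple. The helper below is a hypothetical sketch, and it is an assumption that rounding up (rather than down) is what is intended here.

```cpp
#include <cstddef>

// Hypothetical helper: smallest multiple of `chunk` that is >= k.
// Precondition: chunk > 0 and k + chunk - 1 does not overflow.
constexpr std::size_t round_up_to_multiple_sketch(std::size_t k,
                                                  std::size_t chunk)
{
    return ((k + chunk - 1) / chunk) * chunk;
}

static_assert(round_up_to_multiple_sketch(5, 4) % 4 == 0,
              "rounded value must be divisible by the chunk size");
```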
Reuse that function call in sorting code-base where argsort is used.
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1947/index.html |
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_368 ran successfully. |
de53055 to 3e5e303
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_370 ran successfully. |
3e5e303 to d50092a
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully. |
Until it is passed over to the host function, and unique_ptr's ownership is released.

Also reduced allocation sizes, where too much was being allocated.

Introduce smart_malloc_device, etc.

The smart_malloc_device<T>(count, q) makes USM allocation and returns a unique_ptr<T, USMDeleter> which owns the allocation. The function throws an exception (std::runtime_error) if USM allocation is not successful.

Introduce async_smart_free.

This function intends to replace use of host_task submissions to manage USM temporary deallocations.

The usage is as follows:

```
// returns unique_ptr
auto alloc_owner = smart_malloc_device<T>(count, q);

// get raw pointer for use in kernels
T *data = alloc_owner.get();

[..SNIP..]

// submit host_task that releases the unique_ptr
// after the host task was successfully submitted
// and ownership of USM allocation is transferred to
// the said host task
sycl::event ht_ev =
    async_smart_free(q, dependent_events, alloc_owner);

[...SNIP...]
```
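The ownership hand-off described above can be sketched without SYCL. The following is a host-only analog, not the dpctl implementation: `std::malloc`/`std::free` stand in for USM allocation, a deferred `std::future` stands in for the host task, and every name ending in `_sketch` (as well as `HostDeleter`) is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <future>
#include <memory>
#include <stdexcept>

// Stand-in for USMDeleter: frees with std::free instead of sycl::free.
struct HostDeleter
{
    void operator()(void *p) const noexcept { std::free(p); }
};

template <typename T> using owner_t = std::unique_ptr<T, HostDeleter>;

// Analog of smart_malloc_device<T>(count, q): allocate or throw.
template <typename T> owner_t<T> smart_malloc_sketch(std::size_t count)
{
    void *p = std::malloc(count * sizeof(T));
    if (p == nullptr) {
        throw std::runtime_error("allocation failed");
    }
    return owner_t<T>(static_cast<T *>(p));
}

// Analog of async_smart_free: the deferred task is created first, and the
// unique_ptr gives up ownership only after that succeeded. If creating the
// task throws, the unique_ptr still owns (and eventually frees) the memory.
template <typename T> std::future<void> async_free_sketch(owner_t<T> &owner)
{
    T *raw = owner.get();
    std::future<void> fut =
        std::async(std::launch::deferred, [raw]() { HostDeleter{}(raw); });
    (void)owner.release(); // the task is now responsible for freeing
    return fut;
}
```

In this analog the memory is freed when the caller waits on the returned future, so the future must be waited on; in the real code the SYCL runtime executes the host task once `dependent_events` complete.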
bbb55f1 to da3fbcc
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully. |
d50092a to b411407
8214855 to 04b6629
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_343 ran successfully. |
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_344 ran successfully. |
Factored out map_back_impl, which projects indexing from a flat index to a row-wise index. Removed dead code excluded by a preprocessor conditional.
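As an illustration of mapping a flat index to a row-wise one, here is a small host-side sketch. It assumes a contiguous row-major layout with rows of equal length, and `map_back_sketch` is a hypothetical name, not the signature of `map_back_impl`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical analog: each entry of `indices` is a flat offset into a
// batch of rows laid out back to back; rewrite it as the position inside
// its own row. Precondition: row_size > 0.
void map_back_sketch(std::vector<std::size_t> &indices, std::size_t row_size)
{
    for (std::size_t &i : indices) {
        i %= row_size;
    }
}
```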
04b6629 to dfb521f
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_346 ran successfully. |
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_347 ran successfully. |
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_348 ran successfully. |
Replaced it with a hand-written implementation of ceil_log2(n), such that n <= (decltype(n){1} << ceil_log2(n)) is true for all positive values of `n` in the range.
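One possible hand-written form of such a function is sketched below. It is not necessarily the version in this PR, and the name `ceil_log2_sketch` is hypothetical. Counting the bits of `n - 1` avoids overflowing a left shift for large `n`.

```cpp
#include <cstddef>

// Returns the smallest e with n <= (std::size_t{1} << e).
// Precondition: n >= 1.
constexpr std::size_t ceil_log2_sketch(std::size_t n)
{
    std::size_t e = 0;
    std::size_t m = n - 1;
    // e ends up as the bit width of (n - 1)
    while (m > 0) {
        m >>= 1;
        ++e;
    }
    return e;
}

static_assert(ceil_log2_sketch(1) == 0, "2^0 covers n = 1");
static_assert(ceil_log2_sketch(1024) == 10, "exact powers map to themselves");
static_assert(ceil_log2_sketch(1025) == 11, "otherwise round up");
```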
c351d0b to c1f8a74
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_349 ran successfully. |
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_350 ran successfully. |
0869128 to 210500f
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_344 ran successfully. |
Add check of computed against expected indices
210500f to 387a3d9
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_345 ran successfully. |
One asserts that at least one unique pointer is specified. Another asserts that the specified arguments are unique pointers with USMDeleter.
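Compile-time checks of this kind could look roughly like the sketch below (C++17). `StandInDeleter` substitutes for `USMDeleter`, and the trait and function names are hypothetical rather than taken from the PR.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <type_traits>

// Stand-in for USMDeleter so the sketch compiles without SYCL.
struct StandInDeleter
{
    void operator()(void *p) const noexcept { std::free(p); }
};

// Trait: true only for std::unique_ptr<T, StandInDeleter>.
template <typename T> struct is_standin_owner : std::false_type
{
};
template <typename T>
struct is_standin_owner<std::unique_ptr<T, StandInDeleter>> : std::true_type
{
};

// Hypothetical variadic entry point carrying both assertions.
template <typename... UniquePtrTs>
std::size_t count_owners_sketch(UniquePtrTs &...)
{
    static_assert(sizeof...(UniquePtrTs) > 0,
                  "at least one unique pointer must be specified");
    static_assert((is_standin_owner<UniquePtrTs>::value && ...),
                  "arguments must be unique pointers with the deleter");
    return sizeof...(UniquePtrTs);
}
```

Calling `count_owners_sketch()` with no arguments, or with a `std::unique_ptr<int>` that uses the default deleter, fails to compile.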
387a3d9 to fd65511
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_347 ran successfully. |
84d1388 to 809cb70