feat: support indexing parray by ndarray #89
Open
yinengy wants to merge 3 commits into main from fix/parray-index-array
Isn't this inefficient? This could lead to a lot of device -> host copies from the crosspy calls.
This can only be avoided by the caller, since this information has to be saved on the CPU side: it will later be converted and hashed so that it can be recorded for further use within parray. (So indexing by a cupy array is not efficient in any scenario, since parray's internal data structure still runs on the CPU.)
And I think crosspy should always use a numpy array as the index; what is the benefit of using a cupy array as the index? @bozhiyou
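A short illustration of the point about cupy indices, as a hedged sketch (the helper name is hypothetical, not part of the actual parray API; `cp.asnumpy` is the standard cupy call): if a caller passes a cupy array as the index, it has to be copied back to the host before the CPU-side metadata can convert and hash it.

```python
import numpy as np
import cupy as cp


def normalize_index(index):
    """Hypothetical helper: parray-style metadata lives on the CPU, so a
    cupy index array must be copied device -> host before it can be
    converted and hashed for bookkeeping."""
    if isinstance(index, cp.ndarray):
        index = cp.asnumpy(index)      # device -> host copy
    return np.asarray(index)


# A numpy index already lives on the host: no extra transfer.
normalize_index(np.array([0, 2, 4]))

# A cupy index triggers a device -> host copy of the whole index array.
normalize_index(cp.array([0, 2, 4]))
```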
No, indices could be as large as the array itself - think of the permutation example
arr[shuffle(arange(len(arr)))]
- and even larger. If you are indexing a cupy array, the indices array has to be copied to the GPU as a cupy array.
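For concreteness, a small sketch of the permutation case mentioned above (the array size is illustrative): the index array is as large as the data, and indexing a cupy array requires those indices on the device.

```python
import numpy as np
import cupy as cp

arr = cp.arange(1_000_000)          # the data lives on the GPU

# Permutation indices built on the host: as large as the array itself.
idx = np.arange(len(arr))
np.random.shuffle(idx)

# Indexing the cupy array needs the indices on the device, so the
# (equally large) index array is copied host -> device first.
permuted = arr[cp.asarray(idx)]
```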
I agree with Will that a list will be inefficient. Storing all the information on the CPU seems inefficient as well.
That makes sense, but currently there is no efficient way to deal with this issue by simply storing the info on the GPU or as an ndarray (in a later step all lists are converted to hash maps anyway, since there is no guarantee the slice mapping is sequential when dealing with local indices). And each access to the array needs to query this information (e.g. comparing slices against the given slices). Consider the following scenarios:
(a) If we put it on the same device as the data, then when there are multiple copies on different devices, each device holds only part of the slicing information. Every access, regardless of where the data is, has to reach all devices and compare the slice information saved locally, which would be really slow.
(b) If we put it on the first GPU, operations on copies held by other devices have to query that GPU. That is no better than querying data on the CPU, since NVLink is not guaranteed to exist between all devices and the request would be routed via the CPU, which is slower than putting it on the CPU in the first place. This also raises memory issues when the slices are large (you mentioned they might be as large as the array): GPU memory is more limited than CPU memory, the runtime would have to track the slice size or new data moved there would hit an OOM, and it would also cause an imbalanced workload and memory footprint across devices.
Until we find a good solution to the above issues, the CPU is still the best choice for storing the slices. (But I could make it a numpy array instead of a Python list.)
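A minimal sketch of the direction proposed at the end of this comment, assuming a hypothetical bookkeeping class (the class and attribute names are illustrative, not the actual parray internals): index arrays are kept on the CPU as numpy arrays and hashed once, so later accesses can be matched through a dict lookup instead of comparing slices element by element on every access.

```python
import numpy as np


class SliceRecord:
    """Hypothetical bookkeeping entry: the index array stays on the CPU as
    a numpy array so it can be hashed and compared on any access, no matter
    which device holds the data."""

    def __init__(self, indices):
        self.indices = np.asarray(indices)
        # Hash the raw bytes once so later lookups are cheap.
        self._key = hash(self.indices.tobytes())

    def __hash__(self):
        return self._key

    def __eq__(self, other):
        return isinstance(other, SliceRecord) and np.array_equal(
            self.indices, other.indices
        )


# Usage: record the slicing for a view once, look it up on later accesses.
slice_table = {}
slice_table[SliceRecord([3, 1, 4, 1, 5])] = "metadata for this view"
assert SliceRecord(np.array([3, 1, 4, 1, 5])) in slice_table
```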