Why is e.g. decode_into unsafe? Why UnsafeCellSlice?
#133
Comments
I don't think I can take a
I didn’t dive into this enough to understand: what prevents this from being written in 100% safe Rust, taking a E.g. you can use
OK, these have to stay unsafe actually. I neglected to write down the most important invariant: output subsets must be non-overlapping. These methods have this signature so that threads can write to non-overlapping multidimensional array subsets in parallel. If you can find a way to do that in safe Rust, I'd be interested.
these (and of course slice.chunks and other similarly named APIs) are the primary way I know how to make a long slice into multiple subslices, but I bet there are other APIs for other types.
Non-overlapping? Then there is no problem! Doing that safely is pretty much one of Rust’s prime promises. I’m very convinced that you don’t need any unsafe code here. E.g.: https://docs.rs/rayon/latest/rayon/slice/trait.ParallelSliceMut.html#method.par_chunks_mut
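For the simple contiguous case, the claim above can be illustrated with the standard library alone. This is a minimal sketch, not code from zarrs: `fill_blocks_in_parallel` is a hypothetical name, and it uses `slice::chunks_mut` with `std::thread::scope` instead of rayon so that it has no dependencies. The borrow checker accepts it because every thread receives its own disjoint `&mut [u8]`.

```rust
use std::thread;

/// Hypothetical helper: fill each contiguous block of `block_len` bytes with
/// its block index, using one scoped thread per block.
/// Panics if `block_len` is zero (that is how `chunks_mut` behaves).
fn fill_blocks_in_parallel(output: &mut [u8], block_len: usize) {
    thread::scope(|scope| {
        // `chunks_mut` hands out non-overlapping mutable subslices, so each
        // thread can write without any `unsafe`.
        for (index, block) in output.chunks_mut(block_len).enumerate() {
            scope.spawn(move || block.fill(index as u8));
        }
    });
}
```

This only covers blocks that are contiguous in memory. Chunks of a multidimensional array are generally not contiguous, which is where the rest of the discussion picks up.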
OK I took a deeper look. With “multidimensional”, you mean that the output slices belonging to a chunk are interleaved, right? So when retrieving e.g. the first chunk of 2D data, we have to write its first row to the start of the output array, then skip

It doesn’t matter. My point is that you wrote a huge amount of unsafe code, when in reality, the only unsafe part is how arguments are passed, i.e. all functions have access to the whole output array and not just the part(s) they’re actually allowed to access. This is what I’d like to see changed.

The idea would be to write a helper parallel iterator which spits out tuples of (chunk_retrieval_info, writeable_memory). Its implementation might be safe or not. The point is that even if it’s unsafe, it would be the only unsafe spot in the code base that deals with keeping input and output aligned. The result would probably have some nice code deduplication, because the closures passed to rayon would no longer have to do their own offset calculations.
This is the crux of all this: allocate an output, write directly into it parallelised over chunks.
zarrs/zarrs/src/array/array_sync_readable.rs, lines 710 to 755 in 2f5a665
Yes, but not quite. The iteration is not so simple for an arbitrary subset of a multidimensional array. E.g. with a 2(planes)*4(columns)*4(rows) array chunked as 2x2x2:
The offsets of chunk
I cannot picture a parallel iterator that does not use
I agree with your sentiment that it would be much better if the
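To make the interleaving concrete for the 2*4*4 array chunked as 2x2x2 mentioned above, here is a minimal sketch that lists where the contiguous runs of one chunk start in the flattened output. It assumes row-major (C order) layout with the last dimension varying fastest; `chunk_run_offsets` is a hypothetical name and not part of zarrs.

```rust
/// Hypothetical helper: start offsets (in elements) of the contiguous runs of
/// one chunk inside a row-major (C order) 3D array.
/// Assumes `chunk_index` addresses a chunk that lies inside the array.
fn chunk_run_offsets(
    shape: [usize; 3],
    chunk: [usize; 3],
    chunk_index: [usize; 3],
) -> Vec<usize> {
    // First element of the chunk along each dimension.
    let start = [
        chunk_index[0] * chunk[0],
        chunk_index[1] * chunk[1],
        chunk_index[2] * chunk[2],
    ];
    let mut offsets = Vec::new();
    // Only the innermost dimension is contiguous, so there is one run per
    // (outer, middle) index pair covered by the chunk.
    for i in start[0]..(start[0] + chunk[0]).min(shape[0]) {
        for j in start[1]..(start[1] + chunk[1]).min(shape[1]) {
            offsets.push((i * shape[1] + j) * shape[2] + start[2]);
        }
    }
    offsets
}
```

For chunk index `[0, 0, 0]` this yields the offsets 0, 4, 16 and 20, each starting a run of 2 elements, so a single chunk touches four separate pieces of the output rather than one contiguous block.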
the focus of my idea is this:
all other functions downstream from there (like
I’m pretty sure the performance penalty will be negligible or non-existent. We have to calculate the offsets anyway. It makes no difference if we
2. would look something like

    fn chunks_mut(chunks: &[ChunkInfo], output: &mut [u8]) -> impl IntoParallelIterator<Item = (
        &ChunkInfo, // for retrieval
        impl IntoIterator<Item = &mut [u8]>, // for storage, none of these overlap
    )> {
        // almost all of the unsafe code of the crate here, super well tested
    }

(but of course as actual traits or structs that allow accessing more relevant data. And of course we could also do things like pairing fetch offsets with output slices in the iterator item)

The amount of calculation and the amount of information stored in memory at any point is exactly the same, while at no point we have to pass on
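As a rough illustration of the helper proposed above, the following sketch does the splitting for a 2D row-major byte array in entirely safe Rust. Everything here is an assumption made for illustration: the names `ChunkInfo`, `split_into_chunks` and `fill_chunks_in_parallel`, the restriction to 2D, the eager `Vec` instead of a parallel iterator, and the use of `std::thread::scope` in place of rayon. It is not how zarrs implements this.

```rust
use std::thread;

/// Hypothetical stand-in for the per-chunk retrieval information.
#[derive(Debug, Clone, PartialEq)]
struct ChunkInfo {
    chunk_row: usize,
    chunk_col: usize,
}

/// Hypothetical helper: split a row-major 2D `output` with `cols` bytes per
/// row into one list of disjoint row segments per chunk.
fn split_into_chunks<'a>(
    output: &'a mut [u8],
    cols: usize,
    chunk_rows: usize,
    chunk_cols: usize,
) -> Vec<(ChunkInfo, Vec<&'a mut [u8]>)> {
    assert!(cols > 0 && chunk_rows > 0 && chunk_cols > 0);
    assert_eq!(output.len() % cols, 0, "output must hold whole rows");
    let rows = output.len() / cols;
    let grid_rows = (rows + chunk_rows - 1) / chunk_rows;
    let grid_cols = (cols + chunk_cols - 1) / chunk_cols;
    // One (info, segments) entry per chunk, in row-major chunk-grid order.
    let mut result: Vec<(ChunkInfo, Vec<&'a mut [u8]>)> = (0..grid_rows * grid_cols)
        .map(|i| {
            let info = ChunkInfo { chunk_row: i / grid_cols, chunk_col: i % grid_cols };
            (info, Vec::new())
        })
        .collect();
    // Cut every row into chunk-wide segments and hand each segment to the
    // chunk that owns it. `chunks_mut` guarantees the segments never overlap.
    for (r, row) in output.chunks_mut(cols).enumerate() {
        for (c, segment) in row.chunks_mut(chunk_cols).enumerate() {
            result[(r / chunk_rows) * grid_cols + c].1.push(segment);
        }
    }
    result
}

/// Usage sketch: each chunk is written on its own thread and only ever sees
/// the memory that belongs to it.
fn fill_chunks_in_parallel(output: &mut [u8], cols: usize, chunk_rows: usize, chunk_cols: usize) {
    let chunks = split_into_chunks(output, cols, chunk_rows, chunk_cols);
    thread::scope(|scope| {
        for (info, segments) in chunks {
            scope.spawn(move || {
                let value = (info.chunk_row * 10 + info.chunk_col) as u8;
                for segment in segments {
                    segment.fill(value);
                }
            });
        }
    });
}
```

The closure passed to each thread does no offset arithmetic of its own. Whether the same shape can be kept for arbitrary N-dimensional subsets without `unsafe` and without a performance cost is exactly the open question in this thread.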
Thanks for raising this @flying-sheep, I think I am on to something with #136. I've made the
First off: I’m sorry that I’m all talk and no contribution here until now, if you want me to take a swing, I’d be happy to!

That being said: awesome! I think the gradual approach makes much more sense than to try to make as much safe as possible in one big patch.

After what you described, I think it would be possible to make an abstraction around patterns like the following, i.e. make it possible to safely iterate over all sub-chunks of a

Both of these patterns should always be safe, since “iterate over all possible chunks” always creates disjoint chunks.

    let decode_chunk = |chunk_index: usize| {
        let output_subset_chunk = … chunk_index …;
        let mut output_view_inner_chunk =
            unsafe { output_view.subdivide_unchecked(output_subset_chunk) };
        …
    };
    iter_concurrent_limit!(
        shard_concurrent_limit,
        (0..num_chunks),
        try_for_each,
        decode_chunk
    )?;

and

    let retrieve_shard_into_slice = |shard_indices: Vec<u64>| {
        let mut output_view = unsafe {
            ArrayBytesFixedDisjointView::new_unchecked(…)
        };
        …
    };
    let indices = shards.indices();
    iter_concurrent_limit!(
        chunk_concurrent_limit,
        indices,
        try_for_each,
        retrieve_shard_into_slice
    )?;
I wonder why the API takes UnsafeCellSlice<'_, u8> instead of just &mut [u8], and why it’s unsafe.

Assuming it’s eliding bound checks for performance: shouldn’t decode_into be safe and if really necessary for performance, we could add a decode_into_unchecked?
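The safe/unchecked pairing suggested in the question could look roughly like the sketch below. The two function names are borrowed from the question, but the signatures and the trivial "decoding" (a byte copy) are placeholders invented for illustration. They are not the zarrs API, and this pattern only addresses bounds checks, not the disjoint parallel writes discussed in the comments.

```rust
/// Hypothetical safe entry point: validates the output size once, then calls
/// the unchecked variant.
fn decode_into(encoded: &[u8], output: &mut [u8]) -> Result<(), String> {
    if output.len() < encoded.len() {
        return Err(format!(
            "output holds {} bytes but {} are needed",
            output.len(),
            encoded.len()
        ));
    }
    // SAFETY: the length check above guarantees every write is in bounds.
    unsafe { decode_into_unchecked(encoded, output) };
    Ok(())
}

/// Hypothetical unchecked variant that skips bounds checks.
///
/// # Safety
/// The caller must guarantee `output.len() >= encoded.len()`.
unsafe fn decode_into_unchecked(encoded: &[u8], output: &mut [u8]) {
    for (i, &byte) in encoded.iter().enumerate() {
        // SAFETY: `i < encoded.len() <= output.len()` by the caller's contract.
        unsafe { *output.get_unchecked_mut(i) = byte };
    }
}
```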