-
-
Notifications
You must be signed in to change notification settings - Fork 16
Zarr support #190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: devel
Are you sure you want to change the base?
Zarr support #190
Conversation
…dataR into keller-mark/zarr
…milar to HDF5AnnData
Simplify how obs and var names handled in ZarrAnnData
More zarr-related changes
Update comments
rcannood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work @keller-mark and @Artur-man !
I went through the PR for a first time and left some minor comments. I will review the code by running it a couple of times next :)
R/read_zarr_helpers.R
Outdated
| attrs <- g$get_attrs()$to_list() | ||
|
|
||
| if (!all(c("encoding-type", "encoding-version") %in% names(attrs))) { | ||
| path <- name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where are a lot of linting issues in this file -- could you run lintr::lint_package() and fix any issues that pop up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done ... did a full lint_package check and corrected some R check issues too!
inst/extdata/example2.zarr/.zgroup
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should probably be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
Errors I am getting at lint and pkgdown GHAs look unrelated to the package: |
|
@Bisaloo also the tests that are passing for macos, ubuntu and windows are failing for macos (intel) and ubuntu-bioc-release. I understand the error in release (for obvious reasons), but macos-15 I don't know. Any ideas ? https://github.com/scverse/anndataR/actions/runs/20411185095/job/58648259569?pr=190 |
|
Yes, it looks that on this runner, Rarr 1.11.9 is installed, when support for reading scalars was added in Rarr 1.11.12. I think it's related to BioC devel not having more recent binaries for macOS x86 yet. Maybe try setting: Imports:
Rarr (>= 1.11.12)In This will not work for the runner using BioC release but it's already failing anyways. |
|
I see. Then I will assume all checks are passing for now. thanks man |
|
Current issues:
|
| expect_equal(array_h5ad, array_zarr) | ||
| }) | ||
|
|
||
| test_that("reading mappings is same for h5ad and zarr", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a difference between rhdf5, Rarr and h5py on how rec arrays and structured data types are read.
To me, h5py and Rarr reads it correctly, but rhdf5 does not:
Huber-group-EMBL/rhdf5#192
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need an issue for this here as well? From our point of view it doesn't matter so much what is returned by {rhdf5} as long as what we return from the AnnData matches Python.
Going the other way (writing a recarray) is more difficult because we don't know when to use that format. I don't think we test this. Also, I don't think the AnnData disk spec really covers alternative arrays like recarray.
|
We are really close, tests are passing on my local but we have to wait until changes in Thanks @Bisaloo again for answering all issues of Rarr! |
Fixes #91
These changes are from both me and @Artur-man
The main public-facing changes here are:
ZarrAnnDataclassread_zarrandwrite_zarrtop-level functionsfrom_Seurat(output_class="ZarrAnnData")from_SingleCellExperiment(output_class="ZarrAnnData")Internally:
read_zarr_helpers.Ris the zarr analog ofread_h5ad_helpers.Rwrite_zarr_helpers.Ris the zarr analog ofwrite_h5ad_helpers.Rinst/extdata/example.zarr(this makes the diff noisy, apologies)test-Zarr-read.R(35 new tests)test-Zarr-write.R(70)test-ZarrAnnData.R(26)test-h5ad-zarr.R(17)A number of these functions generate warnings in the R console that are intended to be followed up on to improve the code (and should probably be resolved as end users may not appreciate them), but the tests still pass despite these warnings.
Known things that are not implemented here:
recarraysmode = c("r", "r+", "a", "w", "w-", "x")parameter value