Inline loaded variables into kerchunk references #73

Merged May 16, 2024 (25 commits)
Changes from 1 commit
Commits
d27ba38
add test vs kerchunk inlining
TomNicholas Apr 5, 2024
5733185
Merge branch 'main' into inline_chunkdata
TomNicholas Apr 5, 2024
af46998
encode inlined data correctly (at least for standard numpy arrays)
TomNicholas Apr 5, 2024
490007b
don't test time variable for now
TomNicholas Apr 5, 2024
9041d53
store fill_value as np.NaN, and always coerce it
TomNicholas Apr 6, 2024
00a9b62
test passing for lat and lon variables
TomNicholas Apr 6, 2024
3973c72
Merge branch 'main' of https://github.com/TomNicholas/VirtualiZarr in…
TomNicholas Apr 6, 2024
cd6ca47
Merge branch 'main' into inline_chunkdata
TomNicholas Apr 6, 2024
65fb3d1
formatting
TomNicholas Apr 8, 2024
aa754f4
Merge branch 'main' into inline_chunkdata
TomNicholas May 2, 2024
a4c1f1b
encode numpy types
TomNicholas May 2, 2024
e59428e
tidy internal import
TomNicholas May 3, 2024
7c02f19
parametrize test to test inlining different variables
TomNicholas May 3, 2024
1e46efb
raise when encoding encountered during serialization of numpy arrays
TomNicholas May 3, 2024
5071ae9
see if closing the netcdf files in the fixtures fixes the kerchunk error
TomNicholas May 3, 2024
654851c
update docs
TomNicholas May 3, 2024
f1e5eab
ensure inline_threshold is an int
TomNicholas May 6, 2024
98c4f8f
Merge branch 'main' into inline_chunkdata
TomNicholas May 8, 2024
6135e14
formatting
TomNicholas May 8, 2024
7970021
Merge branch 'main' into inline_chunkdata
TomNicholas May 15, 2024
ec7528c
Merge branch 'main' into inline_chunkdata
TomNicholas May 15, 2024
8c24af6
specified NetCDF4 for netcdf4_file fixture & added netcdf4 to pyproje…
norlandrhagen May 15, 2024
744cc6d
Merge branch 'main' into inline_chunkdata
TomNicholas May 16, 2024
f997b79
organize tests
TomNicholas May 16, 2024
0baee1a
dont unnecessarily slice dataset
TomNicholas May 16, 2024
18 changes: 18 additions & 0 deletions virtualizarr/tests/test_integration.py
@@ -0,0 +1,18 @@
from virtualizarr import open_virtual_dataset

from pprint import pprint


def test_numpy_arrays_to_inlined_kerchunk_refs(netcdf4_file):
    from kerchunk.hdf import SingleHdf5ToZarr

    # test inlining all the variables
    expected = SingleHdf5ToZarr(netcdf4_file, spec=1, inline_threshold=1e9).translate()

    pprint(expected)

    # loading all the variables should produce same result as inlining them all using kerchunk
    vds = open_virtual_dataset(netcdf4_file, loadable_variables=['air', 'time', 'lat', 'lon'], indexes={})
    refs = vds.virtualize.to_kerchunk(format='dict')

    assert refs == expected
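
For readers unfamiliar with what "inlined" means here: in a kerchunk version 1 reference dict, a chunk key normally maps to a `[path, offset, length]` triple, while an inlined chunk maps directly to a string holding the chunk bytes. The sketch below illustrates that idea using only the standard library. It assumes the common convention that bytes which are not valid ASCII are stored with a `base64:` prefix. The helper names `encode_inline` and `decode_inline`, the file name `air.nc`, and the offset and length values are hypothetical, chosen for illustration; this is not the code added in this PR.

```python
import base64
import struct


def encode_inline(raw: bytes) -> str:
    """Encode raw chunk bytes as an inlined reference value (illustrative helper)."""
    try:
        # Plain ASCII payloads can be stored as-is.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # Anything else is base64-encoded and tagged with a prefix.
        return "base64:" + base64.b64encode(raw).decode("ascii")


def decode_inline(value: str) -> bytes:
    """Invert encode_inline (illustrative helper)."""
    prefix = "base64:"
    if value.startswith(prefix):
        return base64.b64decode(value[len(prefix):])
    return value.encode("ascii")


# Three little-endian float32 values standing in for a small coordinate variable.
lat_values = [75.0, 72.5, 70.0]
raw_chunk = struct.pack("<3f", *lat_values)

refs = {
    "version": 1,
    "refs": {
        # Inlined chunk: the bytes live in the reference dict itself.
        "lat/0": encode_inline(raw_chunk),
        # Non-inlined chunk: hypothetical path, byte offset, and length.
        "air/0.0.0": ["air.nc", 1024, 4096],
    },
}

# Round-trip the inlined chunk back to float32 values.
decoded = struct.unpack("<3f", decode_inline(refs["refs"]["lat/0"]))
```

Small variables such as `lat` and `lon` are the typical candidates for inlining, since embedding their bytes avoids a separate read against the original file. The test above compares the two routes to the same result: kerchunk inlining everything via a very large `inline_threshold`, and VirtualiZarr loading the same variables through `loadable_variables`.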