Add doc and uncache function
martindurant committed Jan 23, 2024
1 parent c71ec5b commit cd511a5
Showing 2 changed files with 44 additions and 0 deletions.
12 changes: 12 additions & 0 deletions kerchunk/__init__.py
@@ -10,3 +10,15 @@
__version__ = "9999"

__all__ = ["__version__"]


def set_reference_filesystem_cachable(cachable=True):
    """While experimenting with kerchunk and referenceFS, it can be convenient to not cache FS instances.

    You may wish to call this function with ``False`` before any kerchunking session; leaving
    the instances cachable (the default) is what end-users will want, since it will be
    more efficient.
    """
    import fsspec

    fsspec.get_filesystem_class("reference").cachable = cachable
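
To illustrate what toggling a class-level ``cachable`` flag does, here is a minimal stdlib-only sketch of argument-keyed instance caching. It is a simplified model for explanation, not fsspec's actual implementation; ``CachedMeta`` and ``ToyReferenceFS`` are hypothetical names that do not exist in fsspec or kerchunk.

```python
class CachedMeta(type):
    """Metaclass that reuses instances built with identical arguments."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._cache = {}  # one instance cache per class

    def __call__(cls, *args, **kwargs):
        # Build a hashable token from the constructor arguments.
        token = (args, tuple(sorted(kwargs.items())))
        if cls.cachable and token in cls._cache:
            return cls._cache[token]
        instance = super().__call__(*args, **kwargs)
        if cls.cachable:
            cls._cache[token] = instance
        return instance


class ToyReferenceFS(metaclass=CachedMeta):
    # Hypothetical stand-in for a filesystem class with a ``cachable`` flag.
    cachable = True

    def __init__(self, fo):
        self.fo = fo


a = ToyReferenceFS("refs.json")
b = ToyReferenceFS("refs.json")
same_when_cachable = a is b  # identical arguments reuse the cached instance

ToyReferenceFS.cachable = False  # what the helper above does for "reference"
c = ToyReferenceFS("refs.json")
fresh_when_uncachable = c is not a  # a new instance is built every time
```

With caching on, a reference set that changes on disk between calls could still be served from the stale cached instance, which is why switching it off can help while experimenting.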
32 changes: 32 additions & 0 deletions kerchunk/combine.py
@@ -157,6 +157,38 @@ def append(
        target_options=None,
        **kwargs,
    ):
        """
        Update an existing combined reference set with new references

        There are two main usage patterns:

        - if the input ``original_refs`` is JSON, the combine happens in memory and the
          output should be written to JSON. This could then be optionally converted to parquet in a
          separate step
        - if ``original_refs`` is a lazy parquet reference set, then it will be amended in-place

        If you want to extend JSON references and output to parquet, you must first convert to
        parquet in the location you would like the final product to live.

        The other arguments should be the same as they were at the creation of the original combined
        reference set.

        NOTE: if the original combine used a postprocess function, appending may not work
        correctly, as the combine is done "before" postprocessing. Functions that only add
        information (such as setting attrs) would be OK.

        Parameters
        ----------
        path: list of reference sets to add. If remote/target options would be different
            to ``original_refs``, these can be given as dicts or LazyReferenceMapper instances
        original_refs: combined reference set to be extended
        remote_protocol, remote_options, target_options: referring to ``original_refs``
        kwargs: to MultiZarrToZarr

        Returns
        -------
        MultiZarrToZarr
        """
        import xarray as xr

        fs = fsspec.filesystem(
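
The first usage pattern in the docstring above (JSON in, JSON out) might look roughly like the sketch below. It assumes kerchunk is installed and that ``translate()`` on the returned object yields a JSON-serialisable dict when ``original_refs`` is JSON. The wrapper ``append_json_refs`` and its argument names are hypothetical and only serve the illustration.

```python
import json


def append_json_refs(new_refs, original_json, output_json, **mzz_kwargs):
    """Extend a JSON combined reference set in memory and write it back as JSON."""
    # Imported lazily so the sketch can be read without kerchunk installed.
    from kerchunk.combine import MultiZarrToZarr

    # mzz_kwargs (for example concat_dims) should match the original combine.
    mzz = MultiZarrToZarr.append(
        new_refs,
        original_refs=original_json,
        **mzz_kwargs,
    )
    combined = mzz.translate()  # assumed: an in-memory dict of references
    with open(output_json, "w") as f:
        json.dump(combined, f)
    return combined
```

A call could then be ``append_json_refs(["new_file.json"], "combined.json", "combined_extended.json", concat_dims=["time"])``, where the file names and the ``time`` dimension are placeholders. For the parquet pattern no output step is shown, since the docstring states that the lazy reference set is amended in-place.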