Adds lazy_reference_mapper_kwargs to refs_to_dataframe #511

Open: wants to merge 1 commit into base: main
5 changes: 5 additions & 0 deletions kerchunk/df.py
@@ -6,6 +6,7 @@
 import fsspec
 import zarr
 
+from typing import Any, Dict
 
 # example from preffs's README'
 df = pd.DataFrame(
@@ -106,6 +107,7 @@ def refs_to_dataframe(
     storage_options=None,
     record_size=100_000,
     categorical_threshold=10,
+    lazy_reference_mapper_kwargs: Dict[str, Any] = {},
 ):
     """Write references as a parquet files store.
 
@@ -134,6 +136,8 @@ def refs_to_dataframe(
         Encode urls as pandas.Categorical to reduce memory footprint if the ratio
         of the number of unique urls to total number of refs for each variable
         is greater than or equal to this number. (default 10)
+    lazy_reference_mapper_kwargs : Dict[str, Any]
+        Optional kwargs to pass into LazyReferenceMapper
     """
     from fsspec.implementations.reference import LazyReferenceMapper
 
@@ -156,6 +160,7 @@ def refs_to_dataframe(
         root=url,
         fs=fs,
         categorical_threshold=categorical_threshold,
+        **lazy_reference_mapper_kwargs,
     )
 
     for k in sorted(refs):
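
The forwarding pattern this change introduces can be sketched in isolation. The snippet below is a minimal illustration, not the kerchunk or fsspec implementation: `create_mapper` and `write_refs` are hypothetical stand-ins for `LazyReferenceMapper.create` and `refs_to_dataframe`, and the `cache_size` key is only an illustrative choice of extra keyword. It also assumes a `None` default in place of the `{}` default shown in the diff, which is one common way to avoid sharing a single dict object across calls.

```python
from typing import Any, Dict, Optional


def create_mapper(root: str, record_size: int = 100_000, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for LazyReferenceMapper.create; it only records its arguments."""
    return {"root": root, "record_size": record_size, **kwargs}


def write_refs(
    url: str,
    record_size: int = 100_000,
    mapper_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for refs_to_dataframe, reduced to the forwarding step."""
    # Assumption: a None default instead of {} so no dict is shared between calls.
    extra = dict(mapper_kwargs or {})
    # Extra keywords are unpacked after the explicitly named arguments.
    return create_mapper(root=url, record_size=record_size, **extra)


default_call = write_refs("refs.parq")
custom_call = write_refs("refs.parq", mapper_kwargs={"cache_size": 64})
```

With this shape, a key in the extra dict that repeats an explicitly passed argument (for example `record_size` here) raises a `TypeError` at the call site rather than silently overriding it.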