Data kale is a simple data lake intended to abstract away an S3-compatible backend like Wasabi.
credentials.s3-access-key: The access key to the S3 backend.
credentials.s3-secret-key: The access secret to the S3 backend.
data.root: The location of the local directory that will contain all the repository directories.
For example, if data.root is set to /data, then the data for the repository namespace-puddle will end up in /data/namespace-puddle.
Note: Just as with pathlib.Path(.), the default is to be relative to the home directory, i.e. data is ~/data.
It supports expanduser, so it's easier to just be explicit with ~/data.
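The path rules described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the project's actual code: the function name resolve_repo_dir and its parameters are hypothetical, and treating a relative root as relative to the home directory is taken from the note above.

```python
from pathlib import Path


def resolve_repo_dir(root: str, repository: str) -> Path:
    """Return the local directory for a repository (hypothetical helper)."""
    # Expand a leading "~" to the user's home directory.
    base = Path(root).expanduser()
    # Assumption from the note above: a relative root is taken
    # relative to the home directory, so "data" means "~/data".
    if not base.is_absolute():
        base = Path.home() / base
    # Each repository gets its own directory under the root.
    return base / repository


print(resolve_repo_dir("/data", "namespace-puddle"))
print(resolve_repo_dir("~/data", "namespace-puddle"))
```

With root set to /data, the first call yields /data/namespace-puddle on a POSIX system; with ~/data or plain data, the result is placed under the current user's home directory.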
~/.kale.toml:
[credentials]
s3-access-key = "ACCESS"
s3-secret-key = "SECRET"
[data]
root = "~/data"
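As a rough illustration of how a configuration file shaped like the one above could be read, here is a minimal sketch using Python's TOML parser. The function name parse_kale_config and the returned dictionary layout are hypothetical and are not taken from the data_kale code base; only the section and key names come from the example file.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Same content as the example ~/.kale.toml above.
EXAMPLE = """
[credentials]
s3-access-key = "ACCESS"
s3-secret-key = "SECRET"
[data]
root = "~/data"
"""


def parse_kale_config(text: str) -> dict:
    """Extract the three documented settings (hypothetical helper)."""
    config = tomllib.loads(text)
    return {
        "access_key": config["credentials"]["s3-access-key"],
        "secret_key": config["credentials"]["s3-secret-key"],
        "root": config["data"]["root"],
    }


settings = parse_kale_config(EXAMPLE)
print(settings["root"])
```

To read the real file instead of the inline string, the text could be loaded with pathlib.Path("~/.kale.toml").expanduser().read_text() and passed to the same function.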
virtualenv venv
source venv/bin/activate
pip install -e .
python -m data_kale.download namespace-puddle
python -m data_kale.upload namespace-puddle
python -m data_kale.list_remote
source venv/bin/activate
python setup.py test
source venv/bin/activate
pytest