Request
The current CLP package flow doesn't handle the default dataset properly.
- On the compression end, we set the dataset to `default` if it's not set: `dataset = CLP_DEFAULT_DATASET_NAME if dataset is None else dataset`
- In the native compression script, it can take an optional dataset without further checking:
  `args_parser.add_argument(`
- In the compression job config, the dataset is also nullable: `dataset: str | None = None`
- In the compression job executor, the dataset is nullable but it is not checked:
clp/components/job-orchestration/job_orchestration/executor/compress/compression_task.py
Line 399 in bfc474f
`archive_output_dir = archive_output_dir / dataset`
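Since `pathlib` refuses to join a `None` path segment, an unchecked null dataset would surface at that line as a `TypeError` rather than being handled deliberately. A minimal reproduction (the paths and values below are hypothetical):

```python
from pathlib import Path

# Hypothetical values; the real ones come from the job config and executor.
archive_output_dir = Path("/var/data/archives")
dataset = None  # nullable per the job config, but not checked in the executor

try:
    # Mirrors the executor's `archive_output_dir = archive_output_dir / dataset`
    archive_output_dir = archive_output_dir / dataset
except TypeError as exc:
    print(f"Unhandled null dataset: {exc}")
```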
The flow works well if everything is submitted through the compression script, since it ensures the dataset will always be `default`. However, this doesn't work if we submit compression jobs directly to the CLP DB.
We should make the dataset handling consistent with the config definition.
Possible implementation
- Allow the dataset to be nullable.
- Don't set it to `default` in the compression script.
- Handle the dataset in the compression job.
Metadata
Assignees
Labels
enhancement (New feature or request)