Pushing fails on S3-like storage #142

@asiron

We are trying to use an S3-like bucket provided by our cluster provider. They don't disclose the actual back-end, so we don't know whether it's MinIO or something else.
I can create a bucket with either s3cmd or aws s3.

I can upload a file using s3cmd:

asiron@glados ~/code/datasets$ s3cmd put test s3://my-data-bucket
upload: 'test' -> 's3://my-data-bucket/test'  [1 of 1]
 11 of 11   100% in    0s    68.98 B/s  done

But I can't upload using the AWS CLI:

asiron@glados ~/code/datasets$ AWS_RESPONSE_CHECKSUM_VALIDATION=WHEN_REQUIRED aws s3 --endpoint-url https://<url> cp test s3://my-data-bucket
upload failed: ./test to s3://my-data-bucket/test An error occurred (InvalidArgument) when calling the PutObject operation: x-amz-content-sha256 must be UNSIGNED-PAYLOAD, STREAMING-AWS4-HMAC-SHA256-PAYLOAD, or a valid sha256 value.

Then when I try to push data using dvc:

mzurad@glados ~/code/datasets$ dvc push -r my-remote <dataset>/train/train-000000.tar
Collecting |0.00 [00:00,    ?entry/s]
ERROR: failed to transfer '39d34603bdde7e44376ef22b16a8780b' - [Errno 22] x-amz-content-sha256 must be UNSIGNED-PAYLOAD, STREAMING-AWS4-HMAC-SHA256-PAYLOAD, or a valid sha256 value.: An error occurred (InvalidArgument) when calling the UploadPart operation: x-amz-content-sha256 must be UNSIGNED-PAYLOAD, STREAMING-AWS4-HMAC-SHA256-PAYLOAD, or a valid sha256 value.
Pushing
ERROR: failed to push data to the cloud - 2 files failed to upload

This seems related to aws/aws-cli#9214. However, exporting AWS_RESPONSE_CHECKSUM_VALIDATION=WHEN_REQUIRED, as suggested in one of the comments there, does not work for me.
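
For reference, the AWS SDKs control checksum behaviour with two separate settings: AWS_RESPONSE_CHECKSUM_VALIDATION covers validation of responses, while AWS_REQUEST_CHECKSUM_CALCULATION covers checksums added to requests such as PutObject and UploadPart, which are the operations failing above. A minimal sketch of setting both before retrying is below. Whether this resolves the error against this particular back-end is an assumption and has not been verified here.

```shell
# Ask the AWS CLI/SDK to add request checksums only when the operation requires them.
export AWS_REQUEST_CHECKSUM_CALCULATION=when_required
# Likewise, validate response checksums only when required.
export AWS_RESPONSE_CHECKSUM_VALIDATION=when_required

# Confirm both variables are visible in the current shell before retrying.
env | grep '^AWS_.*CHECKSUM'

# Then retry the same commands as above (unverified against this back-end):
#   aws s3 --endpoint-url https://<url> cp test s3://my-data-bucket
#   dvc push -r my-remote <dataset>/train/train-000000.tar
```

The variables have to be exported in the same shell session that runs aws and dvc, otherwise the child processes will not see them.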

dvc doctor:

DVC version: 3.63.0 (pip)
-------------------------
Platform: Python 3.12.3 on Linux-6.14.0-27-generic-x86_64-with-glibc2.39
Subprojects:
        dvc_data = 3.16.12
        dvc_objects = 5.1.1
        dvc_render = 1.0.2
        dvc_task = 0.40.2
        scmrepo = 3.5.2
Supports:
        azure (adlfs = 2025.8.0, knack = 0.12.0, azure-identity = 1.24.0),
        gdrive (pydrive2 = 1.21.3),
        gs (gcsfs = 2025.9.0),
        hdfs (fsspec = 2025.9.0, pyarrow = 21.0.0),
        http (aiohttp = 3.12.15, aiohttp-retry = 2.9.1),
        https (aiohttp = 3.12.15, aiohttp-retry = 2.9.1),
        oss (ossfs = 2025.5.0),
        s3 (s3fs = 2025.9.0, boto3 = 1.40.18),
        ssh (sshfs = 2025.2.0),
        webdav (webdav4 = 0.10.0),
        webdavs (webdav4 = 0.10.0),
        webhdfs (fsspec = 2025.9.0)
Config:
        Global: /home/mzurad/.config/dvc
        System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/encrypted-home
Caches: local
Remotes: local, ssh, s3
Workspace directory: ext4 on /dev/mapper/encrypted-home
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/fe53c9c0024ad6b00116f7d16b01c0a3
