A slimmed-down S3 copy utility similar to rclone, designed for efficient file copying between S3-compatible storage systems.
- rclone compatibility: Reads credentials from ~/.config/rclone/rclone.conf
- Same syntax: Uses config:bucket/path syntax like rclone
- Intelligent copying: Compares ETag and size to avoid unnecessary transfers
- Multipart ETag handling: Computes SHA1 hash for multipart uploads when needed
- Concurrent workers: Configurable number of copy workers for performance
- Progress tracking: Real-time progress bar showing copied/skipped/failed counts and current bandwidth
- File list support: Copy specific files using the --files-from parameter
- Verification: Optional post-copy verification by downloading and comparing SHA1 hashes
Build the binary:

go build -o go-copy .

go-copy reads S3 credentials from your existing rclone configuration file at ~/.config/rclone/rclone.conf.

Example rclone.conf:
[myremote]
type = s3
access_key_id = AKIAIOSFODNN7EXAMPLE
secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
region = us-west-2
endpoint = https://s3.amazonaws.com
no_check_certificate = false
[minio]
type = s3
access_key_id = minioadmin
secret_access_key = minioadmin
endpoint = http://localhost:9000
region = us-east-1
no_check_certificate = true

Usage:

go-copy [flags] <source> <target>

Flags:

- --files-from <file>: Read the list of files to copy from the specified file (required)
- --worker-count <n>: Number of concurrent copy workers (default: 4)
- --verbose, -v: Enable verbose output
- --verify: Verify copied files by downloading and comparing SHA1 hashes
# Copy files listed in files.txt from source to target
go-copy --files-from files.txt myremote:mybucket/source/ myremote:mybucket/target/
# Use 10 workers for faster copying
go-copy --worker-count 10 --files-from list.txt remote1:bucket1/ remote2:bucket2/
# Verbose output for debugging
go-copy -v --files-from files.txt myremote:source-bucket/ myremote:target-bucket/
# Verify copied files by comparing SHA1 hashes
go-copy --verify --files-from files.txt myremote:source/ myremote:target/

The --files-from file should contain one file path per line:
file1.txt
path/to/file2.txt
deep/directory/file3.txt
# Comments are supported
another-file.txt
The progress bar displays comprehensive real-time information in this format:
C:15 S:3 F:0 2.5 MiB/s 45.2 MB/1.2 GB ETA:5m30s |████████████████████████████████████████| (18/100, 18 it/s)
Where:
- C:15 - 15 files copied successfully
- S:3 - 3 files skipped (already up-to-date)
- F:0 - 0 files failed
- 2.5 MiB/s - Current transfer speed (adaptive units: B/s, KiB/s, MiB/s, GiB/s)
- 45.2 MB/1.2 GB - Data transferred / Total data size (adaptive units: B, KB, MB, GB, TB)
- ETA:5m30s - Estimated time to completion (adaptive format: seconds, minutes, hours, days)
- (18/100, 18 it/s) - 18 out of 100 files processed, 18 files per second
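The adaptive binary units for the speed field can be produced with a small formatting helper. A sketch (humanBytes is a hypothetical name; note the size field in the bar uses decimal MB/GB units, which would need a base-1000 variant):

```go
package main

import "fmt"

// humanBytes formats a byte count with adaptive binary units, matching
// the B..GiB scale used for the bandwidth display.
func humanBytes(n float64) string {
	units := []string{"B", "KiB", "MiB", "GiB", "TiB"}
	i := 0
	for n >= 1024 && i < len(units)-1 {
		n /= 1024
		i++
	}
	if i == 0 {
		return fmt.Sprintf("%.0f %s", n, units[i])
	}
	return fmt.Sprintf("%.1f %s", n, units[i])
}

func main() {
	fmt.Println(humanBytes(2.5 * 1024 * 1024)) // 2.5 MiB
	fmt.Println(humanBytes(512))               // 512 B
}
```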
Note: The application calculates total data size using concurrent HEAD requests before starting the copy operation. This provides accurate progress tracking and ETA calculations. For large file lists (millions of files), this initial calculation shows its own progress bar.
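The concurrent size calculation can be sketched as a worker pool summing per-object sizes. Here headSize is a stand-in for a real S3 HeadObject call returning Content-Length; the structure, not the S3 plumbing, is the point:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// totalSize sums object sizes using nWorkers concurrent lookups.
// headSize stands in for an S3 HeadObject call.
func totalSize(keys []string, nWorkers int, headSize func(string) int64) int64 {
	jobs := make(chan string)
	var total int64
	var wg sync.WaitGroup
	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range jobs {
				atomic.AddInt64(&total, headSize(key))
			}
		}()
	}
	for _, k := range keys {
		jobs <- k
	}
	close(jobs)
	wg.Wait()
	return total
}

func main() {
	sizes := map[string]int64{"a.txt": 100, "b.txt": 250, "c.txt": 50}
	got := totalSize([]string{"a.txt", "b.txt", "c.txt"}, 8,
		func(k string) int64 { return sizes[k] })
	fmt.Println(got) // 400
}
```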
- Configuration Loading: Reads S3 credentials from rclone.conf
- File List Parsing: Loads the list of files to copy from the specified file
- Object Comparison: For each file, compares source and target objects:
- Compares file size first (quick check)
- Compares ETag if both objects exist
- If source ETag contains a hyphen (multipart upload), downloads both objects and compares SHA1 hashes
- Concurrent Copying: Uses a worker pool to copy files concurrently
- Progress Tracking: Shows real-time progress with detailed statistics:
- C:X - Files copied successfully
- S:X - Files skipped (already up-to-date)
- F:X - Files failed to copy
- Current bandwidth in KiB/s, MiB/s, or GiB/s
- Summary Report: Displays final statistics including files copied, skipped, and failed
- Optional Verification: When --verify is used, downloads both source and target objects and compares SHA1 hashes
- Size mismatch: File is copied
- Target doesn't exist: File is copied
- ETag match (non-multipart): File is skipped
- ETag mismatch (non-multipart): File is copied
- Multipart ETag: Downloads both files, computes SHA1, compares hashes
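The decision table above maps directly onto a small predicate. A sketch, where deepCompare stands in for the download-and-SHA1 path taken for multipart ETags (those containing a hyphen):

```go
package main

import (
	"fmt"
	"strings"
)

type object struct {
	Exists bool
	Size   int64
	ETag   string
}

// shouldCopy mirrors the decision rules: missing target or size mismatch
// forces a copy, multipart ETags fall back to a deep SHA1 comparison,
// and plain ETags are compared directly.
func shouldCopy(src, dst object, deepCompare func() bool) bool {
	if !dst.Exists {
		return true // target doesn't exist: copy
	}
	if src.Size != dst.Size {
		return true // size mismatch: copy
	}
	if strings.Contains(src.ETag, "-") {
		return !deepCompare() // multipart ETag: compare SHA1 hashes
	}
	return src.ETag != dst.ETag // plain ETag mismatch: copy
}

func main() {
	src := object{Exists: true, Size: 10, ETag: "abc"}
	fmt.Println(shouldCopy(src, object{}, nil))                                     // true
	fmt.Println(shouldCopy(src, object{Exists: true, Size: 10, ETag: "abc"}, nil)) // false
}
```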
When the --verify flag is used, go-copy performs an additional verification step after copying:
- Post-Copy Verification: After all copy operations complete, verification runs automatically
- Complete Download: Both source and target objects are downloaded in full
- SHA1 Comparison: SHA1 hashes are computed for both objects and compared
- Separate Progress Bar: Verification shows its own progress bar with statistics
- Detailed Results: Verification failures are reported with specific hash mismatches
- Performance Impact: Verification requires downloading all data twice, significantly increasing transfer time
When to Use Verification:
- Critical data transfers where integrity is paramount
- Suspected network issues or unreliable connections
- Compliance requirements that mandate data verification
- One-time migrations where you want absolute certainty
- Concurrent operations: Uses configurable worker pools for all operations (copy, verify, size calculation)
- Intelligent comparison: Avoids unnecessary transfers through ETag/SHA1 comparison
- Memory efficient: Streams file content without storing in memory
- Scalable: Tested with millions of files (roughly 6 GiB of memory for 15 million files)
- Fast size calculation: Uses concurrent HEAD requests with progress tracking
- Optimized for large datasets: Minimum 8 workers for size calculation, graceful error handling
- Comprehensive error reporting with context
- Failed files are tracked and reported in the summary
- Non-zero exit code if any files fail to copy
- Verbose mode provides detailed error information
You may see this warning during operation:
SDK 2025/10/05 17:31:02 WARN Response has no supported checksum. Not validating response payload.
This is a harmless warning from the AWS SDK indicating that the S3 response doesn't include checksum headers for payload validation. This commonly occurs with:
- Older S3-compatible storage systems (MinIO, Ceph, etc.)
- Objects uploaded without checksum metadata
- Custom S3 implementations
This warning is automatically suppressed in go-copy v1.1+: the application filters out these specific AWS SDK warnings while preserving other important log messages. Your data is still transferred correctly, and go-copy performs its own integrity checks using ETag comparison and SHA1 hashing when needed.
- Does not support multipart uploads (files are copied as single objects)
- No resume capability for interrupted transfers
- Limited to S3-compatible storage systems
- Requires existing rclone configuration
Copyright 2025 Data Direct Networks
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.