copy_files should have an option to skip files if they already exist #17
Comments
Could it make sense to use the rsync command here?

import subprocess

# Copy files with rsync so that only new/changed files are transferred
# (this spares us time by avoiding unnecessary overwriting).
# Pass the command as a list so paths with spaces survive (no shell splitting).
command = [
    "rsync", "-av", "--exclude", ".*", "--copy-links",
    f"{src_dataset_path}/", str(dst_dataset_path),
]
process = subprocess.Popen(command, stdout=subprocess.PIPE)
output, error = process.communicate()
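A minimal follow-up sketch (my addition, not from the original comment): with subprocess.run, an interrupted transfer is easier to notice, because check=True raises an exception when rsync exits with a non-zero status; src_dataset_path and dst_dataset_path are assumed to be defined as above.

import subprocess

# subprocess.run(check=True) raises CalledProcessError if rsync exits
# non-zero (e.g. when the connection drops mid-transfer), so a wrapper
# can simply retry the call until it succeeds.
result = subprocess.run(
    ["rsync", "-av", "--exclude", ".*", "--copy-links",
     f"{src_dataset_path}/", str(dst_dataset_path)],
    capture_output=True, text=True, check=True,
)
print(result.stdout)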
But isn't rsync only available on Linux? This might help?
Should be possible to sync files with sysrsync. This would be the code:

import sysrsync

# Sync each file into its destination directory.
for file, dst_dir in zip(dti_df['filepath'], dti_df['dst_dir']):
    dst_dir = dst_dir + '//'
    sysrsync.run(source=file,
                 destination=dst_dir,
                 options=['-a', '--mkpath'],
                 sync_source_contents=False)

Important: sysrsync removes the trailing slash of dst_dir by default, but we need one in order to sync a file into a folder. That is why the expression dst_dir = dst_dir + '//' appends a double slash.
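If the goal is specifically to skip files that already exist at the destination, rsync's --ignore-existing flag does exactly that, and it can be passed through sysrsync's options. A minimal sketch, reusing the dti_df columns from the snippet above:

import sysrsync

# --ignore-existing tells rsync not to transfer files that already exist
# at the destination; -a preserves metadata and --mkpath creates missing
# target directories.
for src_file, dst_dir in zip(dti_df['filepath'], dti_df['dst_dir']):
    sysrsync.run(source=src_file,
                 destination=dst_dir + '//',  # trailing slash: sync into the folder
                 options=['-a', '--mkpath', '--ignore-existing'],
                 sync_source_contents=False)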
Use case: I use copy_files mostly for copying files from a server to my local PC. Since it often happens that the connection gets lost, I have to restart the copying multiple times. In this case, it would be nice if copy_files had an option to check whether a file already exists (+ optionally checking that it's not corrupted) and only copy files that haven't already been copied.
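A minimal sketch of what such an option could look like, with optional checksum verification; the function and parameter names here are illustrative, not the library's actual API:

import hashlib
import shutil
from pathlib import Path

def _same_checksum(a: Path, b: Path, chunk_size: int = 1 << 20) -> bool:
    # Compare two files by MD5 to detect a partially copied / corrupted file.
    def digest(path: Path) -> str:
        h = hashlib.md5()
        with path.open('rb') as f:
            for chunk in iter(lambda: f.read(chunk_size), b''):
                h.update(chunk)
        return h.hexdigest()
    return digest(a) == digest(b)

def copy_file_skip_existing(src: Path, dst: Path, verify: bool = False) -> bool:
    # Copy src to dst. Returns True if the file was copied, False if it
    # was skipped. With verify=True, an existing destination file is
    # re-copied when its checksum differs from the source (e.g. after an
    # interrupted transfer).
    if dst.exists() and (not verify or _same_checksum(src, dst)):
        return False  # already copied, skip
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also preserves modification times
    return True

Restarting a transfer then becomes cheap: already-copied files are skipped immediately, and only missing (or, with verify=True, corrupted) files are copied again.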