When using a zipped file, is there any reason why the zipped file is not removed from the directory once unzipped? #603
Comments
Hi, try "offset.attributes.string": "name+hash".
In my opinion, this is an issue with lastModified. This attribute is used as a key by which the processed files are located, but lastModified is overwritten every time a message is sent to the "tasks.file.status.storage.topic" topic, so the connector thinks it is a new file because the key is different.
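To illustrate the point about keys, here is a minimal sketch of why an identifier built from name+lastModified changes when the same content is extracted again, while one built from name+hash does not. The class, the method names, the key format, and the use of SHA-256 are all assumptions made for this example; they are not taken from the FilePulse source, which may build and hash its offsets differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not FilePulse code.
class OffsetKeySketch {

    // Builds a key from the file name and its last-modified timestamp.
    static String nameAndLastModified(String name, long lastModifiedMillis) {
        return "name=" + name + ";lastModified=" + lastModifiedMillis;
    }

    // Builds a key from the file name and a hex SHA-256 digest of the content.
    // SHA-256 is an assumption chosen for the sketch.
    static String nameAndHash(String name, byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return "name=" + name + ";hash=" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "id,value\n1,a\n".getBytes(StandardCharsets.UTF_8);

        // The same archive extracted twice: identical bytes, different timestamps.
        String first = nameAndLastModified("data.csv", 1_000L);
        String second = nameAndLastModified("data.csv", 2_000L);
        System.out.println(first.equals(second)); // false: looks like a new file

        String firstHash = nameAndHash("data.csv", content);
        String secondHash = nameAndHash("data.csv", content);
        System.out.println(firstHash.equals(secondHash)); // true: recognized as the same file
    }
}
```

With a timestamp in the key, every re-extraction produces a key that has not been seen before, which matches the reprocessing loop described in this issue.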
Thanks for your answers, but I am already aware that removing lastModified from the "offset.attributes.string" config avoids this issue. However, my question is more about the possibility of removing the zipped file once it is unzipped. Is that doable?
ok, try
Thanks for the suggestion, but I just tested it and the same issue occurs: the CSV file is deleted once processed, but since the zipped file is not removed, even with this configuration, the file that was zipped will be unzipped and processed again if lastModified is used with name in "offset.attributes.string". This ends up in an infinite loop as well. I think this is an issue with processing zipped files: the zipped file should be removed as soon as it is unzipped.
If you set the reader to process CSV files, then yes, it will not delete the zip; that is not provided for.
Thanks for letting me know. As far as I can see, issue #225 was closed with no action taken. Maybe it wasn't considered an issue, but in my opinion it is one. Concerning your comment, what do you mean by "if you set the reader to process csv files, then yes, it will not delete the zip, it is not provided for."? Looking at the code, this behavior applies to all supported file types when using LocalFSDirectoryListing and SftpFilesystemListing.
Hello @fhussonnois, within the method listEligibleFiles of the class LocalFSDirectoryListing, would it be an option to add this code instead of this? In this way, the compressed file would be removed once decompressed successfully, unless it is mandatory to keep the compressed file in the directory.
Hi @rouellet99, it should be OK to remove the compressed file once it is uncompressed. But this behavior should be enabled through a connector config property, so that the current behavior can be kept if needed. Maybe we should add a property
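The idea discussed above (delete the archive only after a successful extraction, and only when a flag asks for it) can be sketched roughly as follows. This is not the code from the pull request or from LocalFSDirectoryListing: the class, the `extract` method, and the `deleteAfterExtract` flag are hypothetical names for this example, the flag is not the connector property mentioned in the thread, and gzip is used instead of zip only to keep the sketch short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not FilePulse code.
class DeleteAfterExtractSketch {

    // Decompresses a gzip archive next to itself and returns the extracted path.
    // The archive is deleted only when deleteAfterExtract is true and the copy
    // completed; if decompression throws, the archive is left in place.
    static Path extract(Path archive, boolean deleteAfterExtract) throws IOException {
        String name = archive.getFileName().toString();
        String targetName = name.endsWith(".gz")
                ? name.substring(0, name.length() - 3)
                : name + ".out";
        Path target = archive.resolveSibling(targetName);

        try (InputStream in = new GZIPInputStream(Files.newInputStream(archive))) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }

        if (deleteAfterExtract) {
            Files.delete(archive);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sketch");
        Path archive = dir.resolve("data.csv.gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(archive))) {
            out.write("id,value\n1,a\n".getBytes(StandardCharsets.UTF_8));
        }

        Path extracted = extract(archive, true);
        System.out.println(Files.exists(extracted)); // true
        System.out.println(Files.exists(archive));   // false: archive removed after extraction
    }
}
```

Defaulting such a flag to false would keep the current behavior for existing users, which is the concern raised in the comment above.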
Hello @fhussonnois, for this improvement, should I create a new branch from master?
Hi @rouellet99, yes, you can create a branch from master and push your pull request for review. Thank you very much :)
Hello @fhussonnois, I had to create a fork since I didn't have write permission on the repo. I hope that this is not an issue. The following PR has been created: #629.
fix: (plugin): change the code for the configuration to delete the compressed file after extraction LocalFSDirectoryListing.java:[104,5] (metrics) CyclomaticComplexity: Cyclomatic Complexity is 16 (max allowed is 15). When using zipped file, is there any reason , why the compressed file is not removed from the directory once extracted? streamthoughts#603
fix: (plugin): change the code for the configuration to delete the compressed file after extraction LocalFSDirectoryListing.java:[104,5] (metrics) CyclomaticComplexity: Cyclomatic Complexity is 16 (max allowed is 15). When using zipped file, is there any reason , why the compressed file is not removed from the directory once extracted? #603 695c6f2
FilePulse version 2.13, Docker Desktop 4.22.1
When using a zipped file, is there any reason why the zipped file is not removed from the directory once unzipped?
The reason for my question is that, when using
"offset.attributes.string": "name+lastModified",
"fs.cleanup.policy.class": "io.streamthoughts.kafka.connect.filepulse.fs.clean.LocalMoveCleanupPolicy",
since the zipped file is not removed, once the unzipped file has been processed, the zipped file will be unzipped and processed again because its lastModified value is different from the previous one. This ends up in an infinite loop.
Is there a way to make sure that the zipped file is removed once it is unzipped to avoid this situation?