addData requires that data remains valid until close #152

Open
flomnes opened this issue Jul 17, 2022 · 5 comments

Comments

@flomnes
Contributor

flomnes commented Jul 17, 2022

Short description

When adding data through ZipArchive::addData, the buffer read and file write are delayed until ZipArchive::close is called. This is a problem if the user wants the data to be written immediately so that the underlying memory can be freed.

I couldn't find any workaround.

Example from my project

    // Add data to an existing ZipArchive
    filename.clear() << folder << SEP << "areas.txt";
    study.pZipArchive->addData(filename.c_str(),
                               out.c_str(),
                               out.size());
    // out is freed
    // Do some more stuff not related to out & libzippp
    study.pZipArchive->close(); // <= Valgrind indicated an invalid read, with some garbage characters in the corresponding entry
@flomnes
Contributor Author

flomnes commented Jul 17, 2022

I found a workaround: calling close() followed by open() to force the flush. Am I doing it right? If so, this mechanism could be encapsulated into a new flush() function.

@ctabin
Owner

ctabin commented Jul 25, 2022

Hi @flomnes,

Unfortunately, that is how the underlying libzip library works: the data is written only when the zip is closed, and libzippp reflects this behavior. I don't see any problem with closing/reopening the ZipArchive; however, I'll consider adding a flush method.

@flomnes
Contributor Author

flomnes commented Jul 26, 2022

@ctabin The problem is that, in order to keep a valid archive when adding data to an existing one, libzip creates a copy, writes to that copy, and then replaces the original. It operates that way to ensure that, in case of an error, the original archive is not corrupted.

If the original is 5 GB and you want to add a few files, this becomes very expensive in terms of disk I/O. I haven't found a way to disable it. On the other hand, minizip-ng seems to write files immediately to the existing archive.

@Xiangze-Li
Contributor

@flomnes I ran into the same problem in my work, where the total size may reach 10 GB, making the fake flush unacceptable. A workaround I adopted is to write the data to a temporary file on disk, then call addFile with the filename string parameter.

@dov

dov commented Sep 17, 2024

I also got burnt by this. My suggestion is to add a method that takes the data as a unique_ptr<> or shared_ptr<>, letting libzippp take over ownership of the data.

Something like:

using BinString = std::basic_string<uint8_t>;
bool addData(const std::string& entryName,
             std::shared_ptr<BinString> data);
