New Compression Streams w3c draft #143

Open
Rycochet opened this issue Feb 17, 2020 · 9 comments

@Rycochet
Collaborator

https://wicg.github.io/compression/

Compression Streams

Draft Community Group Report, 16 February 2020

The APIs specified in this specification are used to compress and decompress streams of data. They support "deflate" and "gzip" as compression algorithms. They are widely used in web developers.

Looks like we might be getting native support for some things - how can we leverage and / or make use of this (question for later as it's just a draft) :-)

@pieroxy
Owner

pieroxy commented Aug 13, 2021

It looks more like a replacement for lz-string if it gets widespread adoption.

@sukima

sukima commented Aug 8, 2023

Looks like we now have widespread support. Is there an advantage to lz-string over the built-in gzip + btoa now?

@Rycochet
Collaborator Author

Rycochet commented Aug 8, 2023

Compatibility and choice - at some point this may change to being a wrapper for the new APIs - but that would still give a known name for compatibility with other platforms :-)

@ZYinMD

ZYinMD commented Sep 22, 2023

So has anyone tested it (the new CompressionStream API)? It can use gzip or deflate to compress a string into an ArrayBuffer, but how should I then convert the ArrayBuffer to the smallest string possible?
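
For reference, the compression step described here can be sketched as follows. This is a minimal sketch, assuming a runtime that exposes `CompressionStream`, `DecompressionStream`, `Blob` and `Response` as globals; the helper names `gzipToBuffer` and `gunzipToString` are made up for illustration and are not part of lz-string or the spec.

```typescript
// Hypothetical helper: gzip a string into an ArrayBuffer.
async function gzipToBuffer(text: string): Promise<ArrayBuffer> {
  // A Blob built from a string holds its UTF-8 encoding.
  const source = new Blob([text]).stream();
  const compressed = source.pipeThrough(new CompressionStream("gzip"));
  // Response is used here only as a convenient way to drain the stream.
  return new Response(compressed).arrayBuffer();
}

// Hypothetical helper: reverse of gzipToBuffer.
async function gunzipToString(buffer: ArrayBuffer): Promise<string> {
  const source = new Blob([buffer]).stream();
  const decompressed = source.pipeThrough(new DecompressionStream("gzip"));
  // text() decodes the decompressed bytes as UTF-8.
  return new Response(decompressed).text();
}
```

Both helpers are asynchronous, unlike lz-string's synchronous calls, which matters if a drop-in wrapper is ever attempted.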

@karnthis
Contributor

Direct translation would be difficult due to the sanitization process used for things like TextDecoder, so a custom function would be needed. This seems like something that would be interesting to explore once the final v2 is ready.

@ZYinMD

ZYinMD commented Sep 27, 2023

TextDecoder wouldn't work easily, because in UTF-8 the first 128 code points take 1 byte and code points 128 to 255 take 2 bytes, so a lone byte of 128 or above is not a valid sequence and is decoded as the replacement character U+FFFD. UTF-8 is also the only choice if you want to use TextEncoder to reverse. In my test, it can't reliably convert from bytes to string and then back.

example code:

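function convertToStringThenBack(input: Uint8Array) {
  // Decode the bytes as UTF-8 (the TextDecoder default), then encode back.
  // Pass the view itself, not input.buffer, so offsets and lengths are respected.
  const string = new TextDecoder().decode(input);
  const back = new TextEncoder().encode(string);
  // The round trip is lossless only if every byte survives unchanged.
  const isEqual =
    back.length === input.length &&
    back.every((value, index) => value === input[index]);
  if (isEqual) console.log("good");
  else console.log("bad");
}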

convertToStringThenBack(new Uint8Array([1, 10, 100, 127])); // good
convertToStringThenBack(new Uint8Array([1, 10, 100, 128])); // bad
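
One way around the failing case above is to skip UTF-8 entirely and map each byte to the UTF-16 code unit with the same value. The following is only a sketch of that idea, with hypothetical helper names (`bytesToBinaryString`, `binaryStringToBytes`); it is not the function mentioned elsewhere in this thread, and it assumes the string is never passed through a UTF-8 encoder on the way.

```typescript
// Hypothetical helper: one character per byte, code units 0..255 only.
function bytesToBinaryString(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Hypothetical helper: reverse mapping, assumes every code unit is <= 0xFF.
function binaryStringToBytes(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    bytes[i] = text.charCodeAt(i);
  }
  return bytes;
}

// Why the TextDecoder route is lossy: a lone 0x80 byte is invalid UTF-8,
// so it decodes to U+FFFD, which re-encodes as three different bytes.
const lossy = new TextEncoder().encode(
  new TextDecoder().decode(new Uint8Array([128]))
);
console.log(Array.from(lossy)); // [239, 191, 189]
```

The resulting string has one character per byte, but how many bytes each character costs still depends on how the string is eventually stored.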

@karnthis
Contributor

Yeah, I have run into the same. I will dig around for the function I use to maintain a reliable transition.

@ZYinMD

ZYinMD commented Sep 27, 2023

One thing I've found is that base64 is actually quite good if you store text in the filesystem, because both the browser and Node will output UTF-8 when you ask them to write a text file, and since ASCII chars in UTF-8 are only 1 byte per char, it's quite efficient (75% efficient to be precise: a 1MB base64 string can store 0.75MB of binary info).

Performant solutions already exist for conversion between array buffer and base64.

I have implemented string <-> gzip <-> base64 in my project and it's very fast.
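
The bytes <-> base64 leg of such a pipeline can be sketched with the built-in `btoa` and `atob`. This is an illustrative sketch only, not the implementation referred to above; the helper names are hypothetical, and the plain loops favour clarity over speed.

```typescript
// Hypothetical helper: encode raw bytes (e.g. gzip output) as base64 text.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    // btoa expects a string whose code units are all in 0..255.
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical helper: decode base64 text back into raw bytes.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// The first bytes of a gzip stream (magic number 1f 8b, then method 08).
console.log(bytesToBase64(new Uint8Array([0x1f, 0x8b, 0x08, 0x00]))); // "H4sIAA=="
```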

However, if the string is to be stored in localStorage or a database, then base64 may not be efficient, depending on whether it's stored as UTF-8 or UTF-16, which I don't know.
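
To put numbers on the two cases, here is a small sketch of the arithmetic. It makes no claim about how any particular storage backend encodes strings; `bytesPerChar` is an assumed parameter (1 for UTF-8 storage of ASCII text, 2 for UTF-16), and both function names are hypothetical.

```typescript
// Length of the padded base64 text for `byteCount` bytes of binary data:
// every 3 input bytes become 4 output characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Fraction of stored bytes that is actual payload, given how many bytes
// the storage layer spends per base64 character (assumed, not measured).
function payloadEfficiency(byteCount: number, bytesPerChar: number): number {
  return byteCount / (base64Length(byteCount) * bytesPerChar);
}

console.log(payloadEfficiency(3000, 1)); // 0.75  (stored as UTF-8)
console.log(payloadEfficiency(3000, 2)); // 0.375 (stored as UTF-16)
```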

@karnthis
Contributor

It is important to note that this library was originally for browser uses and officially supports those. Node is a whole different beast, and from my research does not support utf-16 strings in any way. I believe there is a nodejs port for v1.4 out there already, and would prefer to keep them as separate implementations due to excessive complexity in trying to support both.
