Add compression to the metadata code snapshot #11470

Merged (8 commits) on Nov 5, 2024

4e6 (Contributor) commented Nov 1, 2024

Pull Request Description

close #11420

Changelog:

  • update: add zlib compression to the snapshot metadata field
  • add: implement Node.js zlib for the polyglot ydoc-server
  • add: implement Node.js Buffer for the polyglot ydoc-server

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • All code follows the Scala, Java, TypeScript, and Rust style guides. If you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

4e6 added the "CI: No changelog needed" label Nov 1, 2024
4e6 self-assigned this Nov 1, 2024
'node:zlib': {
  varName: 'zlib',
  type: 'cjs',
}
Contributor Author

Make zlib a global symbol in the bundle instead of emitting require('node:zlib').

@@ -505,6 +513,7 @@ class ModulePersistence extends ObservableV2<{ removed: () => void }> {
    const newSnapshot = newCode && {
      snapshot: ModulePersistence.encodeCodeSnapshot(newCode),
    }
    if (newMetadata) newMetadata.snapshot = this.syncedMeta.ide.snapshot
Contributor Author

#11435 changes the newMetadata argument so that it holds the whole ide object:

-    newMetadata: fileFormat.IdeMetadata['node'] | undefined,
+    newMetadata: fileFormat.IdeMetadata | undefined,

That change (https://github.com/enso-org/enso/pull/11435/files#diff-f621895c66ac2e5e220af1fb6ce9cd3694e5f85897f471300ad63a7c9f5bee9aL511-L513) builds the new metadata object without the snapshot field when there are no code changes. In other words, the snapshot field disappears from the file on metadata-only changes (moving a node, for example).
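
The effect can be sketched in isolation. This is a minimal TypeScript sketch, not the code from the PR: the simplified IdeMetadata shape and the mergeMetadata helper are hypothetical and only mirror the names used in this thread.

```typescript
// Hypothetical, simplified shape of the `ide` metadata object.
interface IdeMetadata {
  node: Record<string, unknown>
  snapshot?: string | undefined
}

// Build the metadata to persist. When only the metadata changed (for example
// a node was moved), `newSnapshot` is undefined and the previously synced
// snapshot is carried over; otherwise the field would disappear from the file.
function mergeMetadata(
  synced: IdeMetadata,
  newMetadata: IdeMetadata | undefined,
  newSnapshot: string | undefined,
): IdeMetadata {
  const result: IdeMetadata = { ...(newMetadata ?? synced) }
  result.snapshot = newSnapshot ?? synced.snapshot
  return result
}

// A metadata-only update: the node position changes, the code does not.
const synced: IdeMetadata = { node: { a: { x: 0 } }, snapshot: 'eJw=' }
const moved = mergeMetadata(synced, { node: { a: { x: 10 } } }, undefined)
console.log(moved.snapshot) // eJw=
```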

widget: z.optional(z.record(z.string().uuid(), z.record(z.string(), z.unknown()))),
snapshot: z.string().optional(),
Contributor Author

Make snapshot the last field. The ydoc diff algorithm places the snapshot at the end of the metadata, so declaring it last in the schema keeps the field order stable and prevents unnecessary edits.

Contributor

Oh, that's subtle--it could use a comment.
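
A small sketch of why the field order can matter, assuming the metadata ends up serialized as JSON text and that the serialized object follows the order in which the fields were declared (both are assumptions here, not statements about the ydoc-server code). JavaScript objects keep string keys in insertion order, and JSON.stringify follows that order:

```typescript
// Two objects with the same content but a different key order.
const snapshotFirst = { snapshot: 'eJw=', node: {} }
const snapshotLast = { node: {}, snapshot: 'eJw=' }

// JSON.stringify emits string keys in insertion order, so the serialized
// text differs even though the data is the same.
const a = JSON.stringify(snapshotFirst)
const b = JSON.stringify(snapshotLast)

console.log(a) // {"snapshot":"eJw=","node":{}}
console.log(b) // {"node":{},"snapshot":"eJw="}
console.log(a === b) // false
```

A text diff between the two outputs reports an edit although no value changed, which is the kind of unnecessary edit the comment refers to.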

Comment on lines 489 to 502
   private static encodeCodeSnapshot(code: string): string | undefined {
     try {
       return zlib.deflateSync(Buffer.from(code, 'utf8')).toString('base64')
     } catch {
       return
     }
   }

-  private static decodeCodeSnapshot(snapshot: string): string {
-    return Base64.decode(snapshot)
+  private static decodeCodeSnapshot(snapshot: string): string | undefined {
+    try {
+      return zlib.inflateSync(Buffer.from(snapshot, 'base64')).toString('utf8')
+    } catch {
+      return
+    }
Contributor

I think any errors here are worth logging.
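
One way the suggestion could look, as a standalone sketch rather than the merged code: the two functions below follow the snippet under review but are free functions, and console.warn stands in for whatever logger the project actually uses (an assumption).

```typescript
import * as zlib from 'node:zlib'

// Compress the code with zlib (deflate) and encode the bytes as Base64.
function encodeCodeSnapshot(code: string): string | undefined {
  try {
    return zlib.deflateSync(Buffer.from(code, 'utf8')).toString('base64')
  } catch (error) {
    // Assumption: console.warn is a placeholder for the real logger.
    console.warn('Failed to encode the code snapshot:', error)
    return undefined
  }
}

// Reverse of encodeCodeSnapshot: Base64-decode, then inflate.
function decodeCodeSnapshot(snapshot: string): string | undefined {
  try {
    return zlib.inflateSync(Buffer.from(snapshot, 'base64')).toString('utf8')
  } catch (error) {
    console.warn('Failed to decode the code snapshot:', error)
    return undefined
  }
}

const code = 'main =\n    42\n'
const encoded = encodeCodeSnapshot(code)
const decoded = encoded === undefined ? undefined : decodeCodeSnapshot(encoded)
console.log(decoded === code) // true

// Input that is not zlib data is reported and yields undefined.
const broken = decodeCodeSnapshot('not a snapshot')
console.log(broken) // undefined
```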

JaroslavTulach (Member) left a comment

I'd like to use this PR as an opportunity to restate that #11390 seems to miss two essential changes:

The change of the snapshot field content format done in this PR shows how essential versioning is. It should become part of the format asap, imo.
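
To illustrate the point: without a version marker, a reader that has to accept both the previous plain Base64 snapshots and the new zlib-compressed ones can only guess the format. The sketch below shows such a guess. The decodeAnySnapshot helper is hypothetical and is not the approach taken in this PR or in #11390.

```typescript
import * as zlib from 'node:zlib'

// Hypothetical helper: try the new format first, fall back to the old one.
function decodeAnySnapshot(snapshot: string): string {
  const bytes = Buffer.from(snapshot, 'base64')
  try {
    // New format: zlib-compressed UTF-8 text.
    return zlib.inflateSync(bytes).toString('utf8')
  } catch {
    // Old format: plain Base64-encoded UTF-8 text. This is only a guess,
    // nothing in the data says which format was actually used.
    return bytes.toString('utf8')
  }
}

const source = 'main = 42'
const oldFormat = Buffer.from(source, 'utf8').toString('base64')
const newFormat = zlib.deflateSync(Buffer.from(source, 'utf8')).toString('base64')

console.log(decodeAnySnapshot(oldFormat) === source) // true
console.log(decodeAnySnapshot(newFormat) === source) // true
```

The fallback relies on inflateSync rejecting the old data, which is a heuristic rather than a guarantee. An explicit version field would remove the guesswork.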


context.getBindings("js").putMember("TEXT", TEXT);

var result = CompletableFuture.supplyAsync(() -> context.eval("js", code), executor).get();
Member

Rather than playing with global variables, you should define a JavaScript function:

var code = "(function(text) { return Buffer.from(text).toString(); })";

and then invoke it:

var fn = context.eval("js", code);
return fn.execute(TEXT);

Contributor Author

GraalJS sets up its tests this way: https://github.com/search?q=repo%3Aoracle%2Fgraaljs%20putMember&type=code. I think it's cleaner than writing a function.

Charset charset;
try {
  charset = Charset.forName(encoding);
} catch (IllegalArgumentException ignored) {
Contributor

Sounds like we should log this at least.

Contributor Author

Made it compatible with Node.js by returning an error.
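
For reference, a short sketch of the Node.js behavior being matched, as far as the built-in Buffer is concerned: an unknown encoding name is rejected with a TypeError instead of being silently replaced by a default. The encoding name below is made up for the example.

```typescript
// A made-up encoding name. The cast is needed because the type definitions
// only accept the known encoding names.
const name: string = 'no-such-encoding'
const unknownEncoding = name as BufferEncoding

// Node.js validates the encoding name and throws for names it does not know.
let failure: unknown
try {
  Buffer.from('text', unknownEncoding)
} catch (error) {
  failure = error
}

console.log(failure instanceof TypeError) // true
console.log(Buffer.isEncoding(name)) // false
console.log(Buffer.isEncoding('utf8')) // true
```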

Charset charset;
try {
  charset = Charset.forName(encoding);
} catch (IllegalArgumentException ignored) {
Contributor

Same as above


return switch (command) {
  case BUFFER_FROM -> {
    final var text = arguments[1].asString();
Contributor

It looks like we can easily get an IndexOutOfBounds exception here, since the size of the arguments array is not checked?

Contributor Author

This method is called from zlib.js, so an IndexOutOfBounds exception would indicate a bug in the implementation. These paths are also covered by tests.

4e6 (Contributor Author) commented Nov 4, 2024

Metadata versioning will be implemented in #11479.

4e6 added the "CI: Clean build required" label Nov 4, 2024
4e6 added the "CI: Ready to merge" label Nov 5, 2024
mergify bot merged commit 47943a2 into develop on Nov 5, 2024
41 of 42 checks passed
mergify bot deleted the wip/db/11420-add-compression-to-code-snapshot branch on November 5, 2024 11:57
Labels
CI: Clean build required (CI runners will be cleaned before and after this PR is built.)
CI: No changelog needed (Do not require a changelog entry for this PR.)
CI: Ready to merge (This PR is eligible for automatic merge)
4 participants