Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes and features JSON.stringify on Java objects #875

Merged
merged 8 commits into from
Apr 30, 2021

Conversation

tonygermano
Copy link
Contributor

@tonygermano tonygermano commented Apr 28, 2021

  • Nested Java containers (Maps, Collections, Object[]) now stringify properly with working circular reference checks.
  • Added property to Context object to set a converter for Java objects that are not a container. Defaults to returning the toString value. Can still be preempted by the presence of a toJSON method or replacer function.
  • Several basic implementations of converters are provided:
    • STRING - default converter that returns object.toString()
    • EMPTY_OBJECT - this would be similar to a javascript object that contains no enumerable properties
    • UNDEFINED - results in the key with this object as a value being excluded from the final JSON string
    • THROW_TYPE_ERROR - throws a TypeError specifying the type that could not be converted. Closest to current Rhino behavior.
    • BEAN - converts to a javascript object with the class type and a child object containing all of the readable properties, or null if no readable properties
  • Added static Context#javaToJS(Object value, Scriptable scope, Context cx) because the existing method did not allow passing a Context object, but always tried to get the current Context instead.

Closes #460
Closes #450

tonygermano and others added 5 commits April 24, 2021 15:53
Incorporated some changes from PR mozilla#824 for checking circular references.

Changed default behavior of java objects that are not treated as
containers to return the toString value by default rather than throwing
a TypeError.

Co-authored-by: Roland Praml <roland.praml@foconis.de>
...and basic implementations. NativeJSON will access the converter when
a NativeJavaObject does not represent any js types.
@tonygermano
Copy link
Contributor Author

This replaces PR #868 and PR #824
Closes #460

I have no idea why spotless failed this time. I already applied changes in one of the commits, and it's not flagging any issues on my side when I run spotlessCheck ☹️

@gbrail @rPraml Feedback would be appreciated on the javaToJSONConverter implementation. After rereading both of your previous comments I decided that a basic toString probably made the most sense as a default action other than throwing the TypeError (but throwing a TypeError is still an option.)

@rPraml I have not yet tried it, but I think this would let you plug in a method that calls Jackson for the converter as you mentioned would be your preferred solution in #824 (comment). See the BEAN implementation and test case to decide if I'm on the right track.

I didn't write a test for this, but I did try it in the shell, and it worked to pass a javascript arrow function to the Context as a javaToJSONConverter, as in:

cx.setJavaToJSONConverter(o => 'I found a ' + o.getClass().getName());

Copy link
Contributor

@rPraml rPraml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonygermano All in all this looks good to me. Maybe you think about the idea to support also other array types than Object[]

return JavaToJSONConverters.STRING;
}
return javaToJSONConverter;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the the only code change in this class. Is this correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second commit is the spotless reformat, and the third commit 7d0dd08 contains the actual changes to Context.

src/org/mozilla/javascript/NativeJSON.java Outdated Show resolved Hide resolved
src/org/mozilla/javascript/NativeJSON.java Outdated Show resolved Hide resolved
src/org/mozilla/javascript/NativeJSON.java Outdated Show resolved Hide resolved
Arrays.stream(elements)
.map(o -> Context.javaToJS(o, state.scope, state.cx))
.toArray();
value = state.cx.newArray(state.scope, elements);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what happens here: NativeJavaList will perform a copy of the internal elements...
I see, here is much data copied around, but I think this should not be a so big performance impact

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on removing the streams. I think this was the last one. They were easier for me to get things working as this is how I would do it in javascript.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stream is now gone, and I refactored that area of the code. Please review.

tonygermano added a commit to tonygermano/rhino that referenced this pull request Apr 29, 2021
Incorporated suggestions from mozilla#875

Co-authored-by: Roland Praml <roland.praml@foconis.de>
Incorporated suggestions from mozilla#875

Co-authored-by: Roland Praml <roland.praml@foconis.de>
@tonygermano tonygermano force-pushed the json-stringify-java-nested2 branch from ca3d61e to acbd452 Compare April 29, 2021 00:33
@tonygermano
Copy link
Contributor Author

I just noticed a regression bug was introduced in #860 which is also resolved in this PR.

We don't currently run the test, but I ran it manually, found here:
https://github.com/tc39/test262/blob/70bc32edab22b44db9d671ce505db8842ae200b6/test/built-ins/JSON/stringify/replacer-array-order.js

The JSON.stringify tests in test262 have been completely rewritten since the commit we are currently using. How can those newer tests be brought in?

@rbri
Copy link
Collaborator

rbri commented Apr 29, 2021

regarding the test updates see also #817

Will try to work on this next month

@@ -365,10 +355,44 @@ private static String join(Collection<Object> objs, String delimiter) {
}

private static String jo(Scriptable value, StringifyState state) {
if (state.stack.search(value) != -1) {
throw ScriptRuntime.typeErrorById("msg.cyclic.value");
Object trackValue = value, unwrapped = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the optimization of passing a StringBuilder instead of returning as String. This should reduce expensive string concatenations in the for loop at the end of the method. (can also be done in a separate PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could potentially be another PR. I don't think it's related to this one. I think string concatenation is not actually that expensive in java 9+, so maybe it's better if we can just wait it out.

This says when classes are compiled for Java 9+ to use + instead of StringBuilder where possible (and even gives an example of doing it in a loop.)
https://dzone.com/articles/jdk-9jep-280-string-concatenations-will-never-be-t

Another article about it:
https://medium.com/javarevisited/java-compiler-optimization-for-string-concatenation-7f5237e5e6ed

I don't 100% understand how it works. It would maybe be interesting if someone wanted to profile the differences in the current implementation when compiled for Java 11 vs Java 8, as that may show there is no need to introduce a StringBuilder in the long term, while keeping the code easier to follow.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that we don't need more to do, but we could apply the same technique to bytecode generated in Rhino, by replacing string concatenations with that same target: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/invoke/StringConcatFactory.html

Today Rhino uses ConsString internally and it trips up all kinds of people who write embedding code that expects to use String. Replacing ConsString with some clever code generation would be awesome.

@@ -500,6 +550,28 @@ private static String quote(String string) {
return product.toString();
}

private static Object javaToJSON(Object value, StringifyState state) {
value = state.cx.getJavaToJSONConverter().apply(value);
value = Context.javaToJS(value, state.scope, state.cx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think about the jackson use case.

What do you think is the best method to use jackson?

  1. There is a ObjectMapper.convertValue(bean, Map.class) https://fasterxml.github.io/jackson-databind/javadoc/2.8/com/fasterxml/jackson/databind/ObjectMapper.html#convertValue(java.lang.Object,%20java.lang.Class) which seems to be useful for this task but may produce a lot of Map/List instances.

  2. Using writeValueAsString() will be more efficient, but the returned value will be currently treated as normal string and gets quoted again, so should we add the ability to return a raw json?

This could be done by adding a simple transport-class

public class RawJson {
   private String json;
   public RawJson(String json) { this.json = json }
}

the converter then would look like UnaryOperator<Object> JACKSON v -> new RawJson(objectMapper.writeValueAsString(v)) and here we add an "instanceOf" check, if it is already a raw json.

Pro: this will be probably more performant and do not need to instanttiagea huge amount of List/Maps during conversions

Con: you loose the ability to use replacer or indention options of JSON.stringify.

On the other hand you may call objectMapper.writeValueAsString also from the javascript side (but jackson may not be able to deserialize JavaScript objects)

Copy link
Contributor Author

@tonygermano tonygermano Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectMapper.convertValue(bean, Map.class) is what I had in mind.

I had thought about passing the state along with the object to the converter, but then decided against it to keep it simple. By having the return value stringified, that allowed for:

  • js implementations to return js objects
  • java implementations to return Maps/Collections/Arrays for objects containing other objects
  • stringify handles recursion instead of the converter
  • Scriptables contained in java objects to be stringified (or not for functions, symbols, etc..) as the correct js type
  • returning Undefined.instance to explicitly drop the property containing this object from the final output
  • respecting the current stringify state (replacer and indent)
  • strings to be automatically quoted and escaped

Would it be worth seeing if the current behavior is a performance issue before introducing RawJson in another PR?

Only introducing RawJson I think would be tolerable with its shortcomings since it would not be used by the default implementation, and would only add an additional type check. An implementation could also output something that isn't actually valid JSON with this option.

Getting it to play nice with the replacer and indent would be more difficult.

  • The StringifyState would need to be passed to the converter. It should probably be a clone of the actual state so that the converter doesn't corrupt it. This would also lead to the creation of a lot of objects, even though the provided implementations would not use them.
  • StringifyState would need to be made public and shouldn't be an inner class.
  • The converter type would need to be changed from a UnaryOperator to a BiFunction to support passing the additional argument.
  • The converter itself would need to implement handling of the replacer and pretty printing.

Copy link
Contributor

@rPraml rPraml Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringifyState could implement an interface which exposes only getters. This should avoid corruption (if no reflection is used) and no new object has to be created.

Maybe you can do some small performance tests if it is worth

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringifyState has properties that are not immutable, though. I think I'd rather not go down that path, as it complicates the implementation to get a potential performance boost for an edge case, which isn't even possible right now (stringifying a high volume of java objects using a custom implementation.)

Is it possible to attack this from the other end and teach Jackson how to handle Scriptables?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to implement immutable getters, e.g. StringifyState.getPropertyList() could return Collections.unmodifiableList(...).

But all in all, I agree, that this edge case would add lot of complexity.
Currently, I see the following options:

  • Use JSON.stringify with Jackson's convertValue. This will work in all cases and will support all stringify-options (like replacer, indention options...) but adds some overhead.
  • Use Jackson directly. This gives you the best performance. You may have to write special JsonSerializer for some JS-objects. (NativeObject/NativeArray should already work out of the box, because they implement Map/List)
  • You can use RawJson. (if implemented)

Would it be worth seeing if the current behavior is a performance issue before introducing RawJson in another PR?

Only introducing RawJson I think would be tolerable with its shortcomings since it would not be used by the default implementation, and would only add an additional type check. An implementation could also output something that isn't actually valid JSON with this option.

If you have time, you can do some measurements, what the impact of convertValue vs writeValueAsString is.
If we really see, that it might be a performance issue, we can implement RawJson (without passing the state) and document the shortcomes. But this can be done also in a later PR.

From my point of view everything is discussed and considered. I would accept this PR now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Rhino is used in so many places, and version management has become so complex that depending on Jacskon in the core of Rhino would be hard for a lot of people to adopt.

However, if we created another package from this project, like, say "rhino-jackson-adapter", and by putting that JAR in the classpath it causes Rhino to use Jackson instead for serialization, then people who want better performance and can change their build rules can just use that. I have always been a fan of Java's "ServiceLoader" pattern for things like this but there might be better ways today.

The conversation about string concatenation is also fascinating -- I have been so indoctrinated to never concatenate strings in loops using + because of all the garbage that gets created!

@gbrail
Copy link
Collaborator

gbrail commented Apr 30, 2021

I think this is very useful and cleverly implemented. Thanks!

@gbrail gbrail merged commit 0c05d30 into mozilla:master Apr 30, 2021
gbrail pushed a commit that referenced this pull request Apr 30, 2021
Incorporated suggestions from #875

Co-authored-by: Roland Praml <roland.praml@foconis.de>
@tonygermano tonygermano deleted the json-stringify-java-nested2 branch April 30, 2021 18:02
@p-bakker p-bakker added bug Issues considered a bug feature Issues considered a new feature labels Oct 13, 2021
@p-bakker p-bakker added this to the Release 1.7.14 milestone Oct 13, 2021
@p-bakker p-bakker added the Java Interop Issues related to the interaction between Java and JavaScript label Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issues considered a bug feature Issues considered a new feature Java Interop Issues related to the interaction between Java and JavaScript
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Circular JSON Endless loop and crush when using Stringify on NativeObject
5 participants