Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qute Character Escapes for Json Content #43369

Open
ogomezdi opened this issue Sep 18, 2024 · 11 comments
Open

Qute Character Escapes for Json Content #43369

ogomezdi opened this issue Sep 18, 2024 · 11 comments
Labels
area/qute The template engine kind/enhancement New feature or request

Comments

@ogomezdi
Copy link

ogomezdi commented Sep 18, 2024

Description

It'll nice to have the same behavior for Json template for character escapes like the one we have for XML content explained here https://quarkus.io/guides/qute-reference#character-escapes allowing users to escape content we want to include, for example, as string in an attribute value, for those values that contain quotes, new line, ... characters.
Despite there is a workaround by creating a @TemplateExtension or @templatedata to access encode static functions available in Apache Commons Texts and others, but I think this must be a core functionality.

Implementation ideas

Use the same approach for escaping XML content, may be including in quarkus.qute.escape-content-types property "application/json" joined to the characters escape policy.
Other easy approach will be include in "Built-in Template Extensions" one for Strings, allowing encoding and unencoding options.

@ogomezdi ogomezdi added the kind/enhancement New feature or request label Sep 18, 2024
@quarkus-bot quarkus-bot bot added the area/qute The template engine label Sep 18, 2024
@quarkus-bot
Copy link

quarkus-bot bot commented Sep 18, 2024

/cc @mkouba (qute)

@mkouba
Copy link
Contributor

mkouba commented Sep 18, 2024

Escaping is implemented as a io.quarkus.qute.ResultMapper that is applied to all output expressions in a template before the evaluated string value is written to the output. See also io.quarkus.qute.HtmlEscaper. You can register a custom ResultMapper with the EngineBuilder#addResultMapper(ResultMapper) method.

However, if you need more context (string in an attribute value) you can't apply the escaping globally...

Despite there is a workaround by creating a @TemplateExtension or @templatedata to access encode static functions available in Apache Commons Texts and others, but I think this must be a core functionality.

I'm not quite sure this should be a core functionality.

Use the same approach for escaping XML content, may be including in quarkus.qute.escape-content-types property "application/json" joined to the characters escape policy.

If you want to apply the default escaping to json files then just add application/json to the list of the media types in the quarkus.qute.escape-content-types config property 🤷 . Keep in mind that Qute has to be able to detect the content-type of a template file (which is not always the case; depends on use case).

@ogomezdi
Copy link
Author

Let me try to explain better.
Nowadays Character Escapes only apply to content included in a HTML or XML template, but if we use a template to generate a Json output, and we want to assign string values to a key in the Json and the values include quotes, newlines, ... the resulting output will not create a valid json.
For example you have a template like this:
{ "key": "{variable}" }
And you bind variable which has a doble quote inside, imagine hello "world". You'll have this output:
{ "key": "hello "world"" }
Which is not a valid json, that must be:
{ "key": "hello \"world\"" }
I hope the explanation helps understanding the issue.

@mkouba
Copy link
Contributor

mkouba commented Sep 19, 2024

Yes, I do understand your example ;-)

My point is that you probably can't apply a global escaping to a JSON template because you need some "context" in order to apply the correct escaping. For example, in a template like { "key": "{variable}", "key2": {anotherVar} } where variable=hello "world" and anotherVar="hello world" you can't simply escape all double quotes in all output expressions, right?

So a template extension that would escape quotes is IMO the way to go: { "key": "{variable.asJsonValue}", "key2": {anotherVar} }:

@TemplateExtension
class MyExtensions {

    static String asJsonValue(String val) { 
       // escape quotes etc.
    }
}

@ogomezdi
Copy link
Author

This is the workaround I've mentioned in the issue description. But, What I'm specting is to have something similar to the approach applied for XML, I mean https://quarkus.io/guides/qute-reference#character-escapes where the variable content is automatically escaped in case we're in a XML/HTML template, so if you use it in this, portion of template <tag>{variable}</tag>, the qute engine automatically apply de escape XML rules to variable without needing to execute any method or extension explicitly. So for me translating this behavior when users try to obtain a JSON output from qute, also including the raw option if possible :-), is the aim of this request.

@mkouba
Copy link
Contributor

mkouba commented Sep 19, 2024

I think that I understand your use case but the question is: should we apply escaping for the whole JSON template? If so, what would be the rules? For XML/HTML it's simple. For JSON I'm not so sure...

@ogomezdi
Copy link
Author

I'm not sure to understand what do you mean when talking about the "whole JSON template", if you mean in any instance of variables to render in a template, my answer will be yes, as users will be able to apply .raw in places where they need to avoid this behavior.
Regarding the rules to apply may be we can rely on other implementations like https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/StringEscapeUtils.html#escapeJson(java.lang.String)

@mkouba
Copy link
Contributor

mkouba commented Sep 20, 2024

I'm not sure to understand what do you mean when talking about the "whole JSON template", if you mean in any instance of variables to render in a template, my answer will be yes, as users will be able to apply .raw in places where they need to avoid this behavior. Regarding the rules to apply may be we can rely on other implementations like https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/StringEscapeUtils.html#escapeJson(java.lang.String)

In that case, somethig like this should work:

public class JsonEscaper implements ResultMapper {

    @Override
    public boolean appliesTo(Origin origin, Object result) {
        if (result instanceof RawString) {
            return false;
        }
        Optional<Variant> variant = origin.getVariant();
        if (variant.isPresent()) {
            return variant.getContentType().startsWith("application/json");
        }
        return false;
    }

    @Override
    public String map(Object result, Expression expression) {
        return StringEscapeUtils.html.escapeJson(result.toString());
    }

and then register the mapper:

class MyBean {

    void configureEngine(@Observes EngineBuilder builder) {
        builder.addResultMapper(new JsonEscaper());
    }
}

I don't think we want to add commons-text as a dependency. However, we could implement a simple implementation similar to io.quarkus.qute.HtmlEscaper.

Feel free to send a pull request ;-).

@ogomezdi
Copy link
Author

ogomezdi commented Sep 20, 2024

I've found a easier implementation for encoding in other library that can fit in the way you've mentioned, may be something like this:

public static String escapeJson(String value) {
    if (value == null)
      return "";

    StringBuilder b = new StringBuilder();
    for (char c : value.toCharArray()) {
      if (c == '\r')
        b.append("\\r");
      else if (c == '\n')
        b.append("\\n");
      else if (c == '\t')
        b.append("\\t");
      else if (c == '"')
        b.append("\\\"");
      else if (c == '\\')
        b.append("\\\\");
      else if (c == ' ')
        b.append(" ");
      else if (isWhitespace(c)) {
        b.append("\\u"+JsonEscaper.padLeft(Integer.toHexString(c), '0', 4));
      } else if (((int) c) < 32)
        b.append("\\u" + JsonEscaper.padLeft(Integer.toHexString(c), '0', 4));
      else
        b.append(c);
    }
    return b.toString();
  }
  public static boolean isWhitespace(int ch) {
    return JsonEscaper.existsInList(ch, '\u0009', '\n', '\u000B','\u000C','\r','\u0020','\u0085','\u00A0',
        '\u1680','\u2000','\u2001','\u2002','\u2003','\u2004','\u2005','\u2006','\u2007','\u2008','\u2009','\u200A',
        '\u2028', '\u2029', '\u202F', '\u205F', '\u3000');
  }
  public static boolean existsInList(int value, int... array) {
    for (int i : array)
      if (value == i)
        return true;
    return false;
  }
  public static String padLeft(String src, char c, int len) {
    StringBuilder s = new StringBuilder();
    for (int i = 0; i < len - src.length(); i++)
      s.append(c);
    s.append(src);
    return s.toString();
  }

One more question, will this mapper be included in the default engine builder like io.quarkus.qute.HtmlEscaper in the io.quarkus.qute.runtime.EngineProducer constructor?

And finally, I think there is an error in the code you've proposed return variant.getContentType().startsWith("application/json") must be return variant.get().getContentType().startsWith("application/json")

@mkouba
Copy link
Contributor

mkouba commented Sep 20, 2024

Looks good.

isWhitespace(int ch) could be probably replaced with java.lang.Character.isWhitespace(int)?

A perf tip - if possible use the initial capacity for the StringBuilder (e.g. src.lenght() + len in padLeft()).

One more question, will this mapper be included in the default engine builder like io.quarkus.qute.HtmlEscaper in the io.quarkus.qute.runtime.EngineProducer constructor?

It depends. I mean if you register it with builder.addResultMapper(new JsonEscaper()) then it will live alongside the HtmlEscaper.

And finally, I think there is an error in the code you've proposed return variant.getContentType().startsWith("application/json") must be return variant.get().getContentType().startsWith("application/json")

👍

@ogomezdi
Copy link
Author

ogomezdi commented Sep 20, 2024

A perf tip - if possible use the initial capacity for the StringBuilder (e.g. src.lenght() + len in padLeft()).

May be easy for padLeft() use -> String.format("%4s", Integer.toHexString(c)).replace(' ','0')
So final code should be:

public static String escapeJson(String value) {
    if (value == null)
      return "";

    StringBuilder b = new StringBuilder();
    for (char c : value.toCharArray()) {
      if (c == '\r')
        b.append("\\r");
      else if (c == '\n')
        b.append("\\n");
      else if (c == '\t')
        b.append("\\t");
      else if (c == '"')
        b.append("\\\"");
      else if (c == '\\')
        b.append("\\\\");
      else if (c == ' ')
        b.append(" ");
      else if (java.lang.Character.isWhitespace(c)) {
        b.append("\\u"+String.format("%4s", Integer.toHexString(c)).replace(' ','0'));
      } else if (((int) c) < 32)
        b.append("\\u"+String.format("%4s", Integer.toHexString(c)).replace(' ','0'));
      else
        b.append(c);
    }
    return b.toString();
  }

Avoid java.lang.Character.isWhitespace if explicitly import

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/qute The template engine kind/enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants