Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE]Add PPL JSON extended functions support #667

Open
YANG-DB opened this issue Sep 16, 2024 · 4 comments
Open

[FEATURE]Add PPL JSON extended functions support #667

YANG-DB opened this issue Sep 16, 2024 · 4 comments
Labels
enhancement New feature or request PPL Pipe Processing Language support

Comments

@YANG-DB
Copy link
Member

YANG-DB commented Sep 16, 2024

Is your feature request related to a problem?
We need to enhance OpenSearch PPL (Piped Processing Language) by adding support for JSON-based operations. This extension will provide users with powerful tools to manipulate and analyze JSON data within PPL queries.

Proposed New Functions

  1. json_object(): Creates a JSON object from existing key/value pairs.
   ... | eval person = json_object("name", "John", "age", 30)
  1. json(): Evaluates whether a value can be parsed as JSON. Returns the value if valid, null otherwise.
   ... | eval result = json('{"name":"John","age":30}')
  1. json_append(): Appends elements to the contents of a valid JSON object.
   ... | eval person = json_object("name", "John"), extended_person = json_append(person, "city", "New York")
  1. json_array(): Creates a JSON array using a list of values.
   ... | eval json_list = json_array("apple", "banana", "cherry")
  1. json_array_to_mv(): Maps the elements of a JSON array to a multivalued field.
   ... | eval json_list = json_array("apple", "banana", "cherry"), mv_field = json_array_to_mv(json_list)
  1. json_delete(): Removes one or more keys and their corresponding values from the specified JSON object.
   ... | eval person = json_object("name", "John", "age", 30, "city", "New York"), person_no_age = json_delete(person, "age")
  1. json_extend(): Extends the contents of a valid JSON object with the values of an array.
   ... | eval person = json_object("name", "John", "age", 30), extended_person = json_extend(person, json_array("city", "New York"))
  1. json_extract(): Returns either a JSON array or a native type value from a field and zero or more paths.
   ... | eval data = json('{"person": {"name": "John", "age": 30}}'), name = json_extract(data, "person.name")
  1. json_keys(): Returns the keys from the key-value pairs in a JSON object as a JSON array.
   ... | eval person = json_object("name", "John", "age", 30), keys = json_keys(person)
  1. json_set(): Inserts or overwrites values for a JSON node with provided values and returns an updated JSON object.
   ... | eval person = json_object("name", "John"), updated_person = json_set(person, "age", 30)
  1. json_valid(): Evaluates whether a JSON object uses valid JSON syntax and returns TRUE or FALSE.
   ... | eval is_valid = json_valid('{"name":"John","age":30}')
  1. all(): Iterates over JSON array values and returns true if every value matches the provided predicate.
   ... | eval json_list = json_array(1, 2, 3), all_match = all(json_list, val -> val != 0)
  1. any(): Iterates over JSON array values and returns true if any value matches the provided predicate.
   ... | eval json_list = json_array(1, 2, 3), any_match = any(json_list, val -> val == 0)
  1. filter(): Iterates over JSON array values and performs an operation on each value.
   ... | eval json_list = json_array(1, 2, 3, 4), filtered_list = filter(json_list, val -> val % 2 == 0)
  1. map(): Iterates over JSON array values and performs an operation on each value.
   ... | eval json_list = json_array(1, 2, 3), mapped_list = map(json_list, val -> val * 2)
  1. reduce(): Iterates over a JSON array in a field or a literal array and performs an accumulation operation.
   ... | eval json_list = json_array(1, 2, 3), sum = reduce(json_list, 0, (acc, val) -> acc + val)

Technical Considerations

  • Consider leveraging existing JSON libraries for efficient parsing and manipulation.
  • Ensure that these functions can handle large JSON objects without significant performance degradation.
@A-Gray-Cat
Copy link

@YANG-DB These features look awesome. Can we add support to access elements within an array using index too?

@YANG-DB
Copy link
Member Author

YANG-DB commented Sep 17, 2024

@A-Gray-Cat yes that is possible - can you please suggest the syntax and we can discuss it further

@A-Gray-Cat
Copy link

Thanks! E.g.

{
    "items": ["a", "b", "c"]
}

Users can access the first element of the array by using syntax like eval first_item = items[0], or it can be retrieved directly using the fields command, e.g. fields items[0].

@A-Gray-Cat
Copy link

A-Gray-Cat commented Sep 18, 2024

Hello @YANG-DB , can we also add support to filter a json object using either the key or the value, or both? E.g.

... | eval person = json_object("name", "John", "age", 30), all_match = filter(person, k,v -> k = "name")
... | eval person = json_object("name", "John", "age", 30), all_match = filter(person, k,v -> v = "John")
... | eval person = json_object("name", "John", "age", 30), all_match = filter(person, k,v -> k = "name" and v = "John")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request PPL Pipe Processing Language support
Projects
Status: Todo
Development

No branches or pull requests

2 participants