Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[META]Add PPL JSON extended functions support #3027

Open
YANG-DB opened this issue Sep 16, 2024 · 0 comments
Open

[META]Add PPL JSON extended functions support #3027

YANG-DB opened this issue Sep 16, 2024 · 0 comments
Labels
enhancement New feature or request PPL Piped processing language

Comments

@YANG-DB
Copy link
Member

YANG-DB commented Sep 16, 2024

Is your feature request related to a problem?
We need to enhance OpenSearch PPL (Piped Processing Language) by adding support for JSON-based operations. This extension will provide users with powerful tools to manipulate and analyze JSON data within PPL queries.

Proposed New Functions

  1. json_object(): Creates a JSON object from existing key/value pairs.
   ... | eval person = json_object("name", "John", "age", 30)
  1. json(): Evaluates whether a value can be parsed as JSON. Returns the value if valid, null otherwise.
   ... | eval result = json('{"name":"John","age":30}')
  1. json_append(): Appends elements to the contents of a valid JSON object.
   ... | eval person = json_object("name", "John"), extended_person = json_append(person, "city", "New York")
  1. json_array(): Creates a JSON array using a list of values.
   ... | eval json_list = json_array("apple", "banana", "cherry")
  1. json_array_to_mv(): Maps the elements of a JSON array to a multivalued field.
   ... | eval json_list = json_array("apple", "banana", "cherry"), mv_field = json_array_to_mv(json_list)
  1. json_delete(): Removes one or more keys and their corresponding values from the specified JSON object.
   ... | eval person = json_object("name", "John", "age", 30, "city", "New York"), person_no_age = json_delete(person, "age")
  1. json_extend(): Extends the contents of a valid JSON object with the values of an array.
   ... | eval person = json_object("name", "John", "age", 30), extended_person = json_extend(person, json_array("city", "New York"))
  1. json_extract(): Returns either a JSON array or a native type value from a field and zero or more paths.
   ... | eval data = json('{"person": {"name": "John", "age": 30}}'), name = json_extract(data, "person.name")
  1. json_keys(): Returns the keys from the key-value pairs in a JSON object as a JSON array.
   ... | eval person = json_object("name", "John", "age", 30), keys = json_keys(person)
  1. json_set(): Inserts or overwrites values for a JSON node with provided values and returns an updated JSON object.
   ... | eval person = json_object("name", "John"), updated_person = json_set(person, "age", 30)
  1. json_valid(): Evaluates whether a JSON object uses valid JSON syntax and returns TRUE or FALSE.
   ... | eval is_valid = json_valid('{"name":"John","age":30}')
  1. all(): Iterates over JSON array values and returns true if every value matches the provided predicate.
   ... | eval json_list = json_array(1, 2, 3), all_match = all(json_list, val -> val != 0)
  1. any(): Iterates over JSON array values and returns true if any value matches the provided predicate.
   ... | eval json_list = json_array(1, 2, 3), any_match = any(json_list, val -> val == 0)
  1. filter(): Iterates over JSON array values and performs an operation on each value.
   ... | eval json_list = json_array(1, 2, 3, 4), filtered_list = filter(json_list, val -> val % 2 == 0)
  1. map(): Iterates over JSON array values and performs an operation on each value.
   ... | eval json_list = json_array(1, 2, 3), mapped_list = map(json_list, val -> val * 2)
  1. reduce(): Iterates over a JSON array in a field or a literal array and performs an accumulation operation.
   ... | eval json_list = json_array(1, 2, 3), sum = reduce(json_list, 0, (acc, val) -> acc + val)

Technical Considerations

  • Consider leveraging existing JSON libraries for efficient parsing and manipulation.
  • Ensure that these functions can handle large JSON objects without significant performance degradation.

Additional Context

This enhancement is inspired by similar functionality available in other query languages and will significantly improve OpenSearch's capability to handle and analyze JSON data within PPL queries.


Support for PPL extended json functionality is required for both:

OpenSearch based PPL engine

Spark based PPL engine

@YANG-DB YANG-DB added enhancement New feature or request untriaged PPL Piped processing language labels Sep 16, 2024
@YANG-DB YANG-DB changed the title [META]Add PPL Json Functions [META]Add PPL JSON Functions support Sep 16, 2024
@YANG-DB YANG-DB changed the title [META]Add PPL JSON Functions support [META]Add PPL JSON extended functions support Sep 16, 2024
@YANG-DB YANG-DB removed the untriaged label Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request PPL Piped processing language
Projects
None yet
Development

No branches or pull requests

1 participant