hints for handling complex JSON? #904
It's me again :)

In my day job, I help support almost 6000 Kubernetes clusters running on VMware on-prem in retail locations. Each of those has between 6 and 13 nodes, with between 200 and 500 pods (depending on how 🔥 the day is). When I'm on call, that scopes out to 10s of thousands of clusters, with astronomical numbers of nodes and pods. So the scale I work with is massive. On top of that, Kubernetes and VMware JSON objects are massive and complex, with deeply nested structures that very few tools handle gracefully.

I say all that as context for why I always have my eyes open for tools that will make my job easier. Murex's built-in support for structured data and your commitment to stability and backward-compatibility are why I'm here. I've already noted a few ways I find Murex easier to work with than my current daily driver (Elvish), and while it isn't as immediately familiar as Nushell, it also doesn't seem likely to introduce breaking changes as frequently.

In a typical workflow, I'll grab JSON from a number of similar resources, and I'll parse those objects for further action or for human-readable display. The JSON response will often consist of an outer array with root-level key/value pairs, keys that have array values consisting of a mix of key/value pairs and other keys with array values.

[I can't provide a full JSON response, I'm afraid; they're just too big for me to go through to make sure I'm not leaking info that would put me on infosec's naughty list.]

A trivial example using
[ The first unexpected outcome here is that I have to cast JSON output explicitly to JSON. Murex only sees a
The
Using the path approach (e.g.,
Same with a JSON response from vSphere:
[the
As ever, I recognize I'm a n00b here, so I start with the assumption I'm just holding it wrong. :) Are there more idiomatic ways to approach parsing complex JSON like this?
Replies: 1 comment 5 replies
Sounds like a few issues.
Casting
Generic is the default output from any POSIX pipe because it's hard to assume what an output will be for tools that just pass raw byte streams.
You can create a function to override that default, e.g.
so now your command line looks like:
I've been hesitant to do any automatic detection of data types based on their stdout/stdin because it is error prone. But if you know of a way it can be implemented in a reliable and deterministic way, then it's definitely something I can consider.
Element vs Index
Index
Elements take the first character as the separator. So if we take the following path
...that could be used in element:
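To make the "first character is the separator" rule concrete, here is a small Python sketch. It is only an analogy, not Murex code and not how Murex implements element lookups. The `element` function, the rule that list levels are addressed by position, and the trimmed `pods` object are all assumptions made up for illustration.

```python
from typing import Any


def element(data: Any, path: str) -> Any:
    """Walk nested dicts/lists using a path whose first character is the separator.

    Hypothetical helper: it mimics the separator rule only, nothing else.
    """
    if not path:
        raise ValueError("path must not be empty")
    sep, rest = path[0], path[1:]
    if rest == "":
        return data  # a bare separator refers to the whole document
    node = data
    for key in rest.split(sep):
        if isinstance(node, list):
            node = node[int(key)]  # assumption: list levels are addressed by position
        else:
            node = node[key]       # dict levels are addressed by key
    return node


# Hypothetical, heavily trimmed pod list (not real kubectl output)
pods = {"items": [{"metadata": {"name": "web-0"}, "status": {"phase": "Running"}}]}

print(element(pods, "/items/0/metadata/name"))  # separator is "/"
print(element(pods, ".items.0.status.phase"))   # separator is "."
```

Because the separator is whatever character comes first, the same lookup works even when a key itself contains a `/` or a `.`: you just pick a separator that doesn't appear in any key.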
Foreach
Your following code can be golfed quite a bit:
sharing this as a baseline ^
However if you're just working in the command line, you'll want something that's a little more ergonomic on your fingers. So the lambda
https://murex.rocks/parser/lambda.html
It's not particularly readable though, so I tend to discourage this for shell scripts. However, working in the REPL is usually more about speed than readability.
I'm also working on the assumption that
which will be why you need to re-cast.
Parallelism
There isn't presently a way to iterate through a loop in parallel. I'd be interested to understand your use case because it's something I could easily add (Murex is already heavily multi-threaded). Looking at Nushell, it has something called
Presently you could use
That comes with several risks:
So having something managed is definitely preferable. Maybe an additional flag could work here, e.g.
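For the sake of discussion, here is roughly what a "managed" parallel loop could look like, sketched in Python rather than Murex. It assumes that managed means a capped number of workers with results returned in input order; `parallel_foreach`, `check_cluster`, and the cluster names are hypothetical and not a proposal for actual Murex syntax.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_foreach(items, fn, max_workers=4):
    """Apply fn to each item on a bounded thread pool.

    Results come back in the same order as the input, and at most
    max_workers calls run at the same time.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))


def check_cluster(name):
    # Hypothetical stand-in for a per-cluster API call.
    return f"{name}: ok"


clusters = ["store-001", "store-002", "store-003"]
for line in parallel_foreach(clusters, check_cluster, max_workers=2):
    print(line)
```

The cap matters at the scale described in the question: thousands of clusters fanned out with no limit would open thousands of connections at once.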
Element vs Index
Index [ ... ] is useful because it's case insensiti…