From 64f42b03bd1fb7d473aa4cba7737540546c6b864 Mon Sep 17 00:00:00 2001 From: Mano Toth Date: Tue, 2 Sep 2025 11:02:54 +0200 Subject: [PATCH 1/6] First draft --- apl/tabular-operators/make-series.mdx | 169 +++++++++++++++++++++++ apl/tabular-operators/mv-expand.mdx | 167 ++++++++++++++++++++++ apl/tabular-operators/overview.mdx | 8 +- apl/tabular-operators/parse-kv.mdx | 160 +++++++++++++++++++++ apl/tabular-operators/parse-where.mdx | 156 +++++++++++++++++++++ apl/tabular-operators/project-rename.mdx | 155 +++++++++++++++++++++ docs.json | 5 + 7 files changed, 819 insertions(+), 1 deletion(-) create mode 100644 apl/tabular-operators/make-series.mdx create mode 100644 apl/tabular-operators/mv-expand.mdx create mode 100644 apl/tabular-operators/parse-kv.mdx create mode 100644 apl/tabular-operators/parse-where.mdx create mode 100644 apl/tabular-operators/project-rename.mdx diff --git a/apl/tabular-operators/make-series.mdx b/apl/tabular-operators/make-series.mdx new file mode 100644 index 00000000..92e8291a --- /dev/null +++ b/apl/tabular-operators/make-series.mdx @@ -0,0 +1,169 @@ +--- +title: make-series +description: 'This page explains how to use the make-series operator in APL.' +--- + +The `make-series` operator creates time series data by aggregating values over specified time bins. You use it to turn event-based data into evenly spaced intervals, which is useful for visualizing trends, comparing metrics over time, or performing anomaly detection. + +You find this operator useful when you want to: + +- Analyze trends in metrics such as request duration, error rates, or throughput. +- Prepare data for charting in dashboards where regular time intervals are required. +- Aggregate trace or log data into time buckets for performance monitoring or incident analysis. + +## For users of other query languages + +If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL. + + + + +In Splunk SPL, you often use the `timechart` command to create time series. In APL, you achieve the same result with the `make-series` operator, which lets you explicitly define the aggregation, time column, and binning interval. + + +```sql Splunk example +index=sample-http-logs +| timechart span=1m avg(req_duration_ms) +```` + +```kusto APL equivalent +['sample-http-logs'] +| make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 1m +``` + + + + + + +In ANSI SQL, you typically use `GROUP BY` with windowing functions or generated series to build time-based aggregations. In APL, the `make-series` operator is the dedicated tool for generating continuous time series with defined intervals, which avoids the need for joins with a calendar table. 
+ + +```sql SQL example +SELECT + time_bucket('1 minute', _time) AS minute, + AVG(req_duration_ms) AS avg_duration +FROM sample_http_logs +WHERE _time > NOW() - interval '1 hour' +GROUP BY minute +ORDER BY minute +``` + +```kusto APL equivalent +['sample-http-logs'] +| make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 1m +``` + + + + + + +## Usage + +### Syntax + +```kusto +T | make-series [Aggregation [, ...]] + [default = DefaultValue] + on TimeColumn + [in Range] + step StepSize + [by GroupingColumn [, ...]] +``` + +### Parameters + +| Parameter | Description | +| ---------------- | --------------------------------------------------------------------------------------------------------------- | +| `Aggregation` | One or more aggregation functions (for example, `avg()`, `count()`, `sum()`) to apply over each time bin. | +| `default` | A value to use when no records exist in a time bin. | +| `TimeColumn` | The column containing timestamps used for binning. | +| `Range` | An optional range expression specifying the start and end of the series (for example, `from ago(1h) to now()`). | +| `StepSize` | The size of each time bin (for example, `1m`, `5m`, `1h`). | +| `GroupingColumn` | Optional columns to split the series by, producing multiple series in parallel. | + +### Returns + +The operator returns a table where each row represents a group (if specified), and each aggregation function produces an array of values aligned with the generated time bins. + +## Use case examples + + + + +You want to analyze how average request duration evolves over time, binned into 5-minute intervals. + +**Query** + +```kusto +['sample-http-logs'] +| make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 5m +``` + +[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 5m%22%7D) + +**Output** + +| avg_req_duration_ms | +| ---------------------------- | +| [123, 98, 110, 105, 130...] | + +The query produces a time series of average request durations across the last hour, grouped into 5-minute intervals. + + + + +You want to monitor average span duration per service, binned into 10-minute intervals. + +**Query** + +```kusto +['otel-demo-traces'] +| make-series avg(duration) default=0 on _time from ago(2h) to now() step 10m by ['service.name'] +``` + +[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces)'] | make-series avg(duration) default=0 on _time from ago(2h) to now() step 10m by ['service.name']%22%7D) + +**Output** + +| service.name | avg_duration | +| --------------- | ------------------------------ | +| frontend | [20ms, 18ms, 22ms, 19ms, ...] | +| checkoutservice | [35ms, 40ms, 33ms, 37ms, ...] | + +The query builds parallel time series for each service, showing average span duration trends. + + + + +You want to analyze the rate of HTTP 401 errors in your logs per minute. 
+ +**Query** + +```kusto +['sample-http-logs'] +| where status == '401' +| make-series count() default=0 on _time from ago(30m) to now() step 1m +``` + +[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | where status == '401' | make-series count() default=0 on _time from ago(30m) to now() step 1m%22%7D) + +**Output** + +| count_status_401 | +| ------------------------ | +| [0, 1, 0, 2, 1, 0, ...] | + +The query generates a time series of failed login attempts (HTTP 401), grouped into 1-minute bins. + + + + +## List of related operators + +- [summarize](/apl/tabular-operators/summarize-operator): Aggregates rows into groups but does not generate continuous time bins. Use `summarize` when you want flexible grouping without forcing evenly spaced intervals. +- [bin](/apl/scalar-functions/bin-function): Rounds timestamps into buckets for grouping, often used together with `summarize`. Use `bin` when you want to group irregular data without creating missing intervals. +- [top](/apl/tabular-operators/top-operator): Returns the top rows by a specified expression, not time series. Use `top` when you want to focus on the most significant values instead of trends over time. +- [extend](/apl/tabular-operators/extend-operator): Creates new calculated columns, often as preparation before `make-series`. Use `extend` when you want to preprocess data for time series analysis. +- [mv-expand](/apl/tabular-operators/mv-expand-operator): Expands arrays into multiple rows. Use `mv-expand` to work with the arrays returned by `make-series`. diff --git a/apl/tabular-operators/mv-expand.mdx b/apl/tabular-operators/mv-expand.mdx new file mode 100644 index 00000000..a9317402 --- /dev/null +++ b/apl/tabular-operators/mv-expand.mdx @@ -0,0 +1,167 @@ +--- +title: mv-expand +description: 'This page explains how to use the mv-expand operator in APL.' +--- + +The `mv-expand` operator expands dynamic arrays and property bags into multiple rows. Each element of the array or each property of the bag becomes its own row, while other columns are duplicated. + +You use `mv-expand` when you want to analyze or filter individual values inside arrays or objects. This is especially useful when working with logs that include lists of values, OpenTelemetry traces that contain arrays of spans, or security events that group multiple attributes into one field. + +## For users of other query languages + +If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL. + + + + +In Splunk SPL, the `mvexpand` command expands multi-value fields into separate events. The APL `mv-expand` operator works in a very similar way, splitting array values into individual rows. The main difference is that APL explicitly works with dynamic arrays or property bags, while Splunk handles multi-value fields implicitly. + + +```sql Splunk example +... | mvexpand request_uri +```` + +```kusto APL equivalent +['sample-http-logs'] +| mv-expand uri +``` + + + + + + +In ANSI SQL, you use `CROSS JOIN UNNEST` or `CROSS APPLY` to flatten arrays into rows. In APL, `mv-expand` provides a simpler and more direct way to achieve the same result. 
```sql SQL example
SELECT id, value
FROM logs
CROSS JOIN UNNEST(request_uris) AS t(value)
```

```kusto APL equivalent
['sample-http-logs']
| mv-expand uri
```



## Usage

### Syntax

```kusto
T | mv-expand ColumnName [to typeof(DataType)] [limit=N]
```

### Parameters

| Parameter             | Description                                                                      |
| --------------------- | -------------------------------------------------------------------------------- |
| `ColumnName`          | The name of the column that contains a dynamic array or property bag to expand. |
| `to typeof(DataType)` | Optional. Converts each expanded element to the specified type.                  |
| `limit=N`             | Optional. Limits the number of expanded rows per record.                         |

### Returns

The operator returns a table where each element of the expanded column is placed in its own row. Other columns are duplicated for each expanded row.

## Use case examples


When analyzing logs, request URIs can sometimes be stored as arrays. You can use `mv-expand` to expand them into individual rows for easier filtering.

**Query**

```kusto
['sample-http-logs']
| mv-expand uri
| summarize count() by uri
| top 5 by count_
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20mv-expand%20uri%20%7C%20summarize%20count()%20by%20uri%20%7C%20top%205%20by%20count_%22%7D)

**Output**

| uri              | count_ |
| ---------------- | ------- |
| /api/v1/products | 1200    |
| /api/v1/cart     | 950     |
| /api/v1/checkout | 720     |
| /api/v1/login    | 650     |
| /api/v1/profile  | 500     |

This query expands the `uri` array into rows and counts the most frequent request URIs.


Traces often include multiple span IDs grouped under a single trace. You can use `mv-expand` to expand span IDs for detailed analysis.

**Query**

```kusto
['otel-demo-traces']
| mv-expand span_id
| summarize count() by ['service.name']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20mv-expand%20span_id%20%7C%20summarize%20count()%20by%20%5B'service.name'%5D%22%7D)

**Output**

| service.name          | count_ |
| --------------------- | ------- |
| frontend              | 4100    |
| cartservice           | 3800    |
| checkoutservice       | 3600    |
| productcatalogservice | 3400    |
| loadgenerator         | 3000    |

This query expands the `span_id` field to count how many spans each service generates.


In security logs, user IDs can appear as arrays if multiple accounts are affected by a single event. `mv-expand` helps isolate them for per-user inspection.

**Query**

```kusto
['sample-http-logs']
| mv-expand id
| summarize count() by id, status
| top 5 by count_
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20mv-expand%20id%20%7C%20summarize%20count()%20by%20id%2C%20status%20%7C%20top%205%20by%20count_%22%7D)

**Output**

| id      | status | count_ |
| ------- | ------ | ------- |
| user123 | 401    | 320     |
| user456 | 403    | 250     |
| user789 | 200    | 220     |
| user111 | 500    | 180     |
| user222 | 404    | 150     |

This query expands the `id` array into rows and counts how often each user ID appears with a given status.
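If the field you want to expand arrives as a JSON-encoded string rather than a dynamic array, you can parse it before expanding. The following is a minimal sketch, assuming `id` holds text such as `'["user123","user456"]'`:

```kusto
['sample-http-logs']
// Convert the JSON-encoded string into a dynamic array
| extend ids = parse_json(id)
// Give each array element its own row, converted to string
| mv-expand ids to typeof(string)
| summarize count() by ids
```

Parsing first keeps the expansion step simple: `mv-expand` only ever sees a dynamic array, regardless of how the data was ingested.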
+ + + + +## List of related operators + +- [expand](/apl/tabular-operators/expand-operator): Expands JSON objects into rows. Use it when you want to expand structured data, not arrays. +- [project](/apl/tabular-operators/project-operator): Selects or computes columns. Use it when you want to reshape data, not expand arrays. +- [summarize](/apl/tabular-operators/summarize-operator): Aggregates data across rows. Use it after expanding arrays to compute statistics. +- [top](/apl/tabular-operators/top-operator): Returns the top N rows by expression. Use it after expansion to find the most frequent values. +- [parse_json](/apl/scalar-functions/parse-json-function): Converts JSON strings to dynamic objects. Use it before `mv-expand` when working with string-encoded arrays. diff --git a/apl/tabular-operators/overview.mdx b/apl/tabular-operators/overview.mdx index 2fc4a171..1756415e 100644 --- a/apl/tabular-operators/overview.mdx +++ b/apl/tabular-operators/overview.mdx @@ -32,4 +32,10 @@ The table summarizes the tabular operators available in APL. | [take](/apl/tabular-operators/take-operator) | Returns the specified number of rows from the dataset. | | [top](/apl/tabular-operators/top-operator) | Returns the top N rows from the dataset based on the specified sorting criteria. | | [union](/apl/tabular-operators/union-operator) | Returns all rows from the specified tables or queries. | -| [where](/apl/tabular-operators/where-operator) | Returns a filtered dataset containing only the rows where the condition evaluates to true. | \ No newline at end of file +| [where](/apl/tabular-operators/where-operator) | Returns a filtered dataset containing only the rows where the condition evaluates to true. | + +| [make-series](/apl/tabular-operators/make-series) | Returns a dataset where the specified field is aggregated into a time series. | +| [mv-expand](/apl/tabular-operators/mv-expand) | Returns a dataset where the specified field is expanded into multiple rows. | +| [parse-kv](/apl/tabular-operators/parse-kv) | Returns a dataset where key-value pairs are extracted from a specified field. | +| [parse-where](/apl/tabular-operators/parse-where) | Returns a dataset where the specified field is parsed according to the specified pattern. | +| [project-rename](/apl/tabular-operators/project-rename) | Returns a dataset where the specified field is renamed according to the specified pattern. | diff --git a/apl/tabular-operators/parse-kv.mdx b/apl/tabular-operators/parse-kv.mdx new file mode 100644 index 00000000..c10d92e8 --- /dev/null +++ b/apl/tabular-operators/parse-kv.mdx @@ -0,0 +1,160 @@ +--- +title: parse-kv +description: 'This page explains how to use the parse-kv operator in APL.' +--- + +The `parse-kv` operator extracts key–value pairs from text into structured columns. Use this operator when your dataset contains unstructured or semi-structured logs where information is embedded in free-form text, such as query strings, headers, or attributes. Instead of manually parsing substrings, `parse-kv` automatically splits the input into keys and values based on delimiters that you specify. + +You find this operator useful in scenarios such as analyzing log entries with embedded metadata, extracting attributes from OpenTelemetry spans, or parsing security event messages where structured fields are embedded inside a single string. + +## For users of other query languages + +If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL. 
+ + + + +In Splunk SPL, you typically use the `kv` or `extract` command to pull key–value fields from raw events. In APL, `parse-kv` serves the same purpose but offers explicit control over delimiters and the ability to map extracted keys into new columns. + + + +```sql Splunk example +... | kv pairdelim=" " kvdelim="=" +```` + +```kusto APL equivalent +['sample-http-logs'] +| extend parsed = parse-kv('uri', pair_delimiter='&', kv_delimiter='=') +``` + + + + + + +ANSI SQL does not include a built-in function to parse arbitrary key–value pairs from text. You typically rely on `SUBSTRING`, `SPLIT`, or regular expressions. In APL, `parse-kv` encapsulates this functionality in one operator, making key–value extraction more concise and efficient. + + + +```sql SQL example +SELECT + SUBSTRING_INDEX(SUBSTRING_INDEX(uri, '&', 1), '=', -1) AS key1, + SUBSTRING_INDEX(SUBSTRING_INDEX(uri, '&', 2), '=', -1) AS key2 +FROM logs +``` + +```kusto APL equivalent +['sample-http-logs'] +| extend parsed = parse-kv('uri', pair_delimiter='&', kv_delimiter='=') +``` + + + + + + +## Usage + +### Syntax + +```kusto +parse-kv(Expression [, pair_delimiter [, kv_delimiter [, quote_delimiter]]]) +``` + +### Parameters + +| Parameter | Type | Description | +| ----------------- | -------- | ------------------------------------------------------------------------------------ | +| `Expression` | `string` | The string expression that contains key–value pairs. | +| `pair_delimiter` | `string` | Character that separates key–value pairs. Default is a space (`' '`). | +| `kv_delimiter` | `string` | Character that separates keys from values. Default is `'='`. | +| `quote_delimiter` | `string` | Character that surrounds values with spaces or special characters. Default is `'"'`. | + +### Returns + +A dynamic object containing the extracted key–value pairs. You can project individual keys as new columns or keep the object for further processing. + +## Use case examples + + + + +You can extract query parameters from HTTP request URIs to analyze how users interact with your service. + +**Query** + +```kusto +['sample-http-logs'] +| extend kv = parse-kv(uri, pair_delimiter='&', kv_delimiter='=') +| project _time, id, method, kv +``` + +[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | extend kv = parse-kv(uri, pair_delimiter='&', kv_delimiter='=') | project _time, id, method, kv%22%7D) + +**Output** + +| _time | id | method | kv | +| -------------------- | ---- | ------ | --------------------------------- | +| 2025-09-01T10:01:00Z | u123 | GET | {"q":"search","page":"1"} | +| 2025-09-01T10:02:00Z | u456 | POST | {"action":"login","next":"/home"} | + +This query parses the query string parameters in the `uri` field and projects them into a structured dynamic object. + + + + +You can extract attributes embedded in trace span IDs or metadata. 
**Query**

```kusto
['otel-demo-traces']
| extend kv = parse-kv(span_id, pair_delimiter='-', kv_delimiter=':')
| project _time, trace_id, ['service.name'], kv
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20extend%20kv%20%3D%20parse-kv(span_id%2C%20pair_delimiter%3D'-'%2C%20kv_delimiter%3D'%3A')%20%7C%20project%20_time%2C%20trace_id%2C%20%5B'service.name'%5D%2C%20kv%22%7D)

**Output**

| _time                | trace_id | service.name    | kv                            |
| -------------------- | --------- | --------------- | ----------------------------- |
| 2025-09-01T10:03:00Z | t789      | frontend        | {"part1":"abc","part2":"xyz"} |
| 2025-09-01T10:04:00Z | t101      | checkoutservice | {"part1":"def","part2":"uvw"} |

This query breaks down structured identifiers in `span_id` into key–value components for easier analysis.


You can parse structured information inside log messages, such as extracting action and result fields.

**Query**

```kusto
['sample-http-logs']
| extend kv = parse-kv(uri, pair_delimiter=';', kv_delimiter='=')
| project _time, status, ['geo.country'], kv
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20extend%20kv%20%3D%20parse-kv(uri%2C%20pair_delimiter%3D'%3B'%2C%20kv_delimiter%3D'%3D')%20%7C%20project%20_time%2C%20status%2C%20%5B'geo.country'%5D%2C%20kv%22%7D)

**Output**

| _time                | status | geo.country | kv                                    |
| -------------------- | ------ | ----------- | ------------------------------------- |
| 2025-09-01T10:05:00Z | 403    | US          | {"action":"access","result":"denied"} |
| 2025-09-01T10:06:00Z | 200    | DE          | {"action":"login","result":"success"} |

This query parses embedded key–value fields from the `uri` to identify blocked and successful login attempts by country.


## List of related operators

- [parse-json](/apl/scalar-functions/parse-json-function): Use when the string is already in JSON format. More efficient than `parse-kv` for JSON data.
- [extract](/apl/scalar-functions/extract-function): Use for regular expression-based extraction when data is less structured.
- [split](/apl/scalar-functions/split-function): Use when you need to split strings into arrays instead of key–value objects.
- [mv-expand](/apl/tabular-operators/mv-expand): Use to expand dynamic arrays or property bags into separate rows after parsing.
- [project](/apl/tabular-operators/project-operator): Use to select and rename specific extracted keys for further analysis.
diff --git a/apl/tabular-operators/parse-where.mdx b/apl/tabular-operators/parse-where.mdx
new file mode 100644
index 00000000..bd4e4880
--- /dev/null
+++ b/apl/tabular-operators/parse-where.mdx
@@ -0,0 +1,156 @@
---
title: parse-where
description: 'This page explains how to use the parse-where operator in APL.'
---

The `parse-where` operator lets you extract structured values from a text expression and filter rows where the parsing succeeds. You use it when your data contains unstructured or semi-structured text, and you want to match and filter rows based on whether they fit a specific format. Like `parse`, it extracts fields into new columns. The difference is that `parse` keeps every row and leaves the extracted fields empty when the pattern doesn’t match, while `parse-where` returns only the rows where parsing succeeds.

This operator is useful when you work with logs, traces, or event data that contain patterns such as HTTP requests, error messages, or identifiers. By applying `parse-where`, you can reduce large datasets to only the rows that match your format of interest.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.


In Splunk SPL, you use the `rex` command to extract fields from raw text and filter events. In APL, the `parse-where` operator serves a similar role: it extracts values into new fields and filters rows to those where parsing succeeds. You don’t need to keep the extracted fields if you only want to filter.


```sql Splunk example
... | rex field=_raw "uri=(?<endpoint>/api/[a-z]+)"
```

```kusto APL equivalent
['sample-http-logs']
| parse-where uri with '/api/' endpoint
```


In SQL, you often combine string functions with a `WHERE` clause to test patterns, for example with `LIKE` or `REGEXP`. In APL, you can achieve similar filtering with `parse-where`, which only keeps rows where the pattern match succeeds. Unlike SQL, you don’t have to write separate conditions: the operator handles both parsing and filtering.


```sql SQL example
SELECT *
FROM sample_http_logs
WHERE uri LIKE '/api/%'
```

```kusto APL equivalent
['sample-http-logs']
| parse-where uri with '/api/' endpoint
```


## Usage

### Syntax

```kusto
T | parse-where Expression with * stringConstant columnName * [stringConstant columnName ...]
```

### Parameters

| Parameter        | Type   | Description                                         |
| ---------------- | ------ | --------------------------------------------------- |
| `Expression`     | string | The text expression to parse.                       |
| `stringConstant` | string | A fixed text value that defines the parse pattern.  |
| `columnName`     | string | A placeholder for the parsed value in the pattern.  |

### Returns

The operator returns only those rows where the expression matches the specified parse pattern. Like `parse`, it extends the output with the parsed values as new fields.

## Use case examples


When you analyze HTTP logs, you can use `parse-where` to filter only the rows where a URL matches a specific pattern.

**Query**

```kusto
['sample-http-logs']
| parse-where uri with '/api/' endpoint
| project _time, id, method, uri, status
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20parse-where%20uri%20with%20'/api/'%20endpoint%20%7C%20project%20_time%2C%20id%2C%20method%2C%20uri%2C%20status%22%7D)

**Output**

| _time                | id     | method | uri           | status |
| -------------------- | ------ | ------ | ------------- | ------ |
| 2025-01-01T12:00:00Z | user42 | GET    | /api/products | 200    |
| 2025-01-01T12:01:00Z | user17 | POST   | /api/cart     | 201    |

This query keeps only log entries where the `uri` field starts with `/api/`.


In traces, you can use `parse-where` to filter spans whose service name follows a particular prefix.
+ +**Query** + +```kusto +['otel-demo-traces'] +| parse-where ['service.name'] with 'frontend' suffix +| project _time, trace_id, span_id, ['service.name'], duration +``` + +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20parse-where%20%5B'service.name'%5D%20with%20'frontend'%20suffix%20%7C%20project%20_time%2C%20trace_id%2C%20span_id%2C%20%5B'service.name'%5D%2C%20duration%22%7D) + +**Output** + +| _time | trace_id | span_id | service.name | duration | +| -------------------- | --------- | -------- | ------------- | ---------------- | +| 2025-01-01T12:00:00Z | abc123 | span01 | frontend | 00:00:01.2000000 | +| 2025-01-01T12:00:01Z | def456 | span02 | frontendproxy | 00:00:00.5000000 | + +This query filters spans where the `service.name` starts with `frontend`. + + + + +In security analysis, you can filter logs to only those where the request URI matches a suspicious pattern. + +**Query** + +```kusto +['sample-http-logs'] +| parse-where uri with '/admin/' path +| project _time, id, method, uri, status, ['geo.country'] +``` + +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20parse-where%20uri%20with%20'/admin/'%20path%20%7C%20project%20_time%2C%20id%2C%20method%2C%20uri%2C%20status%2C%20%5B'geo.country'%5D%22%7D) + +**Output** + +| _time | id | method | uri | status | geo.country | +| -------------------- | ------ | ------ | ------------- | ------ | ----------- | +| 2025-01-01T13:00:00Z | user99 | GET | /admin/login | 403 | US | +| 2025-01-01T13:05:00Z | user22 | POST | /admin/config | 500 | DE | + +This query keeps only suspicious requests that access admin endpoints. + + + + +## List of related operators + +- [parse](/apl/tabular-operators/parse-operator): Extracts structured data into columns instead of filtering rows. Use it when you want to keep the parsed values. +- [extract](/apl/scalar-functions/extract-function): Extracts data from text using regular expressions. Use it when you need flexible pattern matching and want to capture values. +- [where](/apl/tabular-operators/where-operator): Filters rows based on Boolean expressions. Use it for direct conditions instead of pattern parsing. +- [search](/apl/tabular-operators/search-operator): Searches across multiple columns for specific text. Use it for broader keyword searches. +- [project](/apl/tabular-operators/project-operator): Selects and computes specific columns. Use it after filtering to refine the result set. diff --git a/apl/tabular-operators/project-rename.mdx b/apl/tabular-operators/project-rename.mdx new file mode 100644 index 00000000..9a90f1e1 --- /dev/null +++ b/apl/tabular-operators/project-rename.mdx @@ -0,0 +1,155 @@ +--- +title: project-rename +description: 'This page explains how to use the project-rename operator in APL.' +--- + +The `project-rename` operator in APL lets you rename columns in a dataset while keeping all existing rows intact. You can use it when you want to make column names clearer, align them with naming conventions, or prepare data for downstream processing. Unlike `project`, which also controls which columns appear in the result, `project-rename` only changes the names of selected columns and keeps the full set of columns in the dataset. + +You find this operator useful when: +- You want to standardize field names across multiple queries. +- You want to replace long or inconsistent column names with simpler ones. 
- You want to improve query readability without altering the underlying data.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.


In Splunk SPL, renaming fields uses the `rename` command. The `project-rename` operator in APL works in a similar way. Both let you map existing fields to new names without altering the dataset content.


```sql Splunk example
... | rename uri AS url, status AS http_status
```

```kusto APL equivalent
['sample-http-logs']
| project-rename url = uri, http_status = status
```


In ANSI SQL, renaming columns is done with `AS` in a `SELECT` statement. In APL, `project-rename` is the closest equivalent, but unlike SQL, it preserves all columns by default while renaming only the specified ones.


```sql SQL example
SELECT uri AS url, status AS http_status, method, id
FROM sample_http_logs;
```

```kusto APL equivalent
['sample-http-logs']
| project-rename url = uri, http_status = status
```


## Usage

### Syntax

```kusto
Table
| project-rename NewName1 = OldName1, NewName2 = OldName2, ...
```

### Parameters

| Name      | Type   | Description                              |
| --------- | ------ | --------------------------------------- |
| `NewName` | string | The new column name you want to assign. |
| `OldName` | string | The existing column name to rename.     |

### Returns

A dataset with the same rows and columns as the input, except that the specified columns have new names.

## Use case examples


When analyzing HTTP logs, you might want to rename fields to shorter or more descriptive names before creating dashboards or reports.

**Query**

```kusto
['sample-http-logs']
| project-rename city = ['geo.city'], country = ['geo.country']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20project-rename%20city%20%3D%20%5B'geo.city'%5D%2C%20country%20%3D%20%5B'geo.country'%5D%22%7D)

**Output**

| _time                | req_duration_ms | id    | status | uri    | method | city   | country |
| -------------------- | ----------------- | ----- | ------ | ------ | ------ | ------ | ------- |
| 2025-09-01T10:00:00Z | 120               | user1 | 200    | /home  | GET    | Paris  | FR      |
| 2025-09-01T10:01:00Z | 85                | user2 | 404    | /about | GET    | Berlin | DE      |

This query renames the `geo.city` and `geo.country` fields to `city` and `country` for easier use in queries.


When inspecting distributed traces, you can rename service-related fields to match your internal naming conventions.

**Query**

```kusto
['otel-demo-traces']
| project-rename service = ['service.name'], code = status_code
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20project-rename%20service%20%3D%20%5B'service.name'%5D%2C%20code%20%3D%20status_code%22%7D)

**Output**

| _time                | duration     | span_id  | trace_id  | service         | kind   | code |
| -------------------- | ------------ | -------- | --------- | --------------- | ------ | ---- |
| 2025-09-01T09:55:00Z | 00:00:01.200 | abc123   | trace789  | frontend        | server | 200  |
| 2025-09-01T09:56:00Z | 00:00:00.450 | def456   | trace790  | checkoutservice | client | 500  |

This query renames `service.name` to `service` and `status_code` to `code`, making them shorter for downstream filtering.


For security-related HTTP log analysis, you can rename status and URI fields to match existing security dashboards.

**Query**

```kusto
['sample-http-logs']
| project-rename http_status = status, url = uri
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20project-rename%20http_status%20%3D%20status%2C%20url%20%3D%20uri%22%7D)

**Output**

| _time                | req_duration_ms | id    | http_status | url    | method | ['geo.city'] | ['geo.country'] |
| -------------------- | ----------------- | ----- | ------------ | ------ | ------ | ------------- | ---------------- |
| 2025-09-01T11:00:00Z | 150               | user5 | 403          | /admin | POST   | Madrid        | ES               |
| 2025-09-01T11:02:00Z | 200               | user6 | 500          | /login | POST   | Rome          | IT               |

This query renames `status` to `http_status` and `uri` to `url`, making the output align with security alerting systems.


## List of related operators

- [project](/apl/tabular-operators/project-operator): Lets you select and rename columns at the same time. Use it when you want to control which columns appear in the result.
- [extend](/apl/tabular-operators/extend-operator): Creates new calculated columns. Use it when you want to add columns rather than rename existing ones.
- [project-away](/apl/tabular-operators/project-away-operator): Removes specific columns from the dataset. Use it when you want to drop columns rather than rename them.
- [project-keep](/apl/tabular-operators/project-keep-operator): Keeps only the specified columns. Use it when you want to narrow down the result instead of renaming fields.
- [summarize](/apl/tabular-operators/summarize-operator): Aggregates data into groups. Use it when you want to compute metrics rather than adjust column names.
diff --git a/docs.json b/docs.json index a2181de7..3b19b390 100644 --- a/docs.json +++ b/docs.json @@ -472,11 +472,16 @@ "apl/tabular-operators/join-operator", "apl/tabular-operators/limit-operator", "apl/tabular-operators/lookup-operator", + "apl/tabular-operators/make-series", + "apl/tabular-operators/mv-expand", "apl/tabular-operators/order-operator", "apl/tabular-operators/parse-operator", + "apl/tabular-operators/parse-kv", + "apl/tabular-operators/parse-where", "apl/tabular-operators/project-operator", "apl/tabular-operators/project-away-operator", "apl/tabular-operators/project-keep-operator", + "apl/tabular-operators/project-rename", "apl/tabular-operators/project-reorder-operator", "apl/tabular-operators/redact-operator", "apl/tabular-operators/sample-operator", From 88bda2fdbd2a84d05374f8b8276dcedd53932ffa Mon Sep 17 00:00:00 2001 From: Mano Toth Date: Tue, 2 Sep 2025 12:11:24 +0200 Subject: [PATCH 2/6] Fixes --- apl/apl-features.mdx | 4 + .../conversion-functions/toarray.mdx | 2 +- apl/tabular-operators/make-series.mdx | 31 +++-- apl/tabular-operators/mv-expand.mdx | 24 ++-- apl/tabular-operators/overview.mdx | 10 +- apl/tabular-operators/parse-kv.mdx | 116 ++++++++++-------- apl/tabular-operators/parse-where.mdx | 116 ++++++++++-------- apl/tabular-operators/project-rename.mdx | 23 ++-- 8 files changed, 178 insertions(+), 148 deletions(-) diff --git a/apl/apl-features.mdx b/apl/apl-features.mdx index ea8e5210..b2aed804 100644 --- a/apl/apl-features.mdx +++ b/apl/apl-features.mdx @@ -279,8 +279,12 @@ keywords: ['axiom documentation', 'documentation', 'axiom', 'APL', 'axiom proces | Tabular operator | [join](/apl/tabular-operators/join-operator) | Returns a dataset containing rows from two different tables based on conditions. | | Tabular operator | [limit](/apl/tabular-operators/limit-operator) | Returns the top N rows from the input dataset. | | Tabular operator | [lookup](/apl/tabular-operators/lookup-operator) | Returns a dataset where rows from one dataset are enriched with matching columns from a lookup table based on conditions. | +| Tabular operator | [make-series](/apl/tabular-operators/make-series) | Returns a dataset where the specified field is aggregated into a time series. | +| Tabular operator | [mv-expand](/apl/tabular-operators/mv-expand) | Returns a dataset where the specified field is expanded into multiple rows. | | Tabular operator | [order](/apl/tabular-operators/order-operator) | Returns the input dataset, sorted according to the specified fields and order. | | Tabular operator | [parse](/apl/tabular-operators/parse-operator) | Returns the input dataset with new fields added based on the specified parsing pattern. | +| Tabular operator | [parse-kv](/apl/tabular-operators/parse-kv) | Returns a dataset where key-value pairs are extracted from a string field into individual columns. | +| Tabular operator | [parse-where](/apl/tabular-operators/parse-where) | Returns a dataset where values from a string are extracted based on a pattern. | | Tabular operator | [project-away](/apl/tabular-operators/project-away-operator) | Returns the input dataset excluding the specified fields. | | Tabular operator | [project-keep](/apl/tabular-operators/project-keep-operator) | Returns a dataset with only the specified fields. | | Tabular operator | [project-reorder](/apl/tabular-operators/project-reorder-operator) | Returns a table with the specified fields reordered as requested followed by any unspecified fields in their original order. 
| diff --git a/apl/scalar-functions/conversion-functions/toarray.mdx b/apl/scalar-functions/conversion-functions/toarray.mdx index faf85e06..f673aa7d 100644 --- a/apl/scalar-functions/conversion-functions/toarray.mdx +++ b/apl/scalar-functions/conversion-functions/toarray.mdx @@ -3,7 +3,7 @@ title: toarray description: 'This page explains how to use the toarray function in APL.' --- -Use the `toarray` function in APL to convert a dynamic-typed input—such as a bag, property bag, or JSON array—into a regular array. This is helpful when you want to process the elements individually with array functions like `array_length`, `array_index_of`, or `mv-expand`. +Use the `toarray` function in APL to convert a dynamic-typed input—such as a bag, property bag, or JSON array—into a regular array. This is helpful when you want to process the elements individually with array functions like `array_length` or `array_index_of`. You typically use `toarray` when working with semi-structured data, especially after parsing JSON from log fields or external sources. It lets you access and manipulate nested collections using standard array operations. diff --git a/apl/tabular-operators/make-series.mdx b/apl/tabular-operators/make-series.mdx index 92e8291a..85eac181 100644 --- a/apl/tabular-operators/make-series.mdx +++ b/apl/tabular-operators/make-series.mdx @@ -18,7 +18,7 @@ If you come from other query languages, this section explains how to adjust your -In Splunk SPL, you often use the `timechart` command to create time series. In APL, you achieve the same result with the `make-series` operator, which lets you explicitly define the aggregation, time column, and binning interval. +In Splunk SPL, you often use the `timechart` command to create time series. In APL, you achieve the same result with the `make-series` operator, which lets you explicitly define the aggregation, time field, and binning interval. ```sql Splunk example @@ -64,12 +64,12 @@ ORDER BY minute ### Syntax ```kusto -T | make-series [Aggregation [, ...]] +make-series [Aggregation [, ...]] [default = DefaultValue] - on TimeColumn + on TimeField [in Range] step StepSize - [by GroupingColumn [, ...]] + [by GroupingField [, ...]] ``` ### Parameters @@ -78,10 +78,10 @@ T | make-series [Aggregation [, ...]] | ---------------- | --------------------------------------------------------------------------------------------------------------- | | `Aggregation` | One or more aggregation functions (for example, `avg()`, `count()`, `sum()`) to apply over each time bin. | | `default` | A value to use when no records exist in a time bin. | -| `TimeColumn` | The column containing timestamps used for binning. | +| `TimeField` | The field containing timestamps used for binning. | | `Range` | An optional range expression specifying the start and end of the series (for example, `from ago(1h) to now()`). | | `StepSize` | The size of each time bin (for example, `1m`, `5m`, `1h`). | -| `GroupingColumn` | Optional columns to split the series by, producing multiple series in parallel. | +| `GroupingField` | Optional fields to split the series by, producing multiple series in parallel. 
| ### Returns @@ -101,7 +101,7 @@ You want to analyze how average request duration evolves over time, binned into | make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 5m ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | make-series avg(req_duration_ms) default=0 on _time from ago(1h) to now() step 5m%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20make-series%20avg(req_duration_ms)%20default%3D0%20on%20_time%20from%20ago(1h)%20to%20now()%20step%205m%22%7D) **Output** @@ -123,31 +123,31 @@ You want to monitor average span duration per service, binned into 10-minute int | make-series avg(duration) default=0 on _time from ago(2h) to now() step 10m by ['service.name'] ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces)'] | make-series avg(duration) default=0 on _time from ago(2h) to now() step 10m by ['service.name']%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20make-series%20avg(duration)%20default%3D0%20on%20_time%20from%20ago(2h)%20to%20now()%20step%2010m%20by%20%5B'service.name'%5D%22%7D) **Output** | service.name | avg_duration | | --------------- | ------------------------------ | | frontend | [20ms, 18ms, 22ms, 19ms, ...] | -| checkoutservice | [35ms, 40ms, 33ms, 37ms, ...] | +| checkout | [35ms, 40ms, 33ms, 37ms, ...] | The query builds parallel time series for each service, showing average span duration trends. -You want to analyze the rate of HTTP 401 errors in your logs per minute. +You want to analyze the rate of HTTP 500 errors in your logs per minute. **Query** ```kusto ['sample-http-logs'] -| where status == '401' +| where status == '500' | make-series count() default=0 on _time from ago(30m) to now() step 1m ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | where status == '401' | make-series count() default=0 on _time from ago(30m) to now() step 1m%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20where%20status%20%3D%3D%20'500'%20%7C%20make-series%20count()%20default%3D0%20on%20_time%20from%20ago(30m)%20to%20now()%20step%201m%22%7D) **Output** @@ -155,15 +155,14 @@ You want to analyze the rate of HTTP 401 errors in your logs per minute. | ------------------------ | | [0, 1, 0, 2, 1, 0, ...] | -The query generates a time series of failed login attempts (HTTP 401), grouped into 1-minute bins. +The query generates a time series of HTTP 500 error counts, grouped into 1-minute bins. ## List of related operators +- [extend](/apl/tabular-operators/extend-operator): Creates new calculated fields, often as preparation before `make-series`. Use `extend` when you want to preprocess data for time series analysis. +- [mv-expand](/apl/tabular-operators/mv-expand): Expands arrays into multiple rows. Use `mv-expand` to work with the arrays returned by `make-series`. 
- [summarize](/apl/tabular-operators/summarize-operator): Aggregates rows into groups but does not generate continuous time bins. Use `summarize` when you want flexible grouping without forcing evenly spaced intervals. -- [bin](/apl/scalar-functions/bin-function): Rounds timestamps into buckets for grouping, often used together with `summarize`. Use `bin` when you want to group irregular data without creating missing intervals. - [top](/apl/tabular-operators/top-operator): Returns the top rows by a specified expression, not time series. Use `top` when you want to focus on the most significant values instead of trends over time. -- [extend](/apl/tabular-operators/extend-operator): Creates new calculated columns, often as preparation before `make-series`. Use `extend` when you want to preprocess data for time series analysis. -- [mv-expand](/apl/tabular-operators/mv-expand-operator): Expands arrays into multiple rows. Use `mv-expand` to work with the arrays returned by `make-series`. diff --git a/apl/tabular-operators/mv-expand.mdx b/apl/tabular-operators/mv-expand.mdx index a9317402..4cf5a907 100644 --- a/apl/tabular-operators/mv-expand.mdx +++ b/apl/tabular-operators/mv-expand.mdx @@ -55,20 +55,28 @@ CROSS JOIN UNNEST(request_uris) AS t(value) ### Syntax ```kusto -T | mv-expand ColumnName [to typeof(DataType)] [limit=N] +mv-expand [kind=(bag|array)] [with_itemindex=IndexFieldName] FieldName [to typeof(Typename)] [limit Rowlimit] +``` + +### Example + +```kusto +mv-expand kind=array tags to typeof(string) limit 1000 ``` ### Parameters -| Parameter | Description | -| --------------------- | ------------------------------------------------------------------------------- | -| `ColumnName` | The name of the column that contains a dynamic array or property bag to expand. | -| `to typeof(DataType)` | Optional. Converts each expanded element to the specified type. | -| `limit=N` | Optional. Limits the number of expanded rows per record. | +| Parameter | Description | +| -------------------------------- | -------------------------------------------------------------------------------------- | +| `kind` | Optional. Specifies whether the column is a bag (object) or an array. Defaults to `array`. | +| `with_itemindex=IndexFieldName` | Optional. Outputs an additional column with the zero-based index of the expanded item. | +| `FieldName` | Required. The name of the column that contains an array or object to expand. | +| `to typeof(Typename)` | Optional. Converts each expanded element to the specified type. | +| `limit Rowlimit` | Optional. Limits the number of expanded rows per record. | ### Returns -The operator returns a table where each element of the expanded column is placed in its own row. Other columns are duplicated for each expanded row. +The operator returns a table where each element of the expanded array or each property of the expanded object is placed in its own row. Other columns are duplicated for each expanded row. ## Use case examples @@ -160,8 +168,6 @@ This query expands the `id` array into rows and counts how often each user ID ap ## List of related operators -- [expand](/apl/tabular-operators/expand-operator): Expands JSON objects into rows. Use it when you want to expand structured data, not arrays. - [project](/apl/tabular-operators/project-operator): Selects or computes columns. Use it when you want to reshape data, not expand arrays. - [summarize](/apl/tabular-operators/summarize-operator): Aggregates data across rows. Use it after expanding arrays to compute statistics. 
- [top](/apl/tabular-operators/top-operator): Returns the top N rows by expression. Use it after expansion to find the most frequent values. -- [parse_json](/apl/scalar-functions/parse-json-function): Converts JSON strings to dynamic objects. Use it before `mv-expand` when working with string-encoded arrays. diff --git a/apl/tabular-operators/overview.mdx b/apl/tabular-operators/overview.mdx index 1756415e..3fa6e298 100644 --- a/apl/tabular-operators/overview.mdx +++ b/apl/tabular-operators/overview.mdx @@ -18,11 +18,16 @@ The table summarizes the tabular operators available in APL. | [join](/apl/tabular-operators/join-operator) | Returns a dataset containing rows from two different tables based on conditions. | | [limit](/apl/tabular-operators/limit-operator) | Returns the top N rows from the input dataset. | | [lookup](/apl/tabular-operators/lookup-operator) | Returns a dataset where rows from one dataset are enriched with matching columns from a lookup table based on conditions. | +| [make-series](/apl/tabular-operators/make-series) | Returns a dataset where the specified field is aggregated into a time series. | +| [mv-expand](/apl/tabular-operators/mv-expand) | Returns a dataset where the specified field is expanded into multiple rows. | | [order](/apl/tabular-operators/order-operator) | Returns the input dataset, sorted according to the specified fields and order. | | [parse](/apl/tabular-operators/parse-operator) | Returns the input dataset with new fields added based on the specified parsing pattern. | +| [parse-kv](/apl/tabular-operators/parse-kv) | Returns a dataset where key-value pairs are extracted from a string field into individual columns. | +| [parse-where](/apl/tabular-operators/parse-where) | Returns a dataset where values from a string are extracted based on a pattern. | | [project](/apl/tabular-operators/project-operator) | Returns a dataset containing only the specified fields. | | [project-away](/apl/tabular-operators/project-away-operator) | Returns the input dataset excluding the specified fields. | | [project-keep](/apl/tabular-operators/project-keep-operator) | Returns a dataset with only the specified fields. | +| [project-rename](/apl/tabular-operators/project-rename) | Returns a dataset where the specified field is renamed according to the specified pattern. | | [project-reorder](/apl/tabular-operators/project-reorder-operator) | Returns a table with the specified fields reordered as requested followed by any unspecified fields in their original order. | | [redact](/apl/tabular-operators/redact-operator) | Returns the input dataset with sensitive data replaced or hashed. | | [sample](/apl/tabular-operators/sample-operator) | Returns a table containing the specified number of rows, selected randomly from the input dataset. | @@ -34,8 +39,3 @@ The table summarizes the tabular operators available in APL. | [union](/apl/tabular-operators/union-operator) | Returns all rows from the specified tables or queries. | | [where](/apl/tabular-operators/where-operator) | Returns a filtered dataset containing only the rows where the condition evaluates to true. | -| [make-series](/apl/tabular-operators/make-series) | Returns a dataset where the specified field is aggregated into a time series. | -| [mv-expand](/apl/tabular-operators/mv-expand) | Returns a dataset where the specified field is expanded into multiple rows. | -| [parse-kv](/apl/tabular-operators/parse-kv) | Returns a dataset where key-value pairs are extracted from a specified field. 
| -| [parse-where](/apl/tabular-operators/parse-where) | Returns a dataset where the specified field is parsed according to the specified pattern. | -| [project-rename](/apl/tabular-operators/project-rename) | Returns a dataset where the specified field is renamed according to the specified pattern. | diff --git a/apl/tabular-operators/parse-kv.mdx b/apl/tabular-operators/parse-kv.mdx index c10d92e8..ec048016 100644 --- a/apl/tabular-operators/parse-kv.mdx +++ b/apl/tabular-operators/parse-kv.mdx @@ -3,9 +3,9 @@ title: parse-kv description: 'This page explains how to use the parse-kv operator in APL.' --- -The `parse-kv` operator extracts key–value pairs from text into structured columns. Use this operator when your dataset contains unstructured or semi-structured logs where information is embedded in free-form text, such as query strings, headers, or attributes. Instead of manually parsing substrings, `parse-kv` automatically splits the input into keys and values based on delimiters that you specify. +The `parse-kv` operator parses key-value pairs from a string field into individual columns. You use it when your data is stored in a single string that contains structured information, such as `key=value` pairs. With `parse-kv`, you can extract the values into separate columns to make them easier to query, filter, and analyze. -You find this operator useful in scenarios such as analyzing log entries with embedded metadata, extracting attributes from OpenTelemetry spans, or parsing security event messages where structured fields are embedded inside a single string. +This operator is useful in scenarios where logs, traces, or security events contain metadata encoded as key-value pairs. Instead of manually splitting strings, you can use `parse-kv` to transform the data into a structured format. ## For users of other query languages @@ -14,17 +14,19 @@ If you come from other query languages, this section explains how to adjust your -In Splunk SPL, you typically use the `kv` or `extract` command to pull key–value fields from raw events. In APL, `parse-kv` serves the same purpose but offers explicit control over delimiters and the ability to map extracted keys into new columns. +In Splunk, you often use the `kv` or `extract` commands to parse key-value pairs from raw log data. In APL, you achieve similar functionality with the `parse-kv` operator. The difference is that `parse-kv` explicitly lets you define which keys to extract and what delimiters to use. - ```sql Splunk example -... | kv pairdelim=" " kvdelim="=" +... | kv pairdelim=";" kvdelim="=" keys="key1,key2,key3" ```` ```kusto APL equivalent -['sample-http-logs'] -| extend parsed = parse-kv('uri', pair_delimiter='&', kv_delimiter='=') +datatable(data:string) +[ + 'key1=a;key2=b;key3=c' +] +| parse-kv data as (key1, key2, key3) with (pair_delimiter=';', kv_delimiter='=') ``` @@ -32,20 +34,23 @@ In Splunk SPL, you typically use the `kv` or `extract` command to pull key–val -ANSI SQL does not include a built-in function to parse arbitrary key–value pairs from text. You typically rely on `SUBSTRING`, `SPLIT`, or regular expressions. In APL, `parse-kv` encapsulates this functionality in one operator, making key–value extraction more concise and efficient. +ANSI SQL does not have a direct equivalent of `parse-kv`. Typically, you would use string functions such as `SUBSTRING` or `SPLIT_PART` to manually extract key-value pairs. In APL, `parse-kv` simplifies this process by automatically extracting multiple keys in one step. 
- ```sql SQL example SELECT - SUBSTRING_INDEX(SUBSTRING_INDEX(uri, '&', 1), '=', -1) AS key1, - SUBSTRING_INDEX(SUBSTRING_INDEX(uri, '&', 2), '=', -1) AS key2 -FROM logs + SUBSTRING_INDEX(SUBSTRING_INDEX(data, ';', 1), '=', -1) as key1, + SUBSTRING_INDEX(SUBSTRING_INDEX(data, ';', 2), '=', -1) as key2, + SUBSTRING_INDEX(SUBSTRING_INDEX(data, ';', 3), '=', -1) as key3 +FROM logs; ``` ```kusto APL equivalent -['sample-http-logs'] -| extend parsed = parse-kv('uri', pair_delimiter='&', kv_delimiter='=') +datatable(data:string) +[ + 'key1=a;key2=b;key3=c' +] +| parse-kv data as (key1, key2, key3) with (pair_delimiter=';', kv_delimiter='=') ``` @@ -58,103 +63,106 @@ FROM logs ### Syntax ```kusto -parse-kv(Expression [, pair_delimiter [, kv_delimiter [, quote_delimiter]]]) +parse-kv Expression as (KeysList) with (pair_delimiter = PairDelimiter, kv_delimiter = KvDelimiter [, options...]) ``` ### Parameters -| Parameter | Type | Description | -| ----------------- | -------- | ------------------------------------------------------------------------------------ | -| `Expression` | `string` | The string expression that contains key–value pairs. | -| `pair_delimiter` | `string` | Character that separates key–value pairs. Default is a space (`' '`). | -| `kv_delimiter` | `string` | Character that separates keys from values. Default is `'='`. | -| `quote_delimiter` | `string` | Character that surrounds values with spaces or special characters. Default is `'"'`. | +| Parameter | Description | +| ---------------- | ------------------------------------------------------------------------------- | +| `Expression` | The string expression that contains the key-value pairs. | +| `KeysList` | A list of keys to extract into separate columns. | +| `pair_delimiter` | A character or string that separates key-value pairs (for example, `;` or `,`). | +| `kv_delimiter` | A character or string that separates keys and values (for example, `=` or `:`). | +| `options` | Additional parsing options, such as case sensitivity. | ### Returns -A dynamic object containing the extracted key–value pairs. You can project individual keys as new columns or keep the object for further processing. +A dataset where each specified key is extracted into its own column with the corresponding value. If a key is missing in the original string, the column is empty for that row. ## Use case examples -You can extract query parameters from HTTP request URIs to analyze how users interact with your service. +When analyzing HTTP logs, you might encounter a field where request metadata is encoded as key-value pairs. You can extract values like status and duration for easier analysis. 
**Query** ```kusto ['sample-http-logs'] -| extend kv = parse-kv(uri, pair_delimiter='&', kv_delimiter='=') -| project _time, id, method, kv +| extend kvdata = strcat('status=', status, ';req_duration_ms=', tostring(req_duration_ms)) +| parse-kv kvdata as (status, req_duration_ms) with (pair_delimiter=';', kv_delimiter='=') +| project _time, status, req_duration_ms, method, uri ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | extend kv = parse-kv(uri, pair_delimiter='&', kv_delimiter='=') | project _time, id, method, kv%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Bsample-http-logs%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28status%3D%2C%20status%2C%20%3Breq_duration_ms%3D%2C%20tostring%28req_duration_ms%29%29%20%7C%20parse-kv%20kvdata%20as%20%28status%2C%20req_duration_ms%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20status%2C%20req_duration_ms%2C%20method%2C%20uri%22%7D) **Output** -| _time | id | method | kv | -| -------------------- | ---- | ------ | --------------------------------- | -| 2025-09-01T10:01:00Z | u123 | GET | {"q":"search","page":"1"} | -| 2025-09-01T10:02:00Z | u456 | POST | {"action":"login","next":"/home"} | +| _time | status | req_duration_ms | method | uri | +| -------------------- | ------ | ----------------- | ------ | -------- | +| 2024-05-01T10:00:00Z | 200 | 120 | GET | /home | +| 2024-05-01T10:01:00Z | 404 | 35 | GET | /missing | -This query parses the query string parameters in the `uri` field and projects them into a structured dynamic object. +This query extracts status and request duration from a concatenated field and projects them alongside other useful fields. -You can extract attributes embedded in trace span IDs or metadata. +OpenTelemetry traces often include attributes stored as key-value strings. You can use `parse-kv` to extract service name and status code for trace debugging. 
**Query** ```kusto ['otel-demo-traces'] -| extend kv = parse-kv(span_id, pair_delimiter='-', kv_delimiter=':') -| project _time, trace_id, ['service.name'], kv +| extend kvdata = strcat('service.name=', ['service.name'], ';status_code=', status_code) +| parse-kv kvdata as (['service.name'], status_code) with (pair_delimiter=';', kv_delimiter='=') +| project _time, trace_id, span_id, ['service.name'], status_code, duration ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces)'] | extend kv = parse-kv(span_id, pair_delimiter='-', kv_delimiter=':') | project _time, trace_id, ['service.name'], kv%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Botel-demo-traces%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28service.name%3D%2C%20%5Bservice.name%5D%2C%20%3Bstatus_code%3D%2C%20status_code%29%20%7C%20parse-kv%20kvdata%20as%20%28%5Bservice.name%5D%2C%20status_code%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20trace_id%2C%20span_id%2C%20%5Bservice.name%5D%2C%20status_code%2C%20duration%22%7D) **Output** -| _time | trace_id | service.name | kv | -| -------------------- | --------- | --------------- | ----------------------------- | -| 2025-09-01T10:03:00Z | t789 | frontend | {"part1":"abc","part2":"xyz"} | -| 2025-09-01T10:04:00Z | t101 | checkoutservice | {"part1":"def","part2":"uvw"} | +| _time | trace_id | span_id | service.name | status_code | duration | +| -------------------- | --------- | -------- | ------------ | ------------ | ------------ | +| 2024-05-01T11:00:00Z | abc123 | span01 | frontend | 200 | 00:00:00.150 | +| 2024-05-01T11:00:01Z | def456 | span02 | cartservice | 500 | 00:00:00.320 | -This query breaks down structured identifiers in `span_id` into key–value components for easier analysis. +This query extracts the service name and status code from a synthetic key-value string for easier analysis of trace health. -You can parse structured information inside log messages, such as extracting action and result fields. +Security logs sometimes encode user and location information as key-value pairs. You can extract fields like user ID and city for investigation. 
**Query** ```kusto ['sample-http-logs'] -| extend kv = parse-kv(uri, pair_delimiter=';', kv_delimiter='=') -| project _time, status, ['geo.country'], kv +| extend kvdata = strcat('id=', id, ';city=', ['geo.city']) +| parse-kv kvdata as (id, ['geo.city']) with (pair_delimiter=';', kv_delimiter='=') +| project _time, id, ['geo.city'], status, uri ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | extend kv = parse-kv(uri, pair_delimiter=';', kv_delimiter='=') | project _time, status, ['geo.country'], kv%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Bsample-http-logs%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28id%3D%2C%20id%2C%20%3Bcity%3D%2C%20%5Bgeo.city%5D%29%20%7C%20parse-kv%20kvdata%20as%20%28id%2C%20%5Bgeo.city%5D%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20id%2C%20%5Bgeo.city%5D%2C%20status%2C%20uri%22%7D) **Output** -| _time | status | geo.country | kv | -| -------------------- | ------ | ----------- | ------------------------------------- | -| 2025-09-01T10:05:00Z | 403 | US | {"action":"access","result":"denied"} | -| 2025-09-01T10:06:00Z | 200 | DE | {"action":"login","result":"success"} | +| _time | id | geo.city | status | uri | +| -------------------- | ------- | -------- | ------ | ------ | +| 2024-05-01T12:00:00Z | user123 | Berlin | 200 | /login | +| 2024-05-01T12:01:00Z | user456 | Paris | 403 | /admin | -This query parses embedded key–value fields from the `uri` to identify blocked and successful login attempts by country. +This query extracts user ID and city information from a synthetic key-value string to help detect suspicious activity by location. ## List of related operators -- [parse-json](/apl/scalar-functions/parse-json-function): Use when the string is already in JSON format. More efficient than `parse-kv` for JSON data. -- [extract](/apl/scalar-functions/extract-function): Use for regular expression-based extraction when data is less structured. -- [split](/apl/scalar-functions/split-function): Use when you need to split strings into arrays instead of key–value objects. -- [mv-expand](/apl/tabular-operators/mv-expand-operator): Use to expand dynamic arrays or property bags into separate rows after parsing. -- [project](/apl/tabular-operators/project-operator): Use to select and rename specific extracted keys for further analysis. +- [extend](/apl/tabular-operators/extend-operator): Adds calculated columns. Use when parsing is not required but you want to create new derived columns. +- [parse](/apl/tabular-operators/parse-operator): Extracts values from a string expression without filtering out non-matching rows. Use when you want to keep all rows, including those that fail to parse. +- [project](/apl/tabular-operators/project-operator): Selects and computes columns without parsing. Use when you want to transform data rather than extract values. +- [where](/apl/tabular-operators/where-operator): Filters rows based on conditions. Use alongside parsing functions if you want more control over filtering logic. 
diff --git a/apl/tabular-operators/parse-where.mdx b/apl/tabular-operators/parse-where.mdx
index bd4e4880..fcb11d81 100644
--- a/apl/tabular-operators/parse-where.mdx
+++ b/apl/tabular-operators/parse-where.mdx
@@ -3,9 +3,9 @@ title: parse-where
 description: 'This page explains how to use the parse-where operator in APL.'
 ---
 
-The `parse-where` operator lets you extract structured values from a text expression and filter rows where the parsing succeeds. You use it when your data contains unstructured or semi-structured text, and you want to match and filter rows based on whether they fit a specific format. Unlike `parse`, which extracts fields into new columns, `parse-where` focuses on filtering rows without retaining the parsed values.
+The `parse-where` operator lets you extract values from a string expression based on a pattern and at the same time filter out rows that don’t match the pattern. This operator is useful when you want to ensure that your results contain only rows where the parsing succeeds, reducing the need for an additional filtering step.
 
-This operator is useful when you work with logs, traces, or event data that contain patterns such as HTTP requests, error messages, or identifiers. By applying `parse-where`, you can reduce large datasets to only the rows that match your format of interest.
+You can use `parse-where` when working with logs or event data that follow a known structure but may contain noise or irrelevant lines. For example, you can parse request logs to extract structured information like HTTP method, status code, or error messages, and automatically discard any rows that don’t match the format.
 
 ## For users of other query languages
 
@@ -14,16 +14,20 @@ If you come from other query languages, this section explains how to adjust your
 
 
 
-In Splunk SPL, you use the `rex` command to extract fields from raw text and filter events. In APL, the `parse-where` operator serves a similar role, but instead of extracting values, it filters rows to those where parsing succeeds. You don’t need to keep the extracted fields if you only want to filter.
+In Splunk SPL, you use the `rex` command to extract fields and then remove non-matching events in a separate filtering step. In APL, `parse-where` does both in one operation: rows that do not match the pattern are automatically excluded.
 
 
 ```sql Splunk example
-... | rex field=_raw "uri=(?<endpoint>/api/[a-z]+)"
+... | rex field=log_line "\[(?<level>\w+)\] (?<message>.+)"
````
 
 ```kusto APL equivalent
-['sample-http-logs']
-| parse-where uri with '/api/' endpoint
+datatable(log_line:string)
+[
+  '[INFO] Service started',
+  'invalid line'
+]
+| parse-where log_line with '[', level:string, '] ', message:string
 ```
 
 
 
 
 
-In SQL, you often combine string functions with a `WHERE` clause to test patterns, for example with `LIKE` or `REGEXP`. In APL, you can achieve similar filtering with `parse-where`, which only keeps rows where the pattern match succeeds. Unlike SQL, you don’t have to write separate conditions—the operator handles both parsing and filtering.
+ANSI SQL does not have a direct equivalent to `parse-where`. You often use `LIKE` or `REGEXP` functions to test string patterns and then combine them with `CASE` expressions to extract substrings. In APL, `parse-where` simplifies this by combining extraction and filtering into one operator.
```sql SQL example -SELECT * -FROM sample_http_logs -WHERE uri LIKE '/api/%' +SELECT + REGEXP_SUBSTR(log_line, '\\[(\\w+)\\]', 1, 1) AS level, + REGEXP_SUBSTR(log_line, '\\] (.+)', 1, 1) AS message +FROM logs +WHERE log_line REGEXP '\\[(\\w+)\\] (.+)'; ``` ```kusto APL equivalent -['sample-http-logs'] -| parse-where uri with '/api/' endpoint +datatable(log_line:string) +[ + '[INFO] Service started', + 'invalid line' +] +| parse-where log_line with '[', level:string, '] ', message:string ``` @@ -55,102 +65,106 @@ WHERE uri LIKE '/api/%' ### Syntax ```kusto -T | parse-where Expression with * stringConstant columnName * [stringConstant columnName ...] +parse-where [kind=kind [flags=regexFlags]] expression with [*] stringConstant columnName [:columnType] [*] ``` ### Parameters -| Parameter | Type | Description | -| ---------------- | ------ | -------------------------------------------------- | -| `Expression` | string | The text expression to parse. | -| `stringConstant` | string | A fixed text value that defines the parse pattern. | -| `columnName` | string | A placeholder for the parsed value in the pattern. | +| Parameter | Description | +| ---------------- | ------------------------------------------------------------------------------------------------------------------------------ | +| `kind` | (Optional) Specifies the parsing method. The default is `simple`. You can also specify `regex` for regular expression parsing. | +| `flags` | (Optional) Regex flags to control the behavior of pattern matching. Used only with `kind=regex`. | +| `expression` | The string expression to parse. | +| `stringConstant` | The constant parts of the pattern that must match exactly. | +| `columnName` | The name of the column where the extracted value is stored. | +| `columnType` | (Optional) The type of the extracted value (for example, `string`, `int`, `real`). | ### Returns -The operator returns only those rows where the expression matches the specified parse pattern. It does not keep or project the parsed values. +The operator returns a table with the original columns and the newly extracted columns. Rows that do not match the parsing pattern are removed. ## Use case examples -When you analyze HTTP logs, you can use `parse-where` to filter only the rows where a URL matches a specific pattern. +You want to extract the HTTP method and status code from request logs while ignoring rows that do not follow the expected format. **Query** ```kusto ['sample-http-logs'] -| parse-where uri with '/api/' endpoint -| project _time, id, method, uri, status +| parse-where uri with '/' method:string '/' * 'status=' status:string +| project _time, method, status, uri ``` -[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20parse-where%20uri%20with%20'/api/'%20endpoint%20%7C%20project%20_time%2C%20id%2C%20method%2C%20uri%2C%20status%22%7D) + **Output** -| _time | id | method | uri | status | -| -------------------- | ------ | ------ | ------------- | ------ | -| 2025-01-01T12:00:00Z | user42 | GET | /api/products | 200 | -| 2025-01-01T12:01:00Z | user17 | POST | /api/cart | 201 | +| \_time | method | status | uri | +| -------------------- | ------ | ------ | --------------------------- | +| 2025-09-01T12:00:00Z | GET | 200 | /GET/api/items?status=200 | +| 2025-09-01T12:00:05Z | POST | 500 | /POST/api/orders?status=500 | -This query keeps only log entries where the `uri` field starts with `/api/`. 
+This query extracts the method and status from the `uri` field and discards rows where the `uri` does not match the pattern. -In traces, you can use `parse-where` to filter spans whose service name follows a particular prefix. +You want to extract the service name and status code from traces, ignoring any spans that do not contain both. **Query** ```kusto ['otel-demo-traces'] -| parse-where ['service.name'] with 'frontend' suffix -| project _time, trace_id, span_id, ['service.name'], duration +| parse-where ['service.name'] with service:string +| parse-where status_code with status:string +| project _time, trace_id, span_id, service, status ``` -[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20parse-where%20%5B'service.name'%5D%20with%20'frontend'%20suffix%20%7C%20project%20_time%2C%20trace_id%2C%20span_id%2C%20%5B'service.name'%5D%2C%20duration%22%7D) + **Output** -| _time | trace_id | span_id | service.name | duration | -| -------------------- | --------- | -------- | ------------- | ---------------- | -| 2025-01-01T12:00:00Z | abc123 | span01 | frontend | 00:00:01.2000000 | -| 2025-01-01T12:00:01Z | def456 | span02 | frontendproxy | 00:00:00.5000000 | +| _time | trace_id | span_id | service | status | +| -------------------- | --------- | -------- | --------------- | ------ | +| 2025-09-01T13:00:00Z | abc123 | span01 | frontend | 200 | +| 2025-09-01T13:00:01Z | def456 | span02 | checkoutservice | 500 | -This query filters spans where the `service.name` starts with `frontend`. +This query ensures that only spans with a valid service name and status code are included. -In security analysis, you can filter logs to only those where the request URI matches a suspicious pattern. +You want to extract the user ID and HTTP status from logs to identify failed login attempts. **Query** ```kusto ['sample-http-logs'] -| parse-where uri with '/admin/' path -| project _time, id, method, uri, status, ['geo.country'] +| parse-where uri with '/login?id=' id:string '&status=' status:string +| where status != '200' +| project _time, id, status, uri ``` -[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20parse-where%20uri%20with%20'/admin/'%20path%20%7C%20project%20_time%2C%20id%2C%20method%2C%20uri%2C%20status%2C%20%5B'geo.country'%5D%22%7D) + **Output** -| _time | id | method | uri | status | geo.country | -| -------------------- | ------ | ------ | ------------- | ------ | ----------- | -| 2025-01-01T13:00:00Z | user99 | GET | /admin/login | 403 | US | -| 2025-01-01T13:05:00Z | user22 | POST | /admin/config | 500 | DE | +| _time | id | status | uri | +| -------------------- | ------- | ------ | ----------------------------- | +| 2025-09-01T14:00:00Z | user123 | 401 | /login?id=user123\&status=401 | +| 2025-09-01T14:05:00Z | user456 | 403 | /login?id=user456\&status=403 | -This query keeps only suspicious requests that access admin endpoints. +This query extracts user IDs and statuses from login attempts and filters out successful logins. ## List of related operators -- [parse](/apl/tabular-operators/parse-operator): Extracts structured data into columns instead of filtering rows. Use it when you want to keep the parsed values. -- [extract](/apl/scalar-functions/extract-function): Extracts data from text using regular expressions. Use it when you need flexible pattern matching and want to capture values. 
-- [where](/apl/tabular-operators/where-operator): Filters rows based on Boolean expressions. Use it for direct conditions instead of pattern parsing. -- [search](/apl/tabular-operators/search-operator): Searches across multiple columns for specific text. Use it for broader keyword searches. -- [project](/apl/tabular-operators/project-operator): Selects and computes specific columns. Use it after filtering to refine the result set. +- [extend](/apl/tabular-operators/extend-operator): Adds calculated columns. Use when parsing is not required but you want to create new derived columns. +- [parse](/apl/tabular-operators/parse-operator): Extracts values from a string expression without filtering out non-matching rows. Use when you want to keep all rows, including those that fail to parse. +- [project](/apl/tabular-operators/project-operator): Selects and computes columns without parsing. Use when you want to transform data rather than extract values. +- [where](/apl/tabular-operators/where-operator): Filters rows based on conditions. Use alongside parsing functions if you want more control over filtering logic. diff --git a/apl/tabular-operators/project-rename.mdx b/apl/tabular-operators/project-rename.mdx index 9a90f1e1..d1e4251c 100644 --- a/apl/tabular-operators/project-rename.mdx +++ b/apl/tabular-operators/project-rename.mdx @@ -86,7 +86,7 @@ When analyzing HTTP logs, you might want to rename fields to shorter or more des | project-rename city = ['geo.city'], country = ['geo.country'] ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | project-rename city = ['geo.city'], country = ['geo.country']%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20project-rename%20city%20%3D%20%5B'geo.city'%5D%2C%20country%20%3D%20%5B'geo.country'%5D%22%7D) **Output** @@ -106,19 +106,19 @@ When inspecting distributed traces, you can rename service-related fields to mat ```kusto ['otel-demo-traces'] -| project-rename service = ['service.name'], code = status_code +| project-rename service = ['service.name'] ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces)'] | project-rename service = ['service.name'], code = status_code%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20project-rename%20service%20%3D%20%5B'service.name'%5D%22%7D) **Output** -| _time | duration | span_id | trace_id | service | kind | code | -| -------------------- | ------------ | -------- | --------- | --------------- | ------ | ---- | -| 2025-09-01T09:55:00Z | 00:00:01.200 | abc123 | trace789 | frontend | server | 200 | -| 2025-09-01T09:56:00Z | 00:00:00.450 | def456 | trace790 | checkoutservice | client | 500 | +| _time | duration | span_id | trace_id | service | kind | +| -------------------- | ------------ | -------- | --------- | --------------- | ------ | +| 2025-09-01T09:55:00Z | 00:00:01.200 | abc123 | trace789 | frontend | server | +| 2025-09-01T09:56:00Z | 00:00:00.450 | def456 | trace790 | checkout | client | -This query renames `service.name` to `service` and `status_code` to `code`, making them shorter for downstream filtering. 
+This query renames `service.name` to `service`, making it shorter for downstream filtering.
 
 
 
@@ -132,11 +132,11 @@ For security-related HTTP log analysis, you can rename status and URI fields to
 | project-rename http_status = status, url = uri
 ```
 
-[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | project-rename http_status = status, url = uri%22%7D)
+[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20project-rename%20http_status%20%3D%20status%2C%20url%20%3D%20uri%22%7D)
 
 **Output**
 
-| _time | req_duration_ms | id | http_status | url | method | ['geo.city'] | ['geo.country'] |
+| _time | req_duration_ms | id | http_status | url | method | geo.city | geo.country |
 | -------------------- | ----------------- | ----- | ------------ | ------ | ------ | ------------- | ---------------- |
 | 2025-09-01T11:00:00Z | 150               | user5 | 403          | /admin | POST   | Madrid        | ES               |
 | 2025-09-01T11:02:00Z | 200               | user6 | 500          | /login | POST   | Rome          | IT               |
 
 This query renames `status` to `http_status` and `uri` to `url`, making the output easier to interpret.
 
 ## List of related operators
 
-- [project](/apl/tabular-operators/project-operator): Lets you select and rename columns at the same time. Use it when you want to control which columns appear in the result.
 - [extend](/apl/tabular-operators/extend-operator): Creates new calculated columns. Use it when you want to add columns rather than rename existing ones.
+- [project](/apl/tabular-operators/project-operator): Lets you select and rename columns at the same time. Use it when you want to control which columns appear in the result.
 - [project-away](/apl/tabular-operators/project-away-operator): Removes specific columns from the dataset. Use it when you want to drop columns rather than rename them.
-- [rename](/apl/tabular-operators/rename-operator): Renames columns without projecting. Similar to `project-rename`, but does not allow simultaneous projection. Use `project-rename` when you want flexible renaming in the projection pipeline.
 - [summarize](/apl/tabular-operators/summarize-operator): Aggregates data into groups. Use it when you want to compute metrics rather than adjust column names.
 
From 391e61264ea3aa2a7d3a9fc8d9aa0dd459447d7a Mon Sep 17 00:00:00 2001
From: Mano Toth
Date: Tue, 2 Sep 2025 13:04:52 +0200
Subject: [PATCH 3/6] Fixes

---
 apl/tabular-operators/parse-kv.mdx    |  8 ++--
 apl/tabular-operators/parse-where.mdx | 66 ++-------------------------
 2 files changed, 8 insertions(+), 66 deletions(-)

diff --git a/apl/tabular-operators/parse-kv.mdx b/apl/tabular-operators/parse-kv.mdx
index ec048016..bf2d1d84 100644
--- a/apl/tabular-operators/parse-kv.mdx
+++ b/apl/tabular-operators/parse-kv.mdx
@@ -63,7 +63,7 @@ datatable(data:string)
 ### Syntax
 
 ```kusto
-parse-kv Expression as (KeysList) with (pair_delimiter = PairDelimiter, kv_delimiter = KvDelimiter [, options...])
+parse-kv Expression as (KeysList) with (pair_delimiter = PairDelimiter, kv_delimiter = KvDelimiter [, Options...])
 ```
 
 ### Parameters
@@ -72,9 +72,9 @@ parse-kv Expression as (KeysList) with (pair_delimiter = PairDelimiter, kv_delim
 | ---------------- | ------------------------------------------------------------------------------- |
 | `Expression`     | The string expression that contains the key-value pairs.                         |
 | `KeysList`       | A list of keys to extract into separate columns.                                 
| -| `pair_delimiter` | A character or string that separates key-value pairs (for example, `;` or `,`). | -| `kv_delimiter` | A character or string that separates keys and values (for example, `=` or `:`). | -| `options` | Additional parsing options, such as case sensitivity. | +| `PairDelimiter` | A character or string that separates key-value pairs (for example, `;` or `,`). | +| `KvDelimiter` | A character or string that separates keys and values (for example, `=` or `:`). | +| `Options` | Additional parsing options, such as case sensitivity. | ### Returns diff --git a/apl/tabular-operators/parse-where.mdx b/apl/tabular-operators/parse-where.mdx index fcb11d81..120d4879 100644 --- a/apl/tabular-operators/parse-where.mdx +++ b/apl/tabular-operators/parse-where.mdx @@ -83,85 +83,27 @@ parse-where [kind=kind [flags=regexFlags]] expression with [*] stringConstant co The operator returns a table with the original columns and the newly extracted columns. Rows that do not match the parsing pattern are removed. -## Use case examples +## Use case example - - - -You want to extract the HTTP method and status code from request logs while ignoring rows that do not follow the expected format. +You want to extract the HTTP method and status code from request logs while ignoring rows that don’t follow the expected format. **Query** ```kusto -['sample-http-logs'] +['http-logs'] | parse-where uri with '/' method:string '/' * 'status=' status:string | project _time, method, status, uri ``` - - **Output** -| \_time | method | status | uri | +| _time | method | status | uri | | -------------------- | ------ | ------ | --------------------------- | | 2025-09-01T12:00:00Z | GET | 200 | /GET/api/items?status=200 | | 2025-09-01T12:00:05Z | POST | 500 | /POST/api/orders?status=500 | This query extracts the method and status from the `uri` field and discards rows where the `uri` does not match the pattern. - - - -You want to extract the service name and status code from traces, ignoring any spans that do not contain both. - -**Query** - -```kusto -['otel-demo-traces'] -| parse-where ['service.name'] with service:string -| parse-where status_code with status:string -| project _time, trace_id, span_id, service, status -``` - - - -**Output** - -| _time | trace_id | span_id | service | status | -| -------------------- | --------- | -------- | --------------- | ------ | -| 2025-09-01T13:00:00Z | abc123 | span01 | frontend | 200 | -| 2025-09-01T13:00:01Z | def456 | span02 | checkoutservice | 500 | - -This query ensures that only spans with a valid service name and status code are included. - - - - -You want to extract the user ID and HTTP status from logs to identify failed login attempts. - -**Query** - -```kusto -['sample-http-logs'] -| parse-where uri with '/login?id=' id:string '&status=' status:string -| where status != '200' -| project _time, id, status, uri -``` - - - -**Output** - -| _time | id | status | uri | -| -------------------- | ------- | ------ | ----------------------------- | -| 2025-09-01T14:00:00Z | user123 | 401 | /login?id=user123\&status=401 | -| 2025-09-01T14:05:00Z | user456 | 403 | /login?id=user456\&status=403 | - -This query extracts user IDs and statuses from login attempts and filters out successful logins. - - - - ## List of related operators - [extend](/apl/tabular-operators/extend-operator): Adds calculated columns. Use when parsing is not required but you want to create new derived columns. 
From 15a53809b297c575d60392803baaf7b0fe91b739 Mon Sep 17 00:00:00 2001
From: Mano Toth
Date: Tue, 2 Sep 2025 13:07:45 +0200
Subject: [PATCH 4/6] Fixes

---
 apl/tabular-operators/parse-kv.mdx | 60 +----------------------------
 1 file changed, 1 insertion(+), 59 deletions(-)

diff --git a/apl/tabular-operators/parse-kv.mdx b/apl/tabular-operators/parse-kv.mdx
index bf2d1d84..61f6e0e6 100644
--- a/apl/tabular-operators/parse-kv.mdx
+++ b/apl/tabular-operators/parse-kv.mdx
@@ -80,10 +80,7 @@ parse-kv Expression as (KeysList) with (pair_delimiter = PairDelimiter, kv_delim
 
 A dataset where each specified key is extracted into its own column with the corresponding value. If a key is missing in the original string, the column is empty for that row.
 
-## Use case examples
-
-
-
+## Use case example
 
 When analyzing HTTP logs, you might encounter a field where request metadata is encoded as key-value pairs. You can extract values like status and duration for easier analysis.
 
@@ -91,13 +88,11 @@ When analyzing HTTP logs, you might encounter a field where request metadata is
 
 ```kusto
 ['sample-http-logs']
 | extend kvdata = strcat('status=', status, ';req_duration_ms=', tostring(req_duration_ms))
 | parse-kv kvdata as (status, req_duration_ms) with (pair_delimiter=';', kv_delimiter='=')
 | project _time, status, req_duration_ms, method, uri
 ```
 
-[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Bsample-http-logs%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28status%3D%2C%20status%2C%20%3Breq_duration_ms%3D%2C%20tostring%28req_duration_ms%29%29%20%7C%20parse-kv%20kvdata%20as%20%28status%2C%20req_duration_ms%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20status%2C%20req_duration_ms%2C%20method%2C%20uri%22%7D)
-
 **Output**
 
 | _time                | status | req_duration_ms | method | uri      |
@@ -107,59 +102,6 @@ When analyzing HTTP logs, you might encounter a field where request metadata is
 
 This query extracts status and request duration from a concatenated field and projects them alongside other useful fields.
 
-
-
-OpenTelemetry traces often include attributes stored as key-value strings. You can use `parse-kv` to extract service name and status code for trace debugging.
-
-**Query**
-
-```kusto
-['otel-demo-traces']
-| extend kvdata = strcat('service.name=', ['service.name'], ';status_code=', status_code)
-| parse-kv kvdata as (['service.name'], status_code) with (pair_delimiter=';', kv_delimiter='=')
-| project _time, trace_id, span_id, ['service.name'], status_code, duration
-```
-
-[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Botel-demo-traces%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28service.name%3D%2C%20%5Bservice.name%5D%2C%20%3Bstatus_code%3D%2C%20status_code%29%20%7C%20parse-kv%20kvdata%20as%20%28%5Bservice.name%5D%2C%20status_code%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20trace_id%2C%20span_id%2C%20%5Bservice.name%5D%2C%20status_code%2C%20duration%22%7D)
-
-**Output**
-
-| _time                | trace_id  | span_id  | service.name | status_code  | duration     |
-| -------------------- | --------- | -------- | ------------ | ------------ | ------------ |
-| 2024-05-01T11:00:00Z | abc123    | span01   | frontend     | 200          | 00:00:00.150 |
-| 2024-05-01T11:00:01Z | def456    | span02   | cartservice  | 500          | 00:00:00.320 |
-
-This query extracts the service name and status code from a synthetic key-value string for easier analysis of trace health.
- - - - -Security logs sometimes encode user and location information as key-value pairs. You can extract fields like user ID and city for investigation. - -**Query** - -```kusto -['sample-http-logs'] -| extend kvdata = strcat('id=', id, ';city=', ['geo.city']) -| parse-kv kvdata as (id, ['geo.city']) with (pair_delimiter=';', kv_delimiter='=') -| project _time, id, ['geo.city'], status, uri -``` - -[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5Bsample-http-logs%5D%20%7C%20extend%20kvdata%20%3D%20strcat%28id%3D%2C%20id%2C%20%3Bcity%3D%2C%20%5Bgeo.city%5D%29%20%7C%20parse-kv%20kvdata%20as%20%28id%2C%20%5Bgeo.city%5D%29%20with%20%28pair_delimiter%3D%27%3B%27%2C%20kv_delimiter%3D%27%3D%27%29%20%7C%20project%20_time%2C%20id%2C%20%5Bgeo.city%5D%2C%20status%2C%20uri%22%7D) - -**Output** - -| _time | id | geo.city | status | uri | -| -------------------- | ------- | -------- | ------ | ------ | -| 2024-05-01T12:00:00Z | user123 | Berlin | 200 | /login | -| 2024-05-01T12:01:00Z | user456 | Paris | 403 | /admin | - -This query extracts user ID and city information from a synthetic key-value string to help detect suspicious activity by location. - - - - ## List of related operators - [extend](/apl/tabular-operators/extend-operator): Adds calculated columns. Use when parsing is not required but you want to create new derived columns. From 55083ec53aeec52de5ea1868d8e6799503eb8ae3 Mon Sep 17 00:00:00 2001 From: Mano Toth Date: Wed, 3 Sep 2025 11:16:56 +0200 Subject: [PATCH 5/6] Update mv-expand.mdx --- apl/tabular-operators/mv-expand.mdx | 91 ++++------------------------- 1 file changed, 11 insertions(+), 80 deletions(-) diff --git a/apl/tabular-operators/mv-expand.mdx b/apl/tabular-operators/mv-expand.mdx index 4cf5a907..3807b314 100644 --- a/apl/tabular-operators/mv-expand.mdx +++ b/apl/tabular-operators/mv-expand.mdx @@ -58,12 +58,6 @@ CROSS JOIN UNNEST(request_uris) AS t(value) mv-expand [kind=(bag|array)] [with_itemindex=IndexFieldName] FieldName [to typeof(Typename)] [limit Rowlimit] ``` -### Example - -```kusto -mv-expand kind=array tags to typeof(string) limit 1000 -``` - ### Parameters | Parameter | Description | @@ -78,93 +72,30 @@ mv-expand kind=array tags to typeof(string) limit 1000 The operator returns a table where each element of the expanded array or each property of the expanded object is placed in its own row. Other columns are duplicated for each expanded row. -## Use case examples +## Use case example - - - -When analyzing logs, request URIs can sometimes be stored as arrays. You can use `mv-expand` to expand them into individual rows for easier filtering. +When analyzing logs, some values can be stored as arrays. You can use `mv-expand` to expand them into individual rows for easier filtering. 
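+
+For intuition, a single row whose `territories` value is an array such as `['United States', 'India']` becomes one row per element after expansion. The following is a minimal sketch with a literal array, assuming the field holds a `dynamic` array, which is what `mv-expand` operates on:
+
+```kusto
+// Hypothetical two-element array: mv-expand emits one output row per element
+datatable(id:string, territories:dynamic)
+[
+  'row1', dynamic(['United States', 'India'])
+]
+| mv-expand territories
+```
+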
**Query** ```kusto ['sample-http-logs'] -| mv-expand uri -| summarize count() by uri -| top 5 by count_ +| limit 100 +| mv-expand territories +| summarize count = count() by territory_name = tostring(territories) ``` -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | mv-expand uri | summarize count() by uri | top 5 by count_%22%7D) +[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20limit%20100%20%7C%20mv-expand%20territories%20%7C%20summarize%20count%20%3D%20count()%20by%20territory_name%20%3D%20tostring(territories)%22%7D) **Output** -| uri | count_ | +| territory_name | count | | ---------------- | ------- | -| /api/v1/products | 1200 | -| /api/v1/cart | 950 | -| /api/v1/checkout | 720 | -| /api/v1/login | 650 | -| /api/v1/profile | 500 | - -This query expands the `uri` array into rows and counts the most frequent request URIs. - - - - -Traces often include multiple span IDs grouped under a single trace. You can use `mv-expand` to expand span IDs for detailed analysis. - -**Query** - -```kusto -['otel-demo-traces'] -| mv-expand span_id -| summarize count() by ['service.name'] -``` - -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['otel-demo-traces)'] | mv-expand span_id | summarize count() by ['service.name']%22%7D) - -**Output** - -| service.name | count_ | -| --------------------- | ------- | -| frontend | 4100 | -| cartservice | 3800 | -| checkoutservice | 3600 | -| productcatalogservice | 3400 | -| loadgenerator | 3000 | - -This query expands the `span_id` field to count how many spans each service generates. - - - - -In security logs, user IDs can appear as arrays if multiple accounts are affected by a single event. `mv-expand` helps isolate them for per-user inspection. - -**Query** - -```kusto -['sample-http-logs'] -| mv-expand id -| summarize count() by id, status -| top 5 by count_ -``` - -[Run in Playground]([https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22['sample-http-logs)'] | mv-expand id | summarize count() by id, status | top 5 by count_%22%7D) - -**Output** - -| id | status | count_ | -| ------- | ------ | ------- | -| user123 | 401 | 320 | -| user456 | 403 | 250 | -| user789 | 200 | 220 | -| user111 | 500 | 180 | -| user222 | 404 | 150 | - -This query expands the `id` array into rows and counts how often each user ID appears with a given status. +| United States | 67 | +| India | 22 | +| Japan | 12 | - - +This query expands the `territories` array into rows and counts the most frequent territories. 
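+
+The optional arguments from the syntax section compose with this pattern. For example, the following sketch records each element’s position with `with_itemindex` and casts the expanded values with `to typeof()`. The `territory_index` name is illustrative:
+
+```kusto
+// Each expanded row carries the position of its element in the original array
+['sample-http-logs']
+| limit 100
+| mv-expand with_itemindex=territory_index territories to typeof(string)
+| project _time, territory_index, territories
+```
+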
## List of related operators From 967c0f5d6a61d50e04eba2a1b2dc5fb4ab51a182 Mon Sep 17 00:00:00 2001 From: Mano Toth Date: Wed, 3 Sep 2025 11:20:18 +0200 Subject: [PATCH 6/6] Fixes --- apl/apl-features.mdx | 1 + apl/tabular-operators/overview.mdx | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/apl/apl-features.mdx b/apl/apl-features.mdx index df194cae..150962fe 100644 --- a/apl/apl-features.mdx +++ b/apl/apl-features.mdx @@ -288,6 +288,7 @@ keywords: ['axiom documentation', 'documentation', 'axiom', 'APL', 'axiom proces | Tabular operator | [parse-where](/apl/tabular-operators/parse-where) | Returns a dataset where values from a string are extracted based on a pattern. | | Tabular operator | [project-away](/apl/tabular-operators/project-away-operator) | Returns the input dataset excluding the specified fields. | | Tabular operator | [project-keep](/apl/tabular-operators/project-keep-operator) | Returns a dataset with only the specified fields. | +| Tabular operator | [project-rename](/apl/tabular-operators/project-rename) | Returns a dataset where the specified field is renamed according to the specified pattern. | | Tabular operator | [project-reorder](/apl/tabular-operators/project-reorder-operator) | Returns a table with the specified fields reordered as requested followed by any unspecified fields in their original order. | | Tabular operator | [project](/apl/tabular-operators/project-operator) | Returns a dataset containing only the specified fields. | | Tabular operator | [redact](/apl/tabular-operators/redact-operator) | Returns the input dataset with sensitive data replaced or hashed. | diff --git a/apl/tabular-operators/overview.mdx b/apl/tabular-operators/overview.mdx index 3fa6e298..f8774625 100644 --- a/apl/tabular-operators/overview.mdx +++ b/apl/tabular-operators/overview.mdx @@ -38,4 +38,3 @@ The table summarizes the tabular operators available in APL. | [top](/apl/tabular-operators/top-operator) | Returns the top N rows from the dataset based on the specified sorting criteria. | | [union](/apl/tabular-operators/union-operator) | Returns all rows from the specified tables or queries. | | [where](/apl/tabular-operators/where-operator) | Returns a filtered dataset containing only the rows where the condition evaluates to true. | -