01.-What is a data series? (01:29)
07.-Functions of the stats Command (06:56)
08.-Transforming commands Summary (01:18)
10.-Functions of the Eval Command (04:24)
11.-Eval as a Function (01:06)
A data series is a sequence of related data points that are plotted in a visualization:
- Single Series: Compares values of a single data category.
- Multi-Series: Compares values of two or more data categories.
- Time-Series: Compares values over time, and can be either single or multi-series.
Transforming commands (`chart`, `timechart`, `top`, `rare`, `stats`) can be used in searches to organize your results into a statistical table containing a data series that can be visualized. We will learn how transforming commands can be used to structure searches so that they generate the results you need for the visualization you want.
The `chart` command takes results and returns them formatted as a table that can be displayed as a visualization.
- The Y-axis is defined with `stats-func(field)`, where:
  - `stats-func` is a supported statistical function.
  - `field` is a field with numeric values.
- `over row-split` specifies the X-axis and defines the first column in the resulting table. It creates a single series.
- `by column-split` further splits the data, resulting in a multi-series data series.
- Additional arguments give further control over the results:
  - With categorical fields, Splunk by default displays individual columns for the top 10 values found in the field used for the multi-series split (the field after `by`).
  - `span` (with numerical fields): groups the events into buckets. Splunk shifts values that fall on a boundary into the higher group: 400 falls into 400-500 instead of 300-400.
  - `limit`: overrides the default top 10 with whatever whole number you find appropriate. Values less frequent than those kept by the `limit` argument are aggregated into the `OTHER` column. The `limit` argument applies across the entire dataset, and `limit=0` means no limit at all. When there are too many values to display inside the legend, the list includes a down arrow to scroll through the values.
  - `useother=true/false`: visually removes the `OTHER` column. There is no recalculation and the search is not run again.
  - `usenull`: removes the `NULL` column if one exists. This column holds the events that do not contain the field used to create the multi-series split.
Each time you invoke the `stats` command, you can use one or more functions. However, you can only use one `by` clause.
Unlike the `stats` command, the `chart` command can only be split over two fields or dimensions. A `chart` command with two arguments after the `by` clause is equivalent to using an `over` and a `by` clause. The values of the second split are represented by individually colored columns.
Some examples:
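As a minimal sketch of the `chart` arguments described above: the first two searches should produce the same table, one with `over` and `by`, the other with two fields after `by`. The third adds `limit`, `useother`, and `usenull`. The index name `web` and the choice of the `host` and `status` fields are assumptions for illustration only.

```spl
index=web sourcetype=access_combined
| chart count over host by status

index=web sourcetype=access_combined
| chart count by host, status

index=web sourcetype=access_combined
| chart limit=5 count over host by status useother=false usenull=false
```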
The `timechart` command performs stats aggregations against time and returns a time-series chart or table where the `_time` field is always the X-axis. `stats-func(field)` populates the Y-axis. `count` is the only function that does not require a field.
Functions and arguments used in `stats` and `chart` can also be used with `timechart`.
`by <split-by-field>` splits the result table. A key difference between `chart` and `timechart` is that `timechart` only supports a single additional split. This is because the X-axis is automatically segmented, or bucketed, based on time. Each distinct value of the `split-by-field` becomes a series.
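A minimal multi-series time chart might look like the search below. The index name `network` is hypothetical; `usage` is assumed to be a categorical field present in the events, so each of its distinct values becomes one series.

```spl
index=network
| timechart count by usage
```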
The functional equivalent of the search using `chart` would be a `chart count` split by time and usage, but `timechart` automatically applies a bucket command to set the time span to a preset sampling interval that depends on the time range of the search. We can see this reflected in the stats table output: each row represents a chunk, or bucket, of aggregated data.
Time range | Default time bucket |
---|---|
last 30 days | 1 day |
last 7 days | 1 day |
last 24 hours | 30 minutes |
last hour | 1 minute |
last 15 minutes | 10 seconds |
When the time span Splunk picks is not appropriate, you can override it with the `span` argument, which forces Splunk to bucket the events by the time span you specify.
The `limit` argument controls the number of values returned for the multi-series split. Without it, we get the top 10 values plus an additional `OTHER` series, for a total of 11 lines in the chart.
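The following sketch combines both arguments, overriding the default bucket size and keeping only the five most frequent values of the split field. The index name `network`, the `usage` field, and the chosen values (`12h`, `5`) are illustrative assumptions.

```spl
index=network earliest=-7d
| timechart span=12h limit=5 count by usage
```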
Without `span`, `timechart` automatically aggregates the count in one-day buckets when the time range is 7 days.
When running a multi-series timechart, we can choose how the data is displayed. Because a timechart can look cluttered, we can toggle a feature called multi-series mode, found under Format > General > Multi-series mode. It separates the series into their own trendlines, which share the same X-axis but have individual Y-axes with the same floor and ceiling values.
The `top` command finds the most common values of a given list of fields in a result set. We can group results based on a shared field with the `by` clause.
By default, it outputs the top 10 results in table format. This can be overridden with the `limit` argument.
- `countfield`: renames the count field to the string you specify.
- `showperc`: defaults to true. `showperc=f` or `showperc=0` suppresses the percentage column.
Which IP addresses generated the most attacks in the last 60 minutes? Without any argument, we get the `count`, or number of events, and the percentage.
With only one field: the 10 most common values in the group field. Column values are unique.
With two fields: all combinations of group and name are unique.
Note that when you use this method, the `limit` argument defaults to 20.
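As a sketch of the `top` options listed above, the first search answers the IP address question with a renamed count column and no percentage column; the second groups the result with a `by` clause. The index `security`, the sourcetype `linux_secure`, the search phrase, and the `src_ip` and `user` fields are assumptions made for illustration.

```spl
index=security sourcetype=linux_secure "failed password" earliest=-60m
| top limit=5 showperc=false countfield=attempts src_ip

index=security sourcetype=linux_secure "failed password" earliest=-60m
| top limit=3 user by src_ip
```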
The `rare` command is essentially the opposite of the `top` command: it returns the least common values of a result set. It has the same options. By default, results are sorted in ascending order based on count.
We produce statistics from our search results with the `stats` command. The output is a table. The `by <field list>` clause groups the results by each distinct value of each field in the field list. Unlike `chart` and `timechart`, `stats` allows continuous splitting of your data (`timechart` splits by only one field, and `chart` by two).
Use the `as` clause to rename the resulting column and override the default column name. It is very convenient for avoiding confusion when statistics are calculated for several fields.
In `stats`, statistical functions can be applied to multiple fields in a single command.
`count` differs from `count(field)` in that the former counts all events and the latter counts only events with a value in the field.
The order of fields in `by <field list>` has a big impact on the search results, as the data is grouped by the first field given, then by the second, and so on.
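The points above can be sketched in a single search: several functions in one `stats` call, each renamed with `as`, and a `by` clause with two fields. Swapping `host` and `status` changes the grouping order of the rows. The index `web` and the field names are assumptions; `clientip` is assumed to be missing from some events, which is what makes `count` and `count(clientip)` differ.

```spl
index=web sourcetype=access_combined
| stats count as events, count(clientip) as events_with_ip, dc(clientip) as unique_visitors by host, status
```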
There are four categories of statistical functions:
- Aggregate: Summarizes event values to create a single value.
  - `count`, `count(x)`, `dc(x)` or `distinct_count(x)`, `estdc(x)`, and `estdc_error(x)`. The `estdc` functions work with the estimated count of the distinct values in the field specified.
  - `min(x)`, `max(x)`, `range(x)`.
  - `sum(x)`, `sumsq(x)`: sum and sum of squares.
  - `avg(x)`, `median(x)`, `mode(x)`. The average ignores events without a value, or without numeric values, in the field.
  - `stdev(x)`, `stdevp(x)`, `var(x)`, `varp(x)`.
  - `percentile<percentile>(x)` or `p<percentile>(x)`, `upperperc<percentile>(x)`, `exactperc<percentile>(x)`.
- Event Order: Returns values from fields based on processing order.
  - `first(x)`: the first seen value in the field x.
  - `last(x)`: the last seen value in the field x.
- Multivalue: Returns a list of values for a field.
- Time: Returns values based on time.
  - `earliest(x)` and `latest(x)`: return the chronologically earliest (oldest) or latest (most recent) seen occurrence of a value in a field.
  - `earliest_time(x)` and `latest_time(x)`: return the UNIX time of the earliest (oldest) or latest (most recent) occurrence, which is useful when calculating the rate of increase for an accumulating counter.
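A short sketch that draws one or two functions from the aggregate, event order, and time categories above. The index `web` and the `bytes`, `uri`, `status`, and `host` fields are assumptions for illustration.

```spl
index=web sourcetype=access_combined
| stats avg(bytes) as avg_bytes, max(bytes) as max_bytes, first(uri) as first_seen_uri, latest(status) as newest_status, latest_time(status) as newest_status_epoch by host
```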
`stats`, `chart`, and `timechart` share similar features. Use the proper command to get the results you want: `stats` for a table, and the others for visualizations.
The `eval` command performs calculations with the values in our data. An eval expression is a combination of literals, fields, operators, and functions that represents the value of the destination field. The command calculates the expression and puts the result into a new field or overwrites an existing one. A new field is created on the fly and can be used like any regular field in the remainder of the search.
Nothing written with the `eval` command outlives the search it was used in. The new field is not saved in the index, nor is it available after the search completes. Even when `eval` temporarily overwrites the values of an existing field, the change is not permanent: the new values are never written to disk.
An eval expression can involve the following operations and operators:
- Mathematical operations.
- String concatenation. Use `+` for strings or characters and `.` for any data type.
- Comparison expressions.
- Boolean expressions.
- A call to an `eval` function.
Field values are case-sensitive.
Strings must be double-quoted.
Field names must be single-quoted when they contain special characters.
There are 11 categories of evaluation functions:
- Comparison & Conditional
- Conversion
- Cryptographic (md5, sha1, sha256, sha512)
- Date & time
- Informational
- JSON
- Mathematical (round(X,Y), pow). A `round` without `Y` returns `X` as an integer.
- Multivalue
- Statistical (avg, max, min, random). `random` returns a pseudo-random integer ranging from zero to 2³¹-1.
- Text
- Trigonometry and hyperbolic
Generally, evaluation functions evaluate an expression based on the events and return a result, but some do not evaluate any expression and instead return a result based on their own functionality.
Create five random groups of users.
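One way to sketch this: `random()` takes no argument, and the modulo operator `%` maps its result to a value from 0 to 4, giving five groups. The index `web`, the `clientip` field standing in for a user, and the field name `user_group` are assumptions. The group is assigned per event, so the same user can land in more than one group.

```spl
index=web sourcetype=access_combined
| eval user_group = random() % 5
| stats dc(clientip) as users by user_group
```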
Eval usage without functions.
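A minimal sketch using only operators, with no function calls: a mathematical operation and a `.` concatenation, separated by a comma in one `eval`. The index `web` and the `bytes`, `method`, and `uri` fields are assumed to exist; the new field names are illustrative.

```spl
index=web sourcetype=access_combined
| eval megabytes = bytes / 1024 / 1024, request = method . " " . uri
```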
The `eval` command can be used as a function within the `stats` command.
Nest `eval` inside a `stats count` to count events with a calculated value.
This requires an `as` clause to rename the field.
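For instance, the search below counts only the events whose `status` is 400 or higher, next to the total per host. The index `web` and the `status` and `host` fields are assumptions for illustration.

```spl
index=web sourcetype=access_combined
| stats count(eval(status>=400)) as errors, count as total by host
```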
The `rename` command helps to display a more useful or meaningful field name.
Double-quote the new field name if it contains any special character.
Multiple fields can be renamed in a single command, separated by commas.
Once a field is renamed, Splunk only recognizes the new name of that field in the rest of the search.
Wildcards can be used when renaming fields.
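A sketch of these renaming rules, under assumed index and field names: the first search renames two fields in one command and quotes the new names because they contain spaces; the second uses a wildcard so that every field ending in `ip` gets a new suffix.

```spl
index=web sourcetype=access_combined
| stats sum(bytes) as total_bytes by clientip
| rename clientip as "Client IP", total_bytes as "Total Bytes"

index=web sourcetype=access_combined
| rename *ip as *_address
```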
The `sort` command sorts in ascending order by default. `-` changes it to descending order. `+` is implicit, so it is not required.
Double-quote the field name when it contains special characters or white space.
Limit the number of results with the `limit` argument, or just put an integer.
Splunk determines the data type of the values in the field and sorts appropriately:
- Alphabetic strings: lexicographically, with uppercase letters before lowercase.
- Numbers: numerically.
- Combination: lexicographically or numerically, depending on the first character.
Pay attention to white space after `-`: with a space, `-` applies to both fields, so both are sorted in descending order. Without a space, `-` applies only to the first field, and the second field is sorted in ascending order.
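The searches below sketch the white-space rule as described above, plus the two ways of limiting results. The index `sales` and the `sale_price` and `Vendor` fields are hypothetical. In the first search `-` is followed by a space, in the second it is attached to `sale_price`; the last two keep only 20 results.

```spl
index=sales
| table Vendor, sale_price
| sort - sale_price Vendor

index=sales
| table Vendor, sale_price
| sort -sale_price Vendor

index=sales
| sort limit=20 -sale_price

index=sales
| sort 20 -sale_price
```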