The core of Grafana Agent is considered stable and suitable for production use. Features and other functionality that are subject to change and are not recommended for production use will be tagged interchangeably as either "beta" or "experimental."
Host Filtering implements a form of "dumb sharding": operators may deploy one Grafana Agent instance per machine in a cluster, all using the same configuration, and each Agent will only scrape targets that are running on the same node as that Agent. Running with `host_filter: true` means that if you have a target whose host machine is not also running a Grafana Agent process, that target will not be scraped!
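A minimal config sketch with host filtering enabled might look like the following. This assumes the v0.x static-mode layout with a top-level `prometheus` block, and the `remote_write` URL is a placeholder:

```yaml
prometheus:
  configs:
    - name: node-local
      # Only scrape targets discovered on the same machine as this Agent.
      host_filter: true
      scrape_configs:
        - job_name: kubernetes-pods
          kubernetes_sd_configs:
            - role: pod
      remote_write:
        # Assumed endpoint; replace with your own remote_write URL.
        - url: https://prometheus.example.com/api/v1/write
```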
Host Filtering is usually paired with a dedicated Agent process that is used for scraping targets running outside of a given cluster. For example, when running the Grafana Agent on GKE, you would have a DaemonSet with `host_filter` for scraping in-cluster targets, and a single dedicated Deployment for scraping other targets that are not running on a cluster node, such as the Kubernetes control plane API.
If you want to scale your scrape load without host filtering, you may use the scraping service instead.
The host name of the Agent is determined by reading `$HOSTNAME`. If `$HOSTNAME` isn't defined, the Agent will use Go's `os.Hostname` to determine the hostname.
The following meta-labels are used to determine whether a target is running on the same machine as the Agent:

- `__address__`
- `__meta_consul_node`
- `__meta_dockerswarm_node_id`
- `__meta_dockerswarm_node_hostname`
- `__meta_dockerswarm_node_address`
- `__meta_kubernetes_pod_node_name`
- `__meta_kubernetes_node_name`
- `__host__`
The final label, `__host__`, isn't a label added by any Prometheus service discovery mechanism. Rather, `__host__` can be generated by using `host_filter_relabel_configs`. This allows for custom relabeling rules to determine the hostname where the predefined ones fail. Relabeling rules added with `host_filter_relabel_configs` are temporary and only used for the host filtering mechanism; full relabeling rules should be applied in the appropriate `scrape_config` instead. Note that `scrape_config` `relabel_configs` do not apply to the host filtering logic; only `host_filter_relabel_configs` will work.
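As a sketch of how `__host__` might be generated, the following populates it from the EC2 private DNS name exposed by Prometheus EC2 service discovery (the region and instance name are illustrative):

```yaml
prometheus:
  configs:
    - name: ec2-local
      host_filter: true
      # Temporary relabel rules, used only by the host filtering logic.
      host_filter_relabel_configs:
        - source_labels: [__meta_ec2_private_dns_name]
          target_label: __host__
      scrape_configs:
        - job_name: ec2
          ec2_sd_configs:
            - region: us-east-1  # assumed region
```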
If the determined hostname matches the value of any of these meta-labels, the discovered target is allowed. Otherwise, the target is ignored and will not show up in the targets API.
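The decision above can be sketched as a simple predicate. This is a simplification: for example, the real implementation must also handle details such as stripping the port from `__address__` before comparing:

```go
package main

import "fmt"

// hostFilterLabels are the meta-labels checked against the Agent's hostname.
var hostFilterLabels = []string{
	"__address__",
	"__meta_consul_node",
	"__meta_dockerswarm_node_id",
	"__meta_dockerswarm_node_hostname",
	"__meta_dockerswarm_node_address",
	"__meta_kubernetes_pod_node_name",
	"__meta_kubernetes_node_name",
	"__host__",
}

// shouldScrape reports whether any of the target's host-filtering
// meta-labels matches the Agent's hostname.
func shouldScrape(hostname string, labels map[string]string) bool {
	for _, name := range hostFilterLabels {
		if v, ok := labels[name]; ok && v == hostname {
			return true
		}
	}
	return false
}

func main() {
	target := map[string]string{"__meta_kubernetes_node_name": "node-1"}
	fmt.Println(shouldScrape("node-1", target)) // prints "true"
	fmt.Println(shouldScrape("node-2", target)) // prints "false"
}
```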
The Grafana Agent defines a concept of a Prometheus Instance, which is its own mini Prometheus-lite server. The Instance runs a combination of Prometheus service discovery, scraping, a WAL for storage, and `remote_write`.
Instances allow for fine-grained control of what data gets scraped and where it gets sent. Users can easily define two Instances that scrape different subsets of metrics and send them to two completely different `remote_write` systems.
Instances are especially relevant to the scraping service mode, where breaking up your scrape configs into multiple Instances is required for sharding and balancing scrape load across a cluster of Agents.
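For example, two Instances sending different metric subsets to different backends might be sketched like this (the job names, targets, and endpoint URLs are all illustrative):

```yaml
prometheus:
  configs:
    # Instance 1: application metrics to one backend.
    - name: app-metrics
      scrape_configs:
        - job_name: app
          static_configs:
            - targets: ['app:8080']
      remote_write:
        - url: https://cortex-a.example.com/api/v1/write
    # Instance 2: infrastructure metrics to a different backend.
    - name: infra-metrics
      scrape_configs:
        - job_name: node
          static_configs:
            - targets: ['node-exporter:9100']
      remote_write:
        - url: https://cortex-b.example.com/api/v1/write
```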
The v0.5.0 release of the Agent introduced the concept of Instance sharing, which combines the `scrape_configs` from compatible Instance configs into a single, shared Instance. Instance configs are compatible when they have no differences in configuration, with the exception of what they scrape. `remote_write` configs may also differ in the order in which endpoints are declared, but the unsorted `remote_write` configs must still be an exact match.
In the shared Instances mode, the `name` field of `remote_write` configs is ignored. The resulting `remote_write` configs will have a name composed of the first six characters of the group name and the first six characters of the hash of that `remote_write` config, separated by a `-`. For example, a group named `default` whose `remote_write` hash begins with `4e3c1a` would produce the name `defaul-4e3c1a`.
The shared Instances mode is the new default, and the previous behavior is deprecated. If you wish to restore the old behavior, set `instance_mode: distinct` in the `prometheus_config` block of your config file.
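As a sketch, assuming the `prometheus_config` block corresponds to the top-level `prometheus` key of the config file:

```yaml
prometheus:
  # instance_mode controls whether compatible Instance configs are
  # combined ("shared", the default) or kept separate ("distinct").
  instance_mode: distinct
  configs:
    - name: example
      # ...
```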
Shared Instances are completely transparent to the user, with the exception of exposed metrics. With `instance_mode: shared`, metrics for Prometheus components (WAL, service discovery, `remote_write`, etc.) have an `instance_group_name` label, which is the hash of all settings used to determine the shared instance. When `instance_mode: distinct` is set, the metrics for Prometheus components will instead have an `instance_name` label, which matches the name set on the individual Instance config. It is recommended to keep the default of `instance_mode: shared` unless you really need granular metrics and don't mind the performance hit.
Users can use the targets API to see all scraped targets and the name of the shared instance each target was assigned to.