Configuring agent.logging.level through agent policy do not work for managed agents #2851

nchaulet · 2023-06-13T13:02:12Z

Issue

We now have a feature that allows adding arbitrary keys to the agent policy sent to the agent: elastic/kibana#159414 (comment)

I tried to set the agent log level this way for Fleet-managed agents, but the agent still seems to log at the default log level. When I run elastic-agent inspect, I can see that my agent.logging.level: debug config is there, but it does not seem to be used by the agent.

Definition of done

The logging level set in the policy is taken into account
Test confirming the logging level change is working as expected
Test covering the logging level change from the settings action too
Test covering a nonexistent logging level target


cmacknz · 2023-06-13T18:18:42Z

Seems like the logging level in the policy is ignored in the policy change handler; we always persist the log level we started with.

elastic-agent/internal/pkg/agent/application/actions/handlers/handler_action_policy_change.go

Line 204 in 409682e

reader, err := fleetToReader(h.agentInfo, h.config)

elastic-agent/internal/pkg/agent/application/actions/handlers/handler_action_policy_change.go

Line 274 in 409682e

"logging.level": cfg.Settings.LoggingConfig.Level,

We only support changing the log level through the SETTINGS action but I have no idea why we did it this way.

elastic-agent/internal/pkg/agent/application/actions/handlers/handler_action_settings.go

Line 53 in 409682e

err := lvl.Unpack(action.LogLevel)

jlind23 · 2023-06-13T18:45:57Z

But agent debug logs can still be turned on with the UI change and not the API call Nicolas mentioned, am I correct?

cmacknz · 2023-06-13T18:54:00Z

Yes, this is unintuitive but you can still change the log level from the UI with the SETTINGS action.

jlind23 · 2023-06-13T18:57:30Z

So at least we have a workaround for this..

pchila · 2023-07-03T10:06:00Z

@cmacknz I had a look at the code and I guess that we ignore the log level in the handler_policy_change_action due to an issue with deserialization:
the log level is defined as an int8 within the agent, where 0 corresponds to the info level, and we use a map to convert the string we receive in the policy. When unmarshaling a policy that does not contain the log level (all of them, unless we use the new Fleet entry point), we are probably going to get and use the zero value (an empty string), which will likely throw an error (to be tested).

To solve this issue we have to optionally unpack the log level only when the key is specified and fall back on the configured value if it's not. Another possible solution would be to transform the Level into a *Level and then add the current log level value if the specified Level is nil (probably a bit cleaner) but we can't change it easily as it's part of elastic-agent-libs... 😞
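The fallback idea above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual agent or elastic-agent-libs code: the `Level` type, the `levelNames` map, `policyLogging` and `effectiveLevel` are all hypothetical stand-ins. The only point is that a pointer field lets an absent key be told apart from any real value.

```go
package main

import (
	"fmt"
	"strings"
)

// Level is a hypothetical stand-in for the int8-based log level type.
// Info is the zero value, as described in the comment above.
type Level int8

const (
	DebugLevel Level = iota - 1
	InfoLevel
	WarnLevel
	ErrorLevel
)

// levelNames converts the string received in a policy into a Level.
var levelNames = map[string]Level{
	"debug":   DebugLevel,
	"info":    InfoLevel,
	"warning": WarnLevel,
	"error":   ErrorLevel,
}

// parseLevel rejects unknown names, including the empty string.
func parseLevel(s string) (Level, error) {
	lvl, ok := levelNames[strings.ToLower(s)]
	if !ok {
		return InfoLevel, fmt.Errorf("unknown log level %q", s)
	}
	return lvl, nil
}

// policyLogging models the logging fragment of a policy (assumed shape).
// A nil Level means the policy did not specify the key at all.
type policyLogging struct {
	Level *string
}

// effectiveLevel unpacks the policy level only when the key is present
// and otherwise falls back to the currently configured level.
func effectiveLevel(p policyLogging, current Level) (Level, error) {
	if p.Level == nil {
		return current, nil
	}
	lvl, err := parseLevel(*p.Level)
	if err != nil {
		// Keep the current level when the policy value is invalid.
		return current, err
	}
	return lvl, nil
}

func main() {
	debug := "debug"
	lvl, err := effectiveLevel(policyLogging{Level: &debug}, InfoLevel)
	fmt.Println(lvl, err)

	// No key in the policy: the current level is kept.
	lvl, err = effectiveLevel(policyLogging{}, WarnLevel)
	fmt.Println(lvl, err)
}
```

With this shape, a policy without the key never reaches the string-to-level conversion, so the empty-string zero value cannot trigger an error.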

cmacknz · 2023-07-04T15:39:12Z

but we can't change it easily as it's part of elastic-agent-libs...

The two largest users of elastic-agent-libs are Agent and Beats (and things that implement a Beat outside of the Beats repository). As long as the change wouldn't be breaking for Beats we could do this if it makes more sense, and if it is breaking we could just make it an option or variant of the existing behavior.

michalpristas · 2023-07-17T09:57:03Z

Bringing a bit of history into this.
The proposed change should not work, given how the agent is designed. I see this as a feature request, not a bug.

The idea behind POLICY and SETTINGS was this:
POLICY contains information about Inputs, Output and Fleet connection info
SETTINGS is the agent-specific settings

When we start thinking about supporting setting the log level using settings, I'd like us to stop for a while and think about how this should behave first.

At this time it's simple: the default log level is info, and we persist it as a setting. Once there's a need to switch to debug, you switch to debug and we persist it as a configuration option, which overrides the default one. Once you're done, you go back to info.

Once you introduce setting the log level using POLICY, at the start we end up with a pair info:nil, meaning the default set using settings is info, and it is not set using policy.

Then you change the policy to use warn, ending up with info:warn. The question is which one has priority; we don't have information about the source of the first one.

Even if we consider not setting info at the start, ending up with nil:warn, we will use warn.

Now a debug level set from SETTINGS comes in: we have debug:warn, using debug with a higher priority as it is per-agent config.

The question is how we reset per-agent config. If we set whatever, we will end up with whatever:warn, with whatever always taking priority even with future policy changes.

When I take a look at diagnostics, I should be able to say deterministically what the log level is based on Policy and Settings, without worrying about order.

From my point of view we should either

prevent agent config being set in POLICY (as it was until now)
add Inherit log level or Reset function on Fleet UI which would result in pair nil:whatever

Getting rid of SETTINGS does not make sense. When debugging is needed, changing the log level to debug for potentially 100k agents would create a spike in processing requirements, memory and eventually cost.

@cmacknz is offline this week, pinging @joshdover: have we thought through the use cases for this?

joshdover · 2023-07-17T12:23:49Z

Thanks for the history, @michalpristas. Makes sense why this is behaving this way.

I think this is somewhat clear already, but just to be concise the use cases we have are:

1. Have a sensible default log level (info)
2. Set the log level for a group of agents (policy)
   - Example use case: trying to find an infrequent bug that is happening in a pool of agents pulling from an SQS queue
3. Increase the log verbosity on a single agent as a one-off to debug an issue, without affecting other agents on the policy

IMO the precedence order should be:

Default to 1
2 always overrides 1
3 always overrides 2

In general we should track which agents have a one-off setting applied via the SETTINGS action and should flag those to the user in the UI. This is essentially "drift detection" for agents that are deviating from their policy, and the user should be encouraged to reset those agents back to the policy.

add Inherit log level or Reset function on Fleet UI which would result in pair nil:whatever

Yeah I agree with this, there needs to be some way to reset (3) and start using (2) again.
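To make the proposed precedence concrete, here is a small sketch of how the three sources could be resolved. It is only an illustration under assumptions: `levelSources`, `effective`, `resetSettings` and `strPtr` are hypothetical names, not part of the agent or Fleet code, and levels are kept as plain strings for brevity.

```go
package main

import "fmt"

// defaultLevel is use case (1): the sensible default.
const defaultLevel = "info"

// levelSources tracks where a log level came from. A nil pointer means
// that source has not set a level.
type levelSources struct {
	Policy   *string // use case (2): shared by all agents on the policy
	Settings *string // use case (3): per-agent override via a SETTINGS action
}

// effective applies the precedence: settings override policy, and
// policy overrides the default.
func (s levelSources) effective() string {
	if s.Settings != nil {
		return *s.Settings
	}
	if s.Policy != nil {
		return *s.Policy
	}
	return defaultLevel
}

// resetSettings drops the per-agent override so the agent follows the
// policy (or the default) again.
func (s *levelSources) resetSettings() {
	s.Settings = nil
}

// strPtr is a small helper for building optional string values.
func strPtr(v string) *string { return &v }

func main() {
	var s levelSources
	fmt.Println(s.effective()) // nothing set: default

	s.Policy = strPtr("warning")
	fmt.Println(s.effective()) // policy overrides default

	s.Settings = strPtr("debug")
	fmt.Println(s.effective()) // per-agent setting overrides policy

	s.resetSettings()
	fmt.Println(s.effective()) // back to the policy level
}
```

Because each source is stored separately, the result depends only on which sources are set and not on the order in which they arrived, which is the determinism asked for earlier in the thread.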

So in conclusion, I agree this is not a bug, but a new feature and use case: controlling the log level via the agent policy. We've seen this request before and it should be prioritized, but as an enhancement and not a bug.

joshdover · 2023-07-17T12:24:30Z

@pierrehilbert If you agree, let's deprioritize this one for now.

pierrehilbert · 2023-07-17T15:48:25Z

Thanks everyone for your insights and I agree with the precedence order mentioned by @joshdover.
I changed the issue type and priority according to this new agreement.
Still keeping this in the current sprint for now, but with a lower priority compared to other items, and we will see if we need some other adjustments.

michalpristas · 2023-07-18T09:15:21Z

Keep in mind that in order for this to work properly we need reset functionality or a policy log level in Fleet in place. Without this, once you set the log level using settings, all log levels coming from the policy will be ignored.

From a planning perspective it does not make sense to work on this until the Fleet part is at least scheduled for development.

joshdover · 2023-07-19T10:39:43Z

Agreed, we will need support for this in Fleet. It's likely a small enhancement, but needs to be planned for.

elasticmachine · 2024-04-26T03:30:22Z

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

nchaulet added the bug Something isn't working label Jun 13, 2023

nchaulet mentioned this issue Jun 13, 2023

[Fleet] Allow to overrides agent policy elastic/kibana#159414

Merged

2 tasks

cmacknz added the Team:Elastic-Agent Label for the Agent team label Jun 13, 2023

jlind23 assigned pchila Jun 14, 2023

pchila mentioned this issue Jul 17, 2023

Support log level setting from policy #3090

Merged

3 tasks

pierrehilbert added enhancement New feature or request and removed bug Something isn't working labels Jul 17, 2023

juliaElastic mentioned this issue Apr 9, 2024

[Fleet] Support changing the default log level per policy (or globally) elastic/kibana#158861

Closed

nimarezainia mentioned this issue Apr 10, 2024

Reduce the amount the agent logs by default #4252

Open

pchila mentioned this issue Apr 11, 2024

Refactor action policy change handler #4563

Merged

2 tasks

ycombinator added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Apr 26, 2024

pchila mentioned this issue May 8, 2024

Log raw events to a separate log file #4549

Merged

17 tasks

pchila closed this as completed in #3090 May 14, 2024

pchila mentioned this issue Jul 12, 2024

Agent policy logging level is not applied to agents upgraded from pre-8.15.0 #5116

Open
