filelog receiver reads old logs even when start_at is set to end #36091
Comments
Just to be clear, neither The |
Perhaps related to #32727. Not sure though |
Maybe, what's happening is:
|
I need to check if it's even possible that k8s does that. The files I am parsing are all and only kubernetes log files (pod logs). Just to be clear: what you are saying is that k8s might create a new log file (maybe And one more clarification before I start looking at the files (I think I will look into whether there are files with creation_timestamp that is more recent than the first log row's timestamp). |
I confirm that this issue is NOT due to k8s renaming files, but it happens at startup. I am deploying a stack of agent replicas via helm charts. When I run
(this is due to our observability backend that collects data sent via OTEL agent rejecting the logs with 422 because they have outdated timestamps). Is it possible that a change in the agent config/code makes the receiver start from the beginning of the logs at startup, even if we specify |
No, the receiver will not read a moved file at all. It will recognize it as the same file it's previously seen and ignore it (unless there are new lines to read, in which case it will read those)
Correct
Pretty much, but to be more specific, it is only relevant the first time the receiver polls for files. If there is a storage extension with previously saved state, then the receiver knows it is not the first time it's polled for files. |
If the files appear after the receiver has already been running, then it knows they are new files and reads them from the beginning. |
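The behavior described in the replies above can be summarized as a small decision function. This is only a sketch under stated assumptions: `initialOffset` and its parameters are hypothetical names for illustration, not the receiver's actual code, and "first poll" is taken to mean that no previously saved state exists in a storage extension.

```go
package main

import "fmt"

// initialOffset is a hypothetical helper (not the receiver's real code).
// It models the behavior described in this thread:
//   - a file the receiver has already seen resumes at its saved offset
//   - start_at is only consulted the first time the receiver polls
//     (saved state in a storage extension means it is not the first poll)
//   - a file that shows up on any later poll is read from the beginning
func initialOffset(startAt string, firstPoll, known bool, savedOffset, size int64) int64 {
	if known {
		return savedOffset
	}
	if firstPoll && startAt == "end" {
		return size
	}
	return 0
}

func main() {
	// First poll, start_at: end -> skip the existing content.
	fmt.Println(initialOffset("end", true, false, 0, 500)) // 500
	// New file found on a later poll -> read from the beginning.
	fmt.Println(initialOffset("end", false, false, 0, 500)) // 0
	// Known (for example moved) file -> only read lines past the saved offset.
	fmt.Println(initialOffset("end", false, true, 120, 500)) // 120
}
```

Under this model, old log lines would only be read with `start_at: end` when a file is treated as new on a poll that is not the first one.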
We also have a need to read files from the end even after the receiver starts. For example, when we are already using storage and need to add a huge new file, but there is no need to read it from the beginning |
Maybe it would be better to use the start_at setting together with the storage extension so that it applies not only when the receiver starts, but also to all new files (those with offset = 0)? I could try to implement this |
It's not clear to me what you're suggesting. How should the receiver (with or without the storage extension) know whether a file is a new file that should be consumed entirely vs a new file whose contents should be ignored initially? |
I suggest adding an additional condition to check the offset when reading the file. This additional check could look something like this:

    if !f.FromBeginning || (m.Offset <= 0 && f.StartAt == "end") {
        var info os.FileInfo
        if info, err = r.file.Stat(); err != nil {
            return nil, fmt.Errorf("stat: %w", err)
        }
        r.Offset = info.Size()
    }

The point is that if the offset is 0 then this is a new file and it is better not to read it from the beginning, unless the option startAt explicitly has the value "beginning" |
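To make the proposed condition easier to reason about, here is a stand-alone sketch of it. The function `proposedOffset` and its parameters are hypothetical: they stand in for `f.FromBeginning`, `m.Offset`, `f.StartAt` and the result of `r.file.Stat()` from the snippet above, and the sketch assumes the offset is otherwise left at its saved value.

```go
package main

import "fmt"

// proposedOffset is a hypothetical, simplified version of the suggested check.
// size stands in for info.Size(); savedOffset stands in for m.Offset.
func proposedOffset(fromBeginning bool, savedOffset int64, startAt string, size int64) int64 {
	if !fromBeginning || (savedOffset <= 0 && startAt == "end") {
		// Jump to the end of the file, skipping existing content.
		return size
	}
	// Assumption: otherwise the saved offset is kept unchanged.
	return savedOffset
}

func main() {
	// New file (offset 0) with start_at "end": existing content is skipped.
	fmt.Println(proposedOffset(true, 0, "end", 500)) // 500
	// New file with start_at "beginning": read from the start.
	fmt.Println(proposedOffset(true, 0, "beginning", 500)) // 0
	// Previously read file: continue from the saved offset.
	fmt.Println(proposedOffset(true, 120, "end", 500)) // 120
}
```

One consequence of this condition is that any file whose saved offset is still 0 is treated the same as a new file, so with "end" its existing content would be skipped.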
Hi @djaglowski I am still seeing an issue where filelog starts reading from the beginning of files despite a fix that was mentioned in this other issue.
That issue mentions that when the max concurrent files limit is reached, the receiver enters "batch" mode and starts reading from the beginning of the files, regardless of what "start_at" is set to.
I made sure start_at is set to end, and I have checked the receiver's logs: nothing mentions that any batching happened. Also, given the number of log files I have in the system, I can pretty much guarantee that the default limit of 1024 concurrent files has never been reached.
For instance, on 16/10/24 13:22:24.363 CEST, the receiver read a log line with timestamp 14/05/24 14:40:41.042 CEST, which is >5 months old.
From this screenshot you can see:
BUNDLE_TIMESTAMP is when the log line was parsed (technically it's when it reached our system, but it's likely just a few seconds earlier)
timestamp is the log's timestamp
delta is the discrepancy between the two. The logs you see are 4 months old
The purple bits above show where the data falls over time. You can see that this did not happen constantly but only twice, with many very old log lines being read all at once
This is how I configured the receiver: