feat(docker_logs source): add checkpointing #24869
vincentbernat wants to merge 2 commits into vectordotdev:master
Conversation
The test is flaky. It's easy to fix, but then it is less meaningful. The main issue is that Docker logs don't have a cursor, so we use a timestamp (with a precision of 1 second). We have to choose between duplicating logs and losing logs. Currently, this loses logs. We could switch to duplicating logs instead.
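To make the trade-off concrete, here is a minimal sketch, not the code from this PR. `ResumePolicy`, `resume_since`, and `replay` are hypothetical names, and the sketch assumes the `since` filter is inclusive and limited to whole seconds, as described in the comment above.

```rust
/// How to resume when the checkpoint only has one-second precision.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ResumePolicy {
    /// Re-read the checkpointed second: may duplicate lines, never skips them.
    AllowDuplicates,
    /// Start at the following second: never duplicates, may skip lines.
    AllowLoss,
}

/// Returns the `since` value (whole seconds) to request on restart.
pub fn resume_since(checkpoint_secs: i64, policy: ResumePolicy) -> i64 {
    match policy {
        ResumePolicy::AllowDuplicates => checkpoint_secs,
        ResumePolicy::AllowLoss => checkpoint_secs + 1,
    }
}

/// Simulates a replay: keeps the messages whose timestamp is >= `since`.
/// This stands in for the log stream and assumes an inclusive filter.
pub fn replay<'a>(log: &[(i64, &'a str)], since: i64) -> Vec<&'a str> {
    log.iter()
        .filter(|(ts, _)| *ts >= since)
        .map(|(_, msg)| *msg)
        .collect()
}
```

Suppose lines "a" and "b" both carry timestamp 10, line "c" carries 11, and only "a" was forwarded before shutdown, so the checkpoint is 10. With `AllowDuplicates` the replay yields "a", "b", "c" ("a" is sent twice). With `AllowLoss` it yields only "c" ("b" is never sent).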
Out of sheer curiosity, wouldn't it be possible to reuse the existing logic for log files for container logs? This would require accessing [...] AFAIK, this is also how Promtail approaches this problem.
It would work if the Docker daemon is local. It does not work if it is remote or if Vector is running in Docker itself (with access to the socket). If Docker stored the logs in /var/lib/docker/logs, this would have been simpler, since it would be easy to share them with Vector.
You mean when using Swarm? Yeah, this would be a downside.
Hmm, just mount
You get access to far more than the logs. This would be a concern. Another solution would be to use another provider, like fluent, which is supported by both Docker and Vector, but I don't know if Docker would buffer the logs if Vector is temporarily unavailable.
Summary
Like the file source, the docker_logs source now records checkpoints in a file at a regular interval (5s) and at shutdown. The checkpointing logic is mostly stolen from the file source, and maybe it would be possible to abstract it later.
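As an illustration of the idea, here is a minimal standard-library sketch of a per-container checkpoint file. It is not the PR's implementation: the `Checkpoints` type, the function names, and the `container_id timestamp` line format are all assumptions made for this example.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Last timestamp (whole seconds) seen per container ID. Hypothetical type.
pub type Checkpoints = HashMap<String, i64>;

/// Serializes checkpoints as `container_id timestamp` lines (assumed format).
pub fn encode(checkpoints: &Checkpoints) -> String {
    let mut lines: Vec<String> = checkpoints
        .iter()
        .map(|(id, ts)| format!("{} {}", id, ts))
        .collect();
    lines.sort(); // deterministic output, independent of map iteration order
    lines.join("\n")
}

/// Parses the format written by `encode`, skipping malformed lines.
pub fn decode(data: &str) -> Checkpoints {
    data.lines()
        .filter_map(|line| {
            let (id, ts) = line.split_once(' ')?;
            Some((id.to_string(), ts.trim().parse::<i64>().ok()?))
        })
        .collect()
}

/// Writes to a temporary file, then renames it over the target, so a crash
/// in the middle of a write leaves the previous checkpoint file in place.
pub fn save(path: &Path, checkpoints: &Checkpoints) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, encode(checkpoints))?;
    fs::rename(&tmp, path)
}

/// Loads checkpoints; a missing file simply means "no checkpoint yet".
pub fn load(path: &Path) -> io::Result<Checkpoints> {
    match fs::read_to_string(path) {
        Ok(data) => Ok(decode(&data)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Checkpoints::new()),
        Err(e) => Err(e),
    }
}
```

In this sketch the source would call `save` from a 5-second timer and once more at shutdown, and call `load` once at startup.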
This also introduces an option to start from the last message. If `since_now` is true, the source does not look at old logs. This is false by default, so it is a behavior change, but I think it is a better default. We can still switch this if needed.
Also, with the recent changes around documentation, I don't know if I should generate the documentation myself or not. I am creating this PR as a draft as I am not familiar with the code base and I am quite junior in Rust.
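The `since_now` rule described in this summary could be sketched as the small function below. The name `start_since` is hypothetical. The sketch also assumes that an existing checkpoint takes precedence over `since_now`, which the description does not state explicitly.

```rust
/// Where to start reading a container's logs, as a UNIX timestamp in seconds.
/// Hypothetical helper, not taken from the PR.
pub fn start_since(checkpoint: Option<i64>, since_now: bool, now_secs: i64) -> i64 {
    match (checkpoint, since_now) {
        // Assumption: a stored checkpoint always wins, so a restart resumes
        // where the previous run stopped.
        (Some(ts), _) => ts,
        // No checkpoint and `since_now` set: ignore old logs.
        (None, true) => now_secs,
        // No checkpoint and `since_now` unset (the default): read from the start.
        (None, false) => 0,
    }
}
```

With no checkpoint, `since_now = true` starts at the current time, while the default (`false`) starts from the beginning of the available logs.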
Vector configuration
https://github.com/akvorado/akvorado/blob/main/docker/vector.yaml
(but I have also relied a lot on the integration tests).
How did you test this PR?
See above.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
References
docker_logs source #7358