
feat(docker_logs source): add checkpointing #24869

Draft
vincentbernat wants to merge 2 commits into vectordotdev:master from vincentbernat:feature/docker-logs-checkpoint


@vincentbernat

Summary

As with the file source, the docker_logs source now records checkpoints in a file at a regular interval (5s) and at shutdown. The checkpointing logic is mostly borrowed from the file source; it may be possible to abstract it later.
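To make the idea concrete, here is a minimal sketch of a per-container checkpoint table that can be written to and read back from a file. It is not the code from this PR: the `Checkpoints` type, its methods, and the one-line-per-container text format are all hypothetical, and only the standard library is used. A caller would invoke `save` on a timer (every 5 seconds in the PR) and once more at shutdown.

```rust
use std::collections::BTreeMap;
use std::{fs, io, path::Path};

/// Hypothetical checkpoint table: container ID -> Unix timestamp (seconds)
/// of the last log line that was forwarded.
#[derive(Debug, Default, PartialEq)]
pub struct Checkpoints {
    last_seen: BTreeMap<String, i64>,
}

impl Checkpoints {
    /// Record a position for a container; the position never moves backwards.
    pub fn update(&mut self, container_id: &str, timestamp: i64) {
        let entry = self
            .last_seen
            .entry(container_id.to_string())
            .or_insert(timestamp);
        *entry = (*entry).max(timestamp);
    }

    /// Last recorded position for a container, if any.
    pub fn get(&self, container_id: &str) -> Option<i64> {
        self.last_seen.get(container_id).copied()
    }

    /// Serialize as one "<container_id> <timestamp>" line per container.
    pub fn encode(&self) -> String {
        self.last_seen
            .iter()
            .map(|(id, ts)| format!("{id} {ts}\n"))
            .collect()
    }

    /// Parse the format written by `encode`, skipping malformed lines.
    pub fn decode(text: &str) -> Self {
        let mut checkpoints = Checkpoints::default();
        for line in text.lines() {
            if let Some((id, ts)) = line.split_once(' ') {
                if let Ok(ts) = ts.trim().parse::<i64>() {
                    checkpoints.update(id, ts);
                }
            }
        }
        checkpoints
    }

    /// Write to a temporary file first, then rename it over the target, so
    /// that a crash in the middle of a write does not leave a truncated file.
    pub fn save(&self, path: &Path) -> io::Result<()> {
        let tmp = path.with_extension("tmp");
        fs::write(&tmp, self.encode())?;
        fs::rename(&tmp, path)
    }

    /// Load a table previously written by `save`.
    pub fn load(path: &Path) -> io::Result<Self> {
        Ok(Self::decode(&fs::read_to_string(path)?))
    }
}
```

The write-then-rename step is a design choice of this sketch, not something stated in the PR; it is a common way to keep a checkpoint file readable even if the process dies during a save.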

This also introduces an option to start from the last message: if since_now is true, the source does not read old logs. The option is false by default, which is a behavior change, but I think it is a better default. We can still switch this if needed.
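The interaction between a stored checkpoint and since_now can be sketched as a small decision function. This is an illustration under assumptions, not the PR's code: the function name `start_timestamp` is hypothetical, and the rule that an existing checkpoint takes precedence over since_now is inferred from the note that the default only changes behavior on first run.

```rust
/// Hypothetical helper: choose the Unix timestamp (seconds) from which logs
/// are requested for one container.
///
/// Assumption: a stored checkpoint always wins; `since_now` only matters when
/// no checkpoint exists yet (for example on the first run).
pub fn start_timestamp(checkpoint: Option<i64>, since_now: bool, now: i64) -> i64 {
    match checkpoint {
        // Resume where the previous run stopped.
        Some(timestamp) => timestamp,
        // No checkpoint and `since_now` set: skip old logs.
        None if since_now => now,
        // No checkpoint and `since_now` unset: read from the beginning.
        None => 0,
    }
}
```

With since_now left at false and no checkpoint, the sketch returns 0, that is, all retained logs are read.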

Also, with the recent changes around documentation, I don't know if I should generate the documentation myself or not. I am creating this PR as a draft as I am not familiar with the code base and I am quite junior in Rust.

Vector configuration

https://github.com/akvorado/akvorado/blob/main/docker/vector.yaml
(but I have also relied a lot on the integration tests).

How did you test this PR?

See above.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes (small one, change of the default behavior on first run)
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Fix vectordotdev#7358
@github-actions
Contributor

github-actions bot commented Mar 8, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

github-actions bot added the "domain: sources" (Anything related to the Vector's sources) and "domain: external docs" (Anything related to Vector's external, public documentation) labels on Mar 8, 2026
@vincentbernat
Author

The test is flaky. It's easy to fix, but then the test becomes less meaningful. The main issue is that Docker logs don't have a cursor, so we use a timestamp (with a precision of 1 second). We have to choose between duplicating logs and losing logs. Currently, the source loses logs. We could switch to duplicating them.
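The trade-off can be illustrated with a small model. Everything here is hypothetical (`LogLine`, `ResumePolicy`, `resume_since`, `replay`), and it assumes the log API returns every line whose timestamp, truncated to whole seconds, is greater than or equal to the requested value.

```rust
/// Hypothetical log line with a millisecond-precision timestamp.
#[derive(Debug, Clone, PartialEq)]
pub struct LogLine {
    pub timestamp_ms: i64,
    pub message: &'static str,
}

/// What to accept when resuming from a checkpoint with 1-second precision.
pub enum ResumePolicy {
    /// Ask again for the checkpointed second: lines already sent are repeated.
    AllowDuplicates,
    /// Skip the checkpointed second: lines not yet sent in it are dropped.
    AllowLoss,
}

/// Timestamp (seconds) to request logs from after a restart.
pub fn resume_since(checkpoint_secs: i64, policy: ResumePolicy) -> i64 {
    match policy {
        ResumePolicy::AllowDuplicates => checkpoint_secs,
        ResumePolicy::AllowLoss => checkpoint_secs + 1,
    }
}

/// Model of the log API: return the lines at or after `since_secs`,
/// comparing at 1-second precision.
pub fn replay(lines: &[LogLine], since_secs: i64) -> Vec<&LogLine> {
    lines
        .iter()
        .filter(|line| line.timestamp_ms / 1000 >= since_secs)
        .collect()
}
```

Suppose three lines were written at 10.1 s, 10.4 s and 10.9 s, and the source stopped after forwarding the first two, with a checkpoint of 10. Under `AllowDuplicates` the replay returns all three lines, so the first two are sent twice; under `AllowLoss` it returns none, so the third is never sent.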

@vincentbernat
Author

I have read the CLA Document and I hereby sign the CLA

@simonhammes
Contributor

The main issue is that docker logs don't have a cursor, so we use a timestamp (with a precision of 1 second). We can choose between duplicate logs or losing logs. Currently, this is losing logs. We could switch to duplicate logs.

Out of sheer curiosity, wouldn't it be possible to reuse the existing logic for log files for container logs? This would require accessing /var/lib/docker/containers, but would solve this problem, right?

Afaik this is also how Promtail approaches this problem.

@vincentbernat
Author

It would work if the Docker daemon is local. It does not work if the daemon is remote or if Vector itself is running in Docker (with access to the socket). If Docker stored the logs in /var/lib/docker/logs, this would be simpler, since it would be easy to share them with Vector.

@simonhammes
Contributor

It does not work if it is remote

You mean when using Swarm? Yeah, this would be a downside.

or if Vector is running in Docker itself (with access to the socket). If Docker were to store the logs in /var/lib/docker/logs, this would have been simpler since it would make it easy to share them with Vector.

Hmm, just mount /var/lib/docker/containers into the Vector container?

@vincentbernat
Author

Hmm, just mount /var/lib/docker/containers into the Vector container?

You get access to far more than the logs. This would be a concern.

Another solution would be to use another provider, like fluent, which is supported by both Docker and Vector, but I don't know whether Docker would buffer the logs while Vector is temporarily unavailable.


Development

Successfully merging this pull request may close these issues.

Support checkpointing and reading old logs with docker_logs source
