[9.1](backport #47247) [Filebeat/Filestream] Fix missing last few lines of a file #47749
base: 9.1
Conversation
This pull request is now in conflicts. Could you fix it? 🙏
This pull request does not have a backport label. To fixup this pull request, you need to add the backport labels for the needed branches.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Proposed commit message
Checklist
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- I have added an entry in `./changelog/fragments` using the changelog tool.

Disruptive User Impact

Author's Checklist
How to test this PR locally
Run the tests
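The exact test targets are an assumption, not taken from the original PR; as a sketch, the filestream unit tests can be run from the repository root with the race detector enabled:

```sh
go test -race ./filebeat/input/filestream/...
```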
Manual test
Testing this fix manually is possible, but requires you to monitor the
logs and add data to the file being ingested at a specific time.
At a very high level, the steps are:
1. Start Filebeat ingesting a file.
2. Wait for the file to become inactive so its harvester starts to close.
3. Append data to the file while the harvester is closing.
4. Wait for the harvester to fully stop.

If you run this test without the fix from this PR, after step 4 Filestream will not try to start any more harvesters for the file, effectively missing the last few lines.
The best way to manually test this PR is to have two terminals open,
one running Filebeat and another ready to append data to the file
Filebeat is ingesting.
Create a file with at least 1 KB of data and write down its size:

```sh
flog -n 20 > /tmp/flog.log
wc -c /tmp/flog.log
```

Start Filebeat with the following config:
filebeat.yml
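A minimal sketch of a `filebeat.yml` that fits the steps below, assuming a single filestream input on `/tmp/flog.log` and a file output; the input ID, timeouts, and output paths are illustrative, not the author's original values:

```yaml
filebeat.inputs:
  - type: filestream
    id: flog-test                          # assumed input ID
    paths:
      - /tmp/flog.log
    close.on_state_change.inactive: 10s    # close the harvester soon after the file goes inactive
    prospector.scanner.check_interval: 1s  # pick up file updates quickly

output.file:
  path: /tmp/filebeat-output               # assumed output directory
  filename: output                         # yields output*.ndjson files to count later

logging.level: debug                       # the log entries below are debug level
```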
To make the logs easier to read, you can send the logs to stdout
and pipe them through jq:
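A sketch of such a command, assuming Filebeat's default JSON logging and the `-e` flag (log to stderr); the `jq` filter is illustrative:

```sh
./filebeat -c filebeat.yml -e 2>&1 | jq -r '[."log.level", .message] | @tsv'
```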
1. Wait for the log entry: `'/tmp/flog.log' is inactive`
2. Add data to the file: `flog -n 2 >> /tmp/flog.log`
3. Wait for the log entry: `File /tmp/flog.log has been updated`
4. Wait for the log entry: `Harvester already running`
5. Wait for the log entry: `File is inactive. Closing. Path='/tmp/flog.log'`
6. Wait for the log entry: `Stopped harvester for file`
7. Wait for the log entry: `Updating previous state because harvester was closed. '/tmp/flog.log': xxx`, where `xxx` is the original file size.
8. Wait for the log entry: `File /tmp/flog.log has been updated`
9. Wait for the log entry: `Starting harvester for file`
10. Wait for the log entry: `End of file reached: /tmp/flog.log; Backoff now.`
11. Ensure all events have been read: `wc -l output*.ndjson`.
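Given `flog -n 20` for the initial file and `flog -n 2` for the appended data, all 22 events should be present. Assuming the sketch config above (file output under `/tmp/filebeat-output`):

```sh
wc -l /tmp/filebeat-output/output*.ndjson   # expect 22 events in total
```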
Related issues

Use cases

Screenshots

Logs

Benchmarks
Go Benchmark
This is likely not very relevant to the final form of this PR, but I ran some benchmarks comparing the different strategies to prevent the race condition when accessing the `offset` and `lastTimeRead` in the harvester. Below are the results and the code: `filebeat/input/filestream/filestream_test.go`.
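The actual benchmark code is in `filebeat/input/filestream/filestream_test.go` and is not shown here. As a rough sketch of the kind of comparison described, a mutex versus atomics guarding an `offset` and a `lastTimeRead`, with every name below illustrative rather than taken from the harvester:

```go
package bench

import (
	"sync"
	"sync/atomic"
	"testing"
	"time"
)

// mutexState guards offset and lastTimeRead with a mutex.
type mutexState struct {
	mu           sync.Mutex
	offset       int64
	lastTimeRead time.Time
}

func (s *mutexState) update(n int64) {
	s.mu.Lock()
	s.offset += n
	s.lastTimeRead = time.Now()
	s.mu.Unlock()
}

// atomicState stores offset and lastTimeRead as atomics
// (lastTimeRead as Unix nanoseconds).
type atomicState struct {
	offset       atomic.Int64
	lastTimeRead atomic.Int64
}

func (s *atomicState) update(n int64) {
	s.offset.Add(n)
	s.lastTimeRead.Store(time.Now().UnixNano())
}

// BenchmarkMutexState measures concurrent updates through the mutex.
func BenchmarkMutexState(b *testing.B) {
	s := &mutexState{}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			s.update(1)
		}
	})
}

// BenchmarkAtomicState measures concurrent updates through atomics.
func BenchmarkAtomicState(b *testing.B) {
	s := &atomicState{}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			s.update(1)
		}
	})
}
```

Save it as `bench_test.go` in its own module and run `go test -bench=. -benchmem` to compare the two approaches.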
Benchbuilder
Latest release: v9.2.1
```
9.2.1  2m43.075351941s  12264.000000  175.3162
9.2.1  48.46343038s     41269.000000  183.1183
9.2.1  2m47.897040994s  11912.000000  176.5148
9.2.1  4m51.107096736s   6870.000000  178.5985
```

PR version
```
9.3.0  2m41.103916351s  12414.000000  175.3734
9.3.0  47.520195331s    42088.000000  182.5625
9.3.0  2m44.102216849s  12188.000000  175.8389
9.3.0  4m56.598482898s   6743.000000  179.3721
```

This is a manual backport of pull request #47247 done by @belimawr.