Add a feature 'dead size' to lsf #477
base: master
Conversation
I haven't looked at the code yet because the CLA isn't signed for you yet. As for the idea, I don't think logstash-forwarder should stop until some external force rotates lsf's own logfile. If rotation is needed, lsf should do that itself.
Oh. @jordansissel, I think I put a confusing message in the commit: s/writing/reading. The 'dead size' parameter lets the user put a threshold on the maximum amount of data to be forwarded for a given log file. So, for example, if my application writes logs to the file /var/log/myapp.log and lsf is configured to ship its contents somewhere, then lsf should stop harvesting the file as soon as it crosses the 'dead size' for that file, but should resume from offset 0 when the file is rotated.
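For context, a per-file setting in the lsf JSON config might look like the sketch below; the "dead size" key name and its size-string value are assumptions based on the commit description, not confirmed syntax from this PR.

```json
{
  "network": {
    "servers": [ "logstash.example.com:5043" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "/var/log/myapp.log" ],
      "dead time": "24h",
      "dead size": "7.5GB"
    }
  ]
}
```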
What is the goal of stopping reading? What consequence is the length of the …
Suppose I have configured my webapp to log at levels INFO and WARN and expect the log file to be about 5GB for one day. But due to some edge case, the webapp starts throwing exceptions and keeps doing so every 15 seconds. This generates a huge log file. If the content of the last, say, 1GB of the log is all the same, it isn't worth continuing to ship it to Elasticsearch: I end up paying for more network bandwidth and storing redundant data. If 'dead size' is available, I can set a threshold of about 7.5GB. In the worst case, if ES doesn't help me debug the error, or the logs are in fact legitimate because of high traffic on my webapp, I can always look at the raw log file and increase the threshold the very next day.
Hmm... what if, instead of a "file length" stopping point, there were some kind of quota based on bytes-over-time rather than file offset?
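To make the bytes-over-time idea concrete, here is a minimal sketch of a rolling byte quota that a harvester loop could consult before shipping each chunk; the type and method names are hypothetical and not part of lsf.

```go
package quota

import "time"

// byteQuota enforces a "no more than limit bytes per window" rule,
// resetting its counter whenever the window elapses.
type byteQuota struct {
	limit       uint64        // e.g. 10 GB
	window      time.Duration // e.g. 24h
	sent        uint64        // bytes shipped in the current window
	windowStart time.Time
}

// allow reports whether another n bytes may be shipped, recording them
// if so. Once the window has elapsed the counter starts over, which is
// what lets reading resume without waiting for a file rotation.
func (q *byteQuota) allow(n uint64) bool {
	now := time.Now()
	if now.Sub(q.windowStart) >= q.window {
		q.windowStart = now
		q.sent = 0
	}
	if q.sent+n > q.limit {
		return false // over quota: skip shipping until the window resets
	}
	q.sent += n
	return true
}
```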
Since we use logrotate every hour on the log files, that didn't strike me for our specific use case. But yes, a quota based on bytes-over-time makes more sense. In that case, the values for 'dead size' would be something like '10GB/24h' or '500MB/1m'. I'll try to implement this and send another pull request. What do you say?
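Splitting a quota value like '10GB/24h' could look roughly like this; parseQuota is a hypothetical name, and converting the size part into a byte count would reuse the size-string helper the commit adds in bytefmt.go (sketched further below).

```go
package quota

import (
	"fmt"
	"strings"
	"time"
)

// parseQuota splits a value such as "10GB/24h" into its size part and
// a window duration. Turning the size part into bytes is left to the
// size-string conversion added in bytefmt.go.
func parseQuota(s string) (size string, window time.Duration, err error) {
	parts := strings.SplitN(s, "/", 2)
	if len(parts) != 2 {
		return "", 0, fmt.Errorf("quota %q: expected SIZE/DURATION", s)
	}
	window, err = time.ParseDuration(parts[1])
	if err != nil {
		return "", 0, fmt.Errorf("quota %q: %v", s, err)
	}
	return parts[0], window, nil
}
```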
Force-pushed from c075549 to 0a10a8b
This feature enables users to specify a dead size for log files. As soon as the size of a log file exceeds the given size, lsf will stop reading logs from that file. It will resume reading only when the log file is rotated or renamed.
Changes:
* config.go: rename deadtime to deadTimeDur to make its usage explicit; introduce DeadSize (string) and deadSizeVal (uint64)
* harvestor.go: apply the dead size check; stop harvesting once it is exceeded
* logstash-forwarder.go: introduce a global map and mutex for the dead-file db
* prospector.go: resume the harvester on dead files only under certain conditions
* bytefmt.go: add a function to convert a size string to bytes
* bytefmt_test.go: add unit test cases for the new conversion function
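The size-string conversion described for bytefmt.go might look roughly like the following; the function name, accepted suffixes, and error handling are assumptions rather than the code from this commit.

```go
package bytefmt

import (
	"fmt"
	"strconv"
	"strings"
)

// Suffixes checked longest-first so "B" does not swallow "GB"; which
// units the real bytefmt.go accepts is an assumption here.
var units = []struct {
	suffix string
	mult   uint64
}{
	{"TB", 1 << 40},
	{"GB", 1 << 30},
	{"MB", 1 << 20},
	{"KB", 1 << 10},
	{"B", 1},
}

// ToBytes converts a size string such as "7.5GB" or "500MB" into a
// byte count, e.g. so a DeadSize config string can be stored as the
// numeric deadSizeVal.
func ToBytes(s string) (uint64, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			num := strings.TrimSuffix(s, u.suffix)
			f, err := strconv.ParseFloat(num, 64)
			if err != nil || f < 0 {
				return 0, fmt.Errorf("invalid size %q", s)
			}
			return uint64(f * float64(u.mult)), nil
		}
	}
	return 0, fmt.Errorf("size %q: missing unit suffix", s)
}
```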