
Conversation

@nehaljwani

This feature enables users to specify a dead size for log files.
As soon as the size of a log file exceeds the given size, lsf will
stop reading logs from the file. It will resume reading only when
the log file is rotated or renamed.

Changes:

  • config.go:
    Rename deadtime to deadTimeDur to explicitly indicate its usage
    Introduce DeadSize (string) and deadSizeVal (uint64)
  • harvestor.go:
    Apply check for dead size. Stop harvesting once exceeded.
  • logstash-forwarder.go:
    Introduce a global map and mutex for the dead-file db.
  • prospector.go:
    Resume the harvester on dead files only under certain conditions.
  • bytefmt.go:
    Add function to convert a size string to bytes (sketched below)
  • bytefmt_test.go:
    Add unit test cases for the new conversion function
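
For reference, a minimal sketch of what the size-string conversion could look like; the PR's actual bytefmt.go is not shown in this thread, so the function name, the accepted suffixes, and the 1024-based multipliers are all assumptions:

```go
package bytefmt

import (
	"fmt"
	"strconv"
	"strings"
)

// ToBytes converts a human-readable size string such as "5GB" or
// "512K" into a byte count. Bare numbers with no suffix are rejected
// in this sketch; that is a design assumption, not necessarily what
// the PR does.
func ToBytes(s string) (uint64, error) {
	s = strings.TrimSpace(strings.ToUpper(s))

	// Longer suffixes must be checked before their one-letter forms
	// ("KB" before "K" and "B") so the match is unambiguous.
	multipliers := []struct {
		suffix string
		factor uint64
	}{
		{"TB", 1 << 40}, {"GB", 1 << 30}, {"MB", 1 << 20}, {"KB", 1 << 10},
		{"T", 1 << 40}, {"G", 1 << 30}, {"M", 1 << 20}, {"K", 1 << 10},
		{"B", 1},
	}

	for _, m := range multipliers {
		if strings.HasSuffix(s, m.suffix) {
			n, err := strconv.ParseFloat(strings.TrimSpace(strings.TrimSuffix(s, m.suffix)), 64)
			if err != nil {
				return 0, fmt.Errorf("invalid size %q: %v", s, err)
			}
			return uint64(n * float64(m.factor)), nil
		}
	}
	return 0, fmt.Errorf("unrecognized size suffix in %q", s)
}
```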

@jordansissel
Contributor

I haven't looked at the code yet because you haven't signed the CLA.

As for the idea, I don't think logstash-forwarder should stop until some external force rotates lsf's own logfile. If rotation is needed, lsf should do that itself.

@nehaljwani
Author

Oh. @jordansissel, I think I put a confusing message in the commit: s/writing/reading. The 'dead size' parameter lets the user set a threshold on the maximum amount of data to be forwarded for a given log file. For example, if my application writes logs to /var/log/myapp.log and lsf is configured to ship its contents somewhere, then lsf should stop harvesting the file as soon as it crosses the 'dead size' for that file, but should resume from offset 0 when the file is rotated.
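
To make the intended behavior concrete, here is a minimal sketch of the harvester-side check, assuming hypothetical type, field, and method names; the PR's actual harvestor.go is not shown in this thread and may differ:

```go
package main

import "fmt"

// Harvester is a stand-in for lsf's per-file harvester state; the
// field names below are assumptions based on the PR description.
type Harvester struct {
	path        string
	deadSizeVal uint64 // byte threshold parsed from the DeadSize config string; 0 = no limit
}

// pastDeadSize reports whether harvesting should stop: once the read
// offset crosses the configured threshold, the file is treated as dead
// until it is rotated or renamed, at which point the prospector starts
// a fresh harvester from offset 0.
func (h *Harvester) pastDeadSize(offset uint64) bool {
	return h.deadSizeVal != 0 && offset >= h.deadSizeVal
}

func main() {
	h := &Harvester{path: "/var/log/myapp.log", deadSizeVal: 5 << 30} // 5GB
	fmt.Println(h.pastDeadSize(6 << 30)) // true: past the threshold, stop harvesting
}
```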

@jordansissel
Contributor

What is the goal of stopping reading? What consequence does the length of the file being read have?

@nehaljwani
Author

Suppose I have configured my webapp to log at levels INFO and WARN and expect the log file to grow to about 5GB in one day. But due to some edge case, the webapp starts throwing exceptions and keeps doing so every 15 seconds, generating a very large log file. If the content of the log has been much the same for the past, say, 1GB, it isn't worth continuing to ship the same content to Elasticsearch: I end up paying for more network bandwidth and storing redundant data. If 'dead size' is available, I can set a max threshold of about 7.5GB. In the worst case, if ES doesn't help me debug the error, or the logs are in fact legit because of high traffic on my webapp, I can always look at the raw log file and increase the threshold the very next day.
This feature becomes more helpful when I have different webapps running on different servers, all forwarding to a single ES cluster. In such a scenario, if one of them hogs bandwidth, I lose the legit logs generated by the other webapps.
It's rare, but another scenario is that the webapp logs too much unnecessary stuff, or is too verbose, and this somehow escaped notice during code review.

@jordansissel
Contributor

Hmm.. What about instead of a "file length" stopping point, there's some kind of quota based on bytes-over-time, not file offset?

@nehaljwani
Author

Since we run logrotate on the log files every hour, that didn't occur to me for our specific use case. But yes, a quota based on bytes-over-time makes more sense. In that case, the values for 'dead size' would be something like '10GB/24h' or '500MB/1m'. I'll try to implement this and send another pull request. What do you say?
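
As a rough illustration of what parsing such a quota string could look like, here is a sketch in Go. Note that this format was only proposed in the thread and never implemented: Quota and parseQuota are hypothetical names, and ToBytes is assumed to be the size parser sketched earlier, living in the same package.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Quota is a hypothetical bytes-over-time limit such as "10GB/24h".
type Quota struct {
	Bytes  uint64        // maximum bytes to forward per window
	Window time.Duration // the window the byte budget applies to
}

// parseQuota splits "SIZE/DURATION" and delegates to ToBytes (the
// size parser sketched earlier) and time.ParseDuration.
func parseQuota(s string) (Quota, error) {
	parts := strings.SplitN(s, "/", 2)
	if len(parts) != 2 {
		return Quota{}, fmt.Errorf("quota %q must look like SIZE/DURATION", s)
	}
	b, err := ToBytes(parts[0])
	if err != nil {
		return Quota{}, err
	}
	d, err := time.ParseDuration(parts[1]) // accepts "24h", "1m", ...
	if err != nil {
		return Quota{}, err
	}
	return Quota{Bytes: b, Window: d}, nil
}

func main() {
	q, err := parseQuota("10GB/24h")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes per %s\n", q.Bytes, q.Window)
}
```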

@nehaljwani force-pushed the dead-size branch 4 times, most recently from c075549 to 0a10a8b on July 8, 2015 at 17:10
This feature enables users to specify a dead size for log files.
As soon as the size of a log file exceeds the given size, lsf will
stop harvesting logs from that file. It will resume harvesting only
when the log file is rotated or renamed.

Changes:
* config.go:
  Rename deadtime to deadTimeDur to explicitly indicate its usage
  Introduce DeadSize (string) and deadSizeVal (uint64)
* harvestor.go:
  Apply check for dead size. Stop harvesting once exceeded.
* logstash-forwarder.go:
  Introduce a global map and mutex for the dead-file db.
* prospector.go:
  Resume the harvester on dead files only under certain conditions.
* bytefmt.go:
  Add function to convert a size string to bytes
* bytefmt_test.go:
  Add unit test cases for the new conversion function