Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't call file.length() for every append if we can avoid it #790

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

axiak
Copy link

@axiak axiak commented Mar 14, 2024

This change introduces a way for the ResiliantFileOutputStream to track its own position in the file. This means that instead of calling File.length() (which ends up being a syscall) for each event, we only call it when we're recovering a stream.

This change introduces a way for the ResiliantFileOutputStream
to track its own position in the file. This means that instead of
calling File.length() (which ends up being a syscall) for each
event, we only call it when we're recovering a stream.
@ceki
Copy link
Member

ceki commented Mar 16, 2024

@axiak Hi Michael,

Thank you for this contribution. When you say that File.length() is called for each event, on which execution path do you think this occurs?


private final OutputStream delegate;

private long count;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be volatile. While there is external synchronization using various locks when calling write(), the new state of the variable may not be reflected immediately in other threads after this thread updates it. The increment operations += and ++ are actually each 2 operations, a read and then a write, this means the count may be incorrectly updated by another thread afterwards, as the value it reads could be stale.


spy.getChannel().close();
spy.write("b".getBytes());
spy.flush();
// we have 2 in our countingoutput stream
// but the 'b' write failed due to the channel closing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means the count can become inaccurate if any IOException occurs, I think it would be best to mitigate that by either only updating the count after the operation succeeds, or alternatively, if we still want to eagerly increment the count we should catch any exceptions, decrement the count, then rethrow the exception again. This would ensure the count remains accurate.

Copy link

@griffinjm griffinjm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally clicked "start review", see my previous comments regarding the count atomicity and error handling.

@griffinjm
Copy link

@axiak Hi Michael,

Thank you for this contribution. When you say that File.length() is called for each event, on which execution path do you think this occurs?

From my read through the file length check occurs only when the rollover check interval is passed. For all standard TriggeringPolicy implementations it is not on every append.

This would be an optimization when checking for rollover if someone has a low rollover check interval. Depending on the triggering policy this can vary, for SizeBased it is by default 60 seconds in DefaultInvocationGate, but configurable, TimeBased are derived from the filename pattern provided. If someone has a filename date pattern which causes the policy to use a very low periodicity, e.g. MILLISECONDS or SECONDS, or someone configures the InvocationGate for SizeBased with a very low value.

In these limited cases this could be an improvement, in exchange for extra state and complexity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants