Ingester behavior when disk is full #5589
@fulmicoton How did you identify that the main problem comes from records accumulating in the OS buffer? I thought the OS buffer would usually be quite small (a few MBs). It seems to me that the problem might also come from the persist policy that is configured on

EDIT: I tried to mimic the WAL disk being full using a small loop device mounted on
The error I get (and I get it consistently) when the disk is full is:
Just a hypothesis to explain how we could accept messages and eventually lose them.
Plausible, yes, but we still need to know by which mechanism we sometimes end up accepting writes. Mezmo mentions they lost data.
Currently, an ingester may end up accepting persist requests when its disk is full.
If the OS buffer is not full, no error may be returned.
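A minimal sketch of the failure mode being hypothesized (not Quickwit's actual WAL code; the path is made up): on some filesystems, a write can land in the OS page cache and return `Ok` even though the volume is full, with the "No space left on device" error only surfacing once the data is fsynced.

```rust
use std::fs::OpenOptions;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // Hypothetical WAL file on a nearly-full volume (e.g. the small
    // loop device mentioned above).
    let mut wal_file = OpenOptions::new()
        .create(true)
        .append(true)
        .open("/mnt/tiny_wal/wal-00001")?;

    // Depending on the filesystem, this can succeed even when the disk
    // is full: the kernel may only copy the bytes into the page cache.
    wal_file.write_all(&[0u8; 4096])?;

    // The ENOSPC error would then only show up here, when fsync forces
    // the kernel to flush the dirty pages out to the device.
    wal_file.sync_all()?;
    Ok(())
}
```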
We need to periodically poll disk usage and change the behavior of Quickwit when it goes above a threshold.
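A minimal sketch of such a poll-check, assuming a POSIX system: it reads the volume's usage with `statvfs` (via the `nix` crate). The WAL directory path, the 90% threshold, and the 10-second period are illustrative assumptions, not settled values.

```rust
use std::path::Path;
use std::time::Duration;

use nix::sys::statvfs::statvfs; // `nix` crate with the "fs" feature enabled

/// Hypothetical usage threshold; the real value is yet to be decided.
const DISK_USAGE_THRESHOLD: f64 = 0.90;

/// Returns the used fraction of the filesystem holding `wal_dir`.
fn disk_usage_ratio(wal_dir: &Path) -> nix::Result<f64> {
    let stat = statvfs(wal_dir)?;
    // `blocks()` is the volume size, `blocks_available()` is what is
    // left for unprivileged users; both are in fragment-size units.
    let total = stat.blocks() as f64;
    let available = stat.blocks_available() as f64;
    Ok(1.0 - available / total)
}

fn main() {
    // Hypothetical WAL location.
    let wal_dir = Path::new("/var/lib/quickwit/wal");
    loop {
        match disk_usage_ratio(wal_dir) {
            Ok(ratio) if ratio > DISK_USAGE_THRESHOLD => {
                // This is where the ingester would switch into the
                // degraded mode discussed below (close shards, reject
                // persists) until space is reclaimed.
                eprintln!("WAL disk usage at {:.0}%, above threshold", ratio * 100.0);
            }
            Ok(_) => {}
            Err(err) => eprintln!("failed to stat WAL volume: {err}"),
        }
        std::thread::sleep(Duration::from_secs(10));
    }
}
```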
The behavior is yet to be decided. The closest existing behavior is probably decommissioning: close all shards and do not accept the creation of new shards. In addition, it might not be possible to run indexing/merge pipelines, which could make the control plane's task really hard.
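A sketch of what that decommissioning-like mode could look like, with entirely hypothetical names (`IngesterStatus`, `enter_disk_full_mode`, ...) rather than Quickwit's actual types: the disk-usage poller flips the ingester into a disk-full mode that closes its shards and rejects persists and new-shard requests.

```rust
#[derive(Clone, Copy, PartialEq)]
enum IngesterStatus {
    Ready,
    /// Disk usage is above the threshold: shards are closed and no new
    /// shards may be opened until space is reclaimed.
    DiskFull,
}

type ShardId = u64;

#[derive(Debug)]
enum IngestError {
    DiskFull,
}

struct Ingester {
    status: IngesterStatus,
    open_shards: Vec<ShardId>,
}

impl Ingester {
    /// Called by the disk-usage poller when the threshold is crossed.
    fn enter_disk_full_mode(&mut self) {
        self.status = IngesterStatus::DiskFull;
        // Closing the shards (rather than just erroring on persists)
        // signals clients and the control plane to route new writes to
        // other ingesters.
        self.open_shards.clear();
    }

    fn open_shard(&mut self, shard_id: ShardId) -> Result<(), IngestError> {
        if self.status == IngesterStatus::DiskFull {
            return Err(IngestError::DiskFull);
        }
        self.open_shards.push(shard_id);
        Ok(())
    }

    fn persist(&mut self, _shard_id: ShardId, _docs: &[u8]) -> Result<(), IngestError> {
        if self.status == IngesterStatus::DiskFull {
            return Err(IngestError::DiskFull);
        }
        // ... append to the WAL, fsync, then acknowledge ...
        Ok(())
    }
}

fn main() {
    let mut ingester = Ingester { status: IngesterStatus::Ready, open_shards: Vec::new() };
    ingester.open_shard(1).unwrap();
    ingester.enter_disk_full_mode();
    assert!(ingester.persist(1, b"doc").is_err());
    assert!(ingester.open_shard(2).is_err());
}
```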