If a Merkleeyes LevelDB file is truncated (e.g. due to power failure or backup-and-restore), Merkleeyes can panic on startup, throwing:
In practice, this means a power failure (or similar event) that hits more than 1/3 of a cluster's nodes can render the whole cluster unusable, at least until you fix the LevelDB recovery code or restore from non-corrupt backups.
I can't see tmlibs, so I'm not exactly sure how this dependency works, but from our conversation in the channel, I think Merkleeyes uses goleveldb for its on-disk storage. There are a few other reports of issues like this: Prometheus hit crash-recovery problems in Fall 2016, Syncthing hit panics around the same time, and there's also a report of disk imaging resulting in "corrupted or incomplete meta file" errors in Spring 2017. goleveldb's maintainer suggests in those threads that syndtr/goleveldb@1996ac2 and syndtr/goleveldb@69e19a4 may help, so it might be worth upgrading or cherry-picking those commits into Merkleeyes' LevelDB as well.
I also suggest developing a test suite to verify specifically whether Merkleeyes recovers correctly from arbitrary truncations of its various LevelDB files.
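To make that suggestion concrete, here is a rough sketch of what such a truncation sweep could look like. It is not Merkleeyes code: `Opener`, `Result`, `TruncationSweep`, and `demo` are names I made up for illustration, and the toy opener in `demo` only stands in for the real open path. In an actual test I'd assume the opener wraps whatever Merkleeyes uses to open its store (presumably goleveldb's `leveldb.OpenFile`) and closes the DB again afterwards. The sketch also assumes the data directory is flat and small enough to hold in memory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Opener is a hypothetical hook that opens the store rooted at dir.
type Opener func(dir string) error

// Result records the outcome of opening the store after one truncation.
type Result struct {
	File     string
	Size     int // length the file was truncated to
	Panicked bool
	Err      error
}

// safeOpen calls open and reports a panic as a failure instead of crashing.
func safeOpen(open Opener, dir string) (panicked bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			panicked, err = true, fmt.Errorf("panic: %v", r)
		}
	}()
	return false, open(dir)
}

// openTruncated writes files to a scratch dir with victim cut to size bytes, then opens it.
func openTruncated(files map[string][]byte, victim string, size int, open Opener) (Result, error) {
	tmp, err := os.MkdirTemp("", "truncation-sweep-")
	if err != nil {
		return Result{}, err
	}
	defer os.RemoveAll(tmp)
	for name, data := range files {
		if name == victim {
			data = data[:size]
		}
		if err := os.WriteFile(filepath.Join(tmp, name), data, 0o644); err != nil {
			return Result{}, err
		}
	}
	panicked, openErr := safeOpen(open, tmp)
	return Result{File: victim, Size: size, Panicked: panicked, Err: openErr}, nil
}

// TruncationSweep cuts each file in srcDir (assumed flat) at every multiple of step below its length.
func TruncationSweep(srcDir string, step int, open Opener) ([]Result, error) {
	if step <= 0 {
		return nil, fmt.Errorf("step must be positive, got %d", step)
	}
	entries, err := os.ReadDir(srcDir)
	if err != nil {
		return nil, err
	}
	files := map[string][]byte{}
	for _, e := range entries {
		if files[e.Name()], err = os.ReadFile(filepath.Join(srcDir, e.Name())); err != nil {
			return nil, err
		}
	}
	var results []Result
	for _, e := range entries { // os.ReadDir returns entries sorted by name
		for size := 0; size < len(files[e.Name()]); size += step {
			r, err := openTruncated(files, e.Name(), size, open)
			if err != nil {
				return nil, err
			}
			results = append(results, r)
		}
	}
	return results, nil
}

// demo sweeps a toy one-file store whose opener errors below 4 bytes and panics at 4 to 7.
func demo() ([]Result, error) {
	dir, err := os.MkdirTemp("", "toy-store-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "toy.log"), []byte("12345678"), 0o644); err != nil {
		return nil, err
	}
	return TruncationSweep(dir, 2, func(d string) error {
		data, _ := os.ReadFile(filepath.Join(d, "toy.log")) // a missing file reads as empty
		switch {
		case len(data) < 4:
			return fmt.Errorf("toy store: file too short (%d bytes)", len(data))
		case len(data) < 8:
			panic("toy store: unhandled short file")
		}
		return nil
	})
}

func main() { fmt.Println(demo()) }
```

A real suite would point `TruncationSweep` at a snapshot of a populated data directory and fail on any `Result` with `Panicked` set, since a clean error on a truncated file is acceptable but a panic on startup is the bug described here. Each truncation works on a fresh copy, so the snapshot itself is never modified.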