Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dump state file atomically not to corrupt it #9451

Merged
merged 3 commits into from
Aug 1, 2022
Merged

Conversation

Al2Klimov
Copy link
Member

@Al2Klimov Al2Klimov commented Jul 21, 2022

by using fsync(2) before close(2) and rename(2).

closes #9446

@cla-bot cla-bot bot added the cla/signed label Jul 21, 2022
@Al2Klimov
Copy link
Member Author

Test

Start strace -f icinga2 daemon -z and send a term signal to the umbrella process.

[pid  5637] write(1, "[2022-07-21 18:53:03 +0200] \33[32"..., 126[2022-07-21 18:53:03 +0200] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
) = 126
[pid  5637] openat(AT_FDCWD, "/var/lib/icinga2", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 12
[pid  5637] fstat(12, {st_mode=S_IFDIR|0750, st_size=4096, ...}) = 0
[pid  5637] getdents64(12, 0x5642e03e7580 /* 5 entries */, 32768) = 160
[pid  5637] getdents64(12, 0x5642e03e7580 /* 0 entries */, 32768) = 0
[pid  5637] close(12)                   = 0
[pid  5637] getpid()                    = 5637
[pid  5637] openat(AT_FDCWD, "/var/lib/icinga2/icinga2.state.tmp.wReKTr", O_RDWR|O_CREAT|O_EXCL, 0600) = 12
[pid  5637] openat(AT_FDCWD, "/var/lib/icinga2/icinga2.state.tmp.wReKTr", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 14
[pid  5637] close(12)                   = 0
[pid  5637] chmod("/var/lib/icinga2/icinga2.state.tmp.wReKTr", 0600) = 0
[pid  5637] write(14, "81:{\"name\":\"ido\",\"type\":\"CheckCo"..., 270) = 270
[pid  5637] fsync(14)                   = 0
[pid  5637] close(14)                   = 0
[pid  5637] rename("/var/lib/icinga2/icinga2.state.tmp.wReKTr", "/var/lib/icinga2/icinga2.state") = 0

Awesome! If this doesn’t help... I don’t know.

@Al2Klimov
Copy link
Member Author

Al2Klimov commented Jul 21, 2022

TODO

  • Build error on Windows
  • Test on Windows
  • Packaging: build deps
  • Re-run GHA

@Al2Klimov Al2Klimov self-assigned this Jul 21, 2022
@Al2Klimov Al2Klimov marked this pull request as draft July 21, 2022 17:05
Copy link
Contributor

@julianbrost julianbrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looks like some parts of Boost.Iostreams aren't header-only in particular file_descriptor: https://www.boost.org/doc/libs/1_79_0/libs/iostreams/doc/installation.html

Sounds like a bit of extra fun then, but okay.

lib/base/utility.hpp Outdated Show resolved Hide resolved
lib/base/atomic-file.hpp Show resolved Hide resolved
lib/base/exception.cpp Show resolved Hide resolved
lib/base/utility.hpp Outdated Show resolved Hide resolved
lib/base/utility.hpp Outdated Show resolved Hide resolved
@Al2Klimov
Copy link
Member Author

My compromise implementation:

@Al2Klimov Al2Klimov marked this pull request as ready for review July 26, 2022 14:50
@Al2Klimov Al2Klimov removed their assignment Jul 26, 2022
@Al2Klimov
Copy link
Member Author

[pid  5067] write(1, "[2022-07-26 16:44:27 +0200] \33[32"..., 126[2022-07-26 16:44:27 +0200] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
) = 126
[pid  5067] openat(AT_FDCWD, "/var/lib/icinga2", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 12
[pid  5067] fstat(12, {st_mode=S_IFDIR|0750, st_size=4096, ...}) = 0
[pid  5067] getdents64(12, 0x5608086ab890 /* 5 entries */, 32768) = 160
[pid  5067] getdents64(12, 0x5608086ab890 /* 0 entries */, 32768) = 0
[pid  5067] close(12)                   = 0
[pid  5067] getpid()                    = 5067
[pid  5067] openat(AT_FDCWD, "/var/lib/icinga2/icinga2.state.tmp.bd3lAk", O_RDWR|O_CREAT|O_EXCL, 0600) = 12
[pid  5067] chmod("/var/lib/icinga2/icinga2.state.tmp.bd3lAk", 0600) = 0
[pid  5067] write(12, "81:{\"name\":\"ido\",\"type\":\"CheckCo"..., 270) = 270
[pid  5067] fsync(12)                   = 0
[pid  5067] close(12)                   = 0
[pid  5067] rename("/var/lib/icinga2/icinga2.state.tmp.bd3lAk", "/var/lib/icinga2/icinga2.state") = 0

@Al2Klimov
Copy link
Member Author

In this variant we keep the legacy broken and CreateTempFile() as-is. But we don’t depend on it in new code.

@Al2Klimov Al2Klimov force-pushed the flush-state-file branch 2 times, most recently from f3e5d56 to 637d81a Compare July 26, 2022 15:15
lib/base/atomic-file.cpp Outdated Show resolved Hide resolved
@Al2Klimov Al2Klimov force-pushed the flush-state-file branch 2 times, most recently from 001cbbb to 9299b41 Compare July 28, 2022 11:13
lib/base/atomic-file.cpp Outdated Show resolved Hide resolved
lib/base/exception.hpp Outdated Show resolved Hide resolved
lib/base/atomic-file.cpp Outdated Show resolved Hide resolved
lib/base/atomic-file.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Verify consistency guarantees of file operations on Windows
2 participants