-
-
Notifications
You must be signed in to change notification settings - Fork 148
electron-builder: Fix race condition when preparing to copy binaries #1137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
electron-builder: Fix race condition when preparing to copy binaries #1137
Conversation
There was a race condition where two separate pieces of async code would A) create the 'binaries' folder, and B) copy binaries to that folder. The problem was the lack of a guarantee they would happen in that order! If the attempt to copy files happened before the directory got created, the whole electron-builder run would error and/or stall out from unsuccessful auto-retries failing. To fix the situation: Ensure the 'binaries' dir exists *before* copying binaries to it. Do so by making the directory *first* with a *synchronous* function. (I might usually try to eliminate race conditions while still using asynchronous code, but for simple scripts, I prefer to use synchronous code, for simplicity's sake and for reliability.) In other words: Can't copy files to a destination dir that doesn't exist! This makes sure the directory exists *first*, then copy. Also: Log any error we hit while making the dir, but try to continue. (For anyone who has successfully run electorn-builder once in the repo, they already have a 'binaries' folder, so we can ignore any error when trying to "make the dir" that already exists. If it doesn't exist, the copy operation will print another relevant error message. So we don't need to error early here. But logging it should help if this issue needs any troubleshooting on anyone's machines (including CI) any time in the future.)
try { | ||
mkdirSync('binaries', {recursive: true}) | ||
} catch (err) { | ||
console.warn("Warning: error encountered when making the 'binaries' dir.") | ||
console.warn("(HINT: If the 'binaries' folder already exists, then this error message is probably fine to ignore!)") | ||
console.warn(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the try/catch, could we use existsSync
to help us decide whether the binaries
directory needs to be created in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mkdirSync
is already set to recursive: true
which doesn't complain if the dir already exists.
I'm anticipating some other exotic error condition might be possible, at which point I'd want to catch it, log it, and keep going.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, gotcha. In that case it strikes me as a little bit belt-and-suspenders, but I suppose there's nothing wrong with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I wasn't sure if the try/catch is overkill. Optimistically, it shouldn't be needed. But pessimistically, maybe it might?
(And try/catch makes the code more verbose to read or skim, but IMO its "performance cost" is a non-issue for how infrequent a code path this is to hit. Compiling code is many orders of magnitude slower than running a try/catch.)
I think I was motivated to do it this way because the output can be extremely vague if not explicitly logging things. For example the output for this very PR:
[ . . . ]
• building target=DMG arch=x64 file=dist/Pulsar-1.122.2024111920.dmg
• building block map blockMapFile=dist/Pulsar-1.122.2024111920-mac.zip.blockmap
• Above command failed, retrying 5 more times
• Above command failed, retrying 4 more times
• Above command failed, retrying 3 more times
Error: Timeout of 2700000ms hit
https://github.com/pulsar-edit/pulsar/actions/runs/11921198561/job/33224800808?pr=1137#step:18:43
Hard to guess what is going wrong here. (I did confirm in local testing that console.log()
, console.warn()
, and console.error()
will print what you try and print.)
Dang... while this solves an issue for me locally, it's not solving the particular issue for macOS CI. Still a bit of a mystery to me. |
Thanks for the review. Will merge soon. I'm investigating the CI failure still... I know the blockmaps are made with 7-zip, at least on macOS x64. ( |
Problem
There was a race condition where two separate pieces of async code would A) create the 'binaries' folder, and B) copy binaries to that folder. The problem was the lack of a guarantee they would happen in that order! If the attempt to copy files happened before the directory got created, the whole electron-builder run would error and/or stall out from unsuccessful auto-retries failing.
Stakes
This issue blocks the x64 macOS CI "build Pulsar binaries" job from producing any binaries.
(Optional commentary: For unknown reasons, macOS CI and my local machine seem to be adversely affected by the race condition, whereas at least CI Linux and Windows x64, and possibly both Cirrus (ARM) CI tasks, do not appear affected, again for unknown reasons. But as race conditions are unpredictable and might depend on the minutiae of the system the code runs on, this arguably isn't all that surprising...)
Solution
To fix the situation:
Ensure the 'binaries' dir exists before copying binaries to it.
Do so by making the directory first with a synchronous function.
(I might usually try to eliminate race conditions while still using asynchronous code, but for simple scripts, I prefer to use synchronous code, for simplicity's sake and for reliability.)
In other words: Can't copy files to a destination dir that doesn't exist! This makes sure the directory exists first, then copy.
Also: Log any error we hit while making the dir, but try to continue. (For anyone who has successfully run electorn-builder once in the repo, they already have a 'binaries' folder, so we can ignore any error when trying to "make the dir" that already exists. If it doesn't exist, the copy operation will print another relevant error message. So we don't need to error early here. But logging it should help if this issue needs any troubleshooting on anyone's machines (including CI) any time in the future.)
Verification Process
Tested extensively on my local machine (macOS 10.15.7). I would like to see a passing "build Pulsar binaries" CI run as well.
Looked up some info to refresh my memory on promise/async execution and the Node
fs
andfs/promises
APIs. Decided to lean into synchronous code for this, but the code seems logically like it should reliably work now, in the intended order. At least to me, review welcome.Long writeup, simple and sweet diff!
Review would be much appreciated!
(Sidebar: This is another one of those PRs with a simple fix that I wanted to write up in detail so we can understand the issue quickly if it comes up again. But also because I like going a bit too in-depth fixing and writing up these things, makes me feel like a detective. It's therapeutic. Anyhow, sorry for the delay in getting this out. But I think it's rock solid now! Hopefully!)