Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Prevent fork bomb on Windows #1761

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

chawyehsu
Copy link
Contributor

Fix #1741

Before this patch, volta-shim(and its aliases) spawns itself in some cases causing recursive call (fork bomb). This is a critical issue but only occurs on Windows.

chawyehsu added 5 commits May 23, 2024 12:52
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
@charlespierce
Copy link
Contributor

Thanks for tackling this! Can you comment on the specific trigger condition and the mechanics of how it recurses? Without a good understanding of the overall root cause of the recursion, it's difficult to be confident that we're completely resolving the issue.

We're already working around some of the oddities of Command on Windows a bit by calling cmd /C <tool> instead of <tool> directly (see the behavior of the create_command function). I have a hypothesis that the underlying issue is when the Volta directory itself is the CWD for the command (which I think would be the case when you launch it from the file explorer). So even though we remove the Volta directory from the Path, it's still the "local" executable and is first place Windows looks to resolve the name.

If that's the case, we may be able to resolve this in a less intrusive way by detecting the CWD is one of the Volta dirs. I worry a little about the extra copying / allocating needed to hold onto the command, args, and env vars through the process. It's a relatively small impact, but this block is on the hot path for every call to any of the shims, so it's one of the most performance-sensitive areas of the app.

@charlespierce
Copy link
Contributor

One thought, if the above is accurate for how this recursion starts, would be to check on Windows right before executing if the CWD is one of the Volta directories. If it is, we can do the absolute path resolution and then copy the values (since we’ll need to create a new Command at that point regardless). Helpfully, Command itself has methods to get the root command, the arguments, and the environment variables, so we don’t need to copy ahead of time, we can read them from the already-built command to build a new one (see e.g. get_args)

@chawyehsu
Copy link
Contributor Author

chawyehsu commented May 23, 2024

We're already working around some of the oddities of Command on Windows a bit by calling cmd /C <tool> instead of <tool> directly (see the behavior of the create_command function). I have a hypothesis that the underlying issue is when the Volta directory itself is the CWD for the command (which I think would be the case when you launch it from the file explorer). So even though we remove the Volta directory from the Path, it's still the "local" executable and is first place Windows looks to resolve the name.

I see that it had tried to remove the Volta directory from the Path via System::path(), but that's not enough. The udnerlying CreateProcessW searches hard-coded locations before the PATH env:

If the file name does not contain a directory path, the system searches for the executable file in the following sequence:

  1. The directory from which the application loaded.
  2. The current directory for the parent process.
  3. The 32-bit Windows system directory. Use the GetSystemDirectory function to get the path of this directory.
  4. The 16-bit Windows system directory. There is no function that obtains the path of this directory, but it is searched. The name of this directory is System.
  5. The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
  6. The directories that are listed in the PATH environment variable. Note that this function does not search the per-application path specified by the App Paths registry key. To include this per-application path in the search sequence, use the ShellExecute function.

That's the root cause.

we may be able to resolve this in a less intrusive way by detecting the CWD is one of the Volta dirs.

I get your thought here but it does not really help imo by just detecting CWD. It should only find the right executable by only checking from the PATH environment variable we give. (and this is why which crate was introduced)

Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
@chawyehsu
Copy link
Contributor Author

Btw, I don't quite understand why the infinite recursive call is allowed. I believe there should be a limit of max recursive depth, say 10, and once it exceeds it should halt. This should help to resolve from another perspective.

@chawyehsu chawyehsu marked this pull request as draft May 24, 2024 03:25
chawyehsu added 4 commits May 24, 2024 14:22
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
@chawyehsu chawyehsu marked this pull request as ready for review May 24, 2024 08:07
@chawyehsu chawyehsu changed the title fix: fix fork bomb on Windows fix: Prevent fork bomb on Windows May 24, 2024
Copy link
Contributor

@chriskrycho chriskrycho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s a first pass review with some changes requested. @charlespierce may have some more insight on the Windows-specific bits!

crates/volta-core/src/run/executor.rs Outdated Show resolved Hide resolved
crates/volta-core/src/run/executor.rs Outdated Show resolved Hide resolved
crates/volta-core/src/run/executor.rs Outdated Show resolved Hide resolved
crates/volta-core/src/command.rs Outdated Show resolved Hide resolved
crates/volta-core/src/command.rs Outdated Show resolved Hide resolved
crates/volta-core/src/error/kind.rs Outdated Show resolved Hide resolved
tests/acceptance/volta_bypass.rs Show resolved Hide resolved
if let Ok(mut paths) = which::which_in_global(name, Some(&path)) {
if let Some(exe) = paths.next() {
let mut new_command = create_command(exe);
let envs = command
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why does this need to filter the environment variables like this, rather than maintaining the same environment variables as the previous version of the command?
  2. Can you add a comment explaining that here? That will help us remember (and other contributors understand)!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on adding a comment!

I believe this needs to filter because the output of Command::get_envs includes None values for variables that were explicitly removed. I don't think we do any explicit removing of environment variables currently, but possibly what we want to do is mirror the same behavior:

for (key, maybe_value) in command.get_envs() {
    match maybe_value {
        Some(value) => command.env(key, value),
        None => command.env_remove(key),
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the output of Command::get_envs includes None values for variables that were explicitly removed.

This was the reason to filter out envs, envs with None values should not be added to the new_command that's built.

Do you think it's better to use

for (key, maybe_value) in command.get_envs() {
    match maybe_value {
        Some(value) => command.env(key, value),
        None => command.env_remove(key),
    }
}

to apply envs to the new_command or just leave the current implementation?

let envs = command
    .get_envs()
    .filter_map(|(k, maybe_v)| Some(k).zip(maybe_v))
    .collect::<Vec<_>>();
new_command.envs(envs);

crates/volta-core/src/command.rs Outdated Show resolved Hide resolved
crates/volta-core/src/command.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@charlespierce charlespierce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for continuing on this! A few additional requests for cleanup.

Separately, and I haven't totally thought this through, but would it be unreasonable to directly error if one of the commands is run from either of the Volta directories? That might simplify the solution a lot, at the expense of making some specific uses errors when they otherwise wouldn't. However, those cases are currently fork bombs, so it would be an improvement even if it isn't a complete fix.

// args_idx 0 1 2..
let name = args.get(1).expect("should have the name");

if let Ok(mut paths) = which::which_in_global(name, Some(&path)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the trigger for this (as I understand it) is the cmd /c implementation searching the CWD for the binary, even when it's not explicitly on the Path, would it be unreasonable to gate this reformulation of the Command on that, rather than doing it every time on Windows?

It's relatively minor, but the which_in_global is going to necessarily do a decent chunk of File I/O for every command run, which seems excessive for something on the hot path.

crates/volta-core/src/command.rs Outdated Show resolved Hide resolved
Comment on lines +1252 to +1255
ErrorKind::RecursionLimit => write!(
f,
"Recursive call limit reached."
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great suggestion on the top of my head, but is it impossible to create a call-to-action for this error? We have generally tried to always have a CTA included with any error message, so that we guide users to how to fix the problems they run into.

tests/acceptance/volta_bypass.rs Show resolved Hide resolved
chawyehsu and others added 10 commits August 1, 2024 22:16
Co-authored-by: Chris Krycho <hello@chriskrycho.com>
Co-authored-by: Chris Krycho <hello@chriskrycho.com>
Co-authored-by: Chris Krycho <hello@chriskrycho.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
Signed-off-by: Chawye Hsu <su+git@chawyehsu.com>
@chawyehsu chawyehsu closed this Aug 4, 2024
@chawyehsu chawyehsu reopened this Aug 4, 2024
@chawyehsu chawyehsu mentioned this pull request Aug 5, 2024
@charlespierce
Copy link
Contributor

Doing some spelunking in our issues (while investigating some other, ostensibly unrelated, Windows issues), and I've come around on this approach. I actually think we might be able to do even better and use the behavior of which to completely skip needing to add cmd.exe /C entirely. It's been ~5 years, but looking back at some of our issues (notably #577), I think the reason we use cmd.exe /C currently is twofold:

  1. To work around issues with Rust's Command implementation around calling .cmd and .bat files without extensions.
  2. To prevent fork bombs like this when calling a package that happens to not exist

Bypassing cmd.exe entirely, and using which directly to resolve the executable would solve both of those problems - which already handles detecting the correct file extension and returning it if it's not found.

Additionally, this would solve the linked issue around error messages, because if we can't locate the executable with which, then we can know (even without needing to try to spawn the process) that it won't work and can show a helpful error message instead if a generic cmd.exe node is not a recognized command... error.

It also (I think) would solve argument parsing issues like those mentioned in #1791, so this may be an altogether cleaner path forward.

Thanks for suggesting and pushing on this @chawyehsu!

@chawyehsu
Copy link
Contributor Author

Thanks for the informative response, would love to see improvement to this PR to make it a general solution to linked issues. @charlespierce

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential fork bomb
3 participants