Test for zombie processes in crashtracking #3364

Open
wants to merge 83 commits into
base: main


Contributor

@kevingosse kevingosse commented Nov 4, 2024

Motivation

https://datadoghq.atlassian.net/browse/APMLP-289 and https://datadoghq.atlassian.net/browse/APMLP-290

Changes

Adds an injection test validating that crashtracking:

  • does not spawn a child process unless there's a crash
  • does not leave zombie processes behind

To look for child processes, the weblog exposes a /child_pids endpoint that walks the /proc directory and reads each process's parent PID. It doesn't rely on ps because ps is not available in some containers.
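
As a rough illustration of the /proc approach (a minimal Python sketch, not the weblog code from this PR; `parse_ppid`, `child_pids` and the `proc_root` parameter are hypothetical names), the parent PID of each process can be read from the fourth field of /proc/<pid>/stat:

```python
from pathlib import Path
from typing import List


def parse_ppid(stat_text: str) -> int:
    """Extract the parent PID from the contents of /proc/<pid>/stat."""
    # The file looks like "pid (comm) state ppid ...". The command name may
    # contain spaces or parentheses, so split after the last ")".
    fields = stat_text.rsplit(")", 1)[1].split()
    return int(fields[1])  # fields[0] is the state, fields[1] the parent PID


def child_pids(parent_pid: int, proc_root: str = "/proc") -> List[int]:
    """Return the PIDs whose parent is parent_pid, without calling ps."""
    children = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            stat_text = (entry / "stat").read_text()
        except OSError:
            continue  # the process exited while we were scanning
        if parse_ppid(stat_text) == parent_pid:
            children.append(int(entry.name))
    return sorted(children)
```

Under these assumptions, `child_pids(os.getpid())` returning an empty list would correspond to "no child process was spawned".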
To look for zombie processes, the weblog exposes a /fork_and_crash endpoint that spawns a copy of the process (using fork whenever possible, or by manually starting a new process on Java and .NET) and then makes it crash. We need a child process for this because if we crashed the main process, the container would be torn down and we would not be able to observe any zombies.
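
To make "zombie" concrete, here is a hedged Python sketch (Linux only, and not the endpoint implementation from this PR; `process_state`, `zombie_pids` and `fork_and_crash` are illustrative names). A crashed child that nobody has waited on stays in the process table with state `Z` in /proc/<pid>/stat until it is reaped:

```python
import os
import signal
import time
from pathlib import Path
from typing import List


def process_state(stat_text: str) -> str:
    """Return the one-letter state field from the contents of /proc/<pid>/stat."""
    # Split after the last ")" because the command name may contain ")".
    return stat_text.rsplit(")", 1)[1].split()[0]


def zombie_pids(proc_root: str = "/proc") -> List[int]:
    """Return the PIDs of all processes currently in the zombie ("Z") state."""
    zombies = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            stat_text = (entry / "stat").read_text()
        except OSError:
            continue  # the process disappeared while we were scanning
        if process_state(stat_text) == "Z":
            zombies.append(int(entry.name))
    return sorted(zombies)


def fork_and_crash() -> int:
    """Fork a child that dies from SIGSEGV; return its PID without reaping it."""
    pid = os.fork()
    if pid == 0:
        # Child: make sure the default action applies, then crash.
        signal.signal(signal.SIGSEGV, signal.SIG_DFL)
        os.kill(os.getpid(), signal.SIGSEGV)
        os._exit(1)  # only reached if the signal was somehow not delivered
    return pid


if __name__ == "__main__":
    child = fork_and_crash()
    time.sleep(0.5)  # give the child time to die
    # Not reaped yet, so the child is expected to be listed as a zombie.
    print("zombie before wait:", child in zombie_pids())
    os.waitpid(child, 0)  # reaping removes the zombie entry
    print("zombie after wait:", child in zombie_pids())
```

The sketch crashes a forked child rather than the main process for the same reason given above: the parent has to survive in order to inspect /proc afterwards.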

The tests currently fail on Ruby; I think the Ruby tracer under test runs without the crashtracking fix.

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes (if something not related to your task is failing, you can ignore it)
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner. We're working on refining the codeowners file quickly.
    • Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • If PR title starts with [<language>], double-check that only <language> is impacted by the change
  • No system-tests internal is modified. Otherwise, I have the approval from R&P team
  • CI is green, or failing jobs are not related to this change (and you are 100% sure about this statement)
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added (or removed)?

@kevingosse kevingosse changed the title Test for zombie processes Test for zombie processes in crashtracking Nov 5, 2024
@emmettbutler emmettbutler self-requested a review November 6, 2024 15:28
Collaborator

@robertomonteromiguel robertomonteromiguel left a comment


  • This PR is very large; I would have preferred it to be split into several parts, for example by language.
  • There are lint failures.
  • There are failures on GitLab.
  • I propose not creating a new scenario and using the existing “INSTALLER_AUTO_INJECTION” scenario instead, so that this new test case runs together with the other tests. The condition is that the test restores the application after breaking it.

vm_port = virtual_machine.deffault_open_port
warmup_weblog(f"http://{vm_ip}:{vm_port}/")

def get_child_pids(self, virtual_machine) -> str:
Collaborator


Why not add these methods to the weblog interface?

Contributor Author


Done

@@ -651,6 +651,14 @@ def all_endtoend_scenarios(test_object):
github_workflow="libinjection",
)

container_auto_injection_install_script_crashtracking = InstallerAutoInjectionScenario(
Collaborator


I advocate using the existing "INSTALLER_AUTO_INJECTION" scenario, and restoring the app after breaking it.

Contributor Author


Done


2 participants