Fix zombie process accumulation from git operations in cloud environments #1419

Open: wants to merge 1 commit into base: main

@nsingl00 nsingl00 commented Jul 16, 2025

Problem

Fixes #975 and #1418

Git commands spawn helper processes (git-credential-helper, git-remote-https, ssh, git-upload-pack, git-receive-pack). When the main git process exits before these children complete, they are orphaned and re-parented to PID 1 (or the nearest subreaper), and they linger as zombies if that process never reaps them. This is particularly problematic in cloud/container environments where:

  • The application runs as PID 1 or under minimal init systems
  • Standard init process zombie reaping may be unreliable or slow
  • Resource constraints can cause zombie accumulation over time

Root Cause

In containerized environments, our Jupyter application often becomes PID 1 because startup scripts launch it with exec, which makes it responsible for reaping zombie processes. Unlike the robust init systems (systemd, launchd) found in local environments, minimal container init systems may not reliably reap orphaned git helper processes.
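
The PID 1 condition described above can be checked at startup. This is a minimal sketch, assuming a hypothetical helper name `running_as_init` that is not part of this PR:

```python
import os


def running_as_init(pid=None):
    """Return True when the given PID (default: this process) is PID 1.

    Hypothetical helper for illustration only. When this returns True,
    orphaned processes in the same PID namespace are re-parented to us
    (unless a subreaper sits in between), so we are the one expected
    to reap them.
    """
    if pid is None:
        pid = os.getpid()
    return pid == 1
```

A check like this could be used to log a warning, or to decide whether reaping needs to be enabled at all, when the server is started via exec in a container entrypoint.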

Solution

Added a SIGCHLD signal handler that automatically reaps zombie children of the server process, including orphans re-parented to it when it runs as PID 1. The handler:

  • Uses non-blocking os.waitpid(-1, os.WNOHANG) to reap any zombie children
  • Runs whenever any child process terminates (SIGCHLD signal)
  • Prevents zombie accumulation without affecting normal git operations
  • Logs reaped processes at debug level for monitoring
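
The handler behaviour listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the names `reap_children`, `install_reaper`, and the logger are hypothetical.

```python
import logging
import os
import signal

logger = logging.getLogger(__name__)  # hypothetical logger; the PR may use another


def reap_children(signum=None, frame=None):
    """Reap every child that has already exited, without blocking.

    Returns the list of reaped PIDs so the function can also be called
    directly (outside of signal delivery) and inspected.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return
            # immediately instead of waiting for a child to exit.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: this process has no children left at all.
            break
        if pid == 0:
            # Children exist, but none of them has exited yet.
            break
        logger.debug("Reaped child %d (raw wait status %d)", pid, status)
        reaped.append(pid)
    return reaped


def install_reaper():
    """Register the handler for SIGCHLD; must run in the main thread."""
    signal.signal(signal.SIGCHLD, reap_children)
```

The loop matters because several children can exit before a single SIGCHLD is delivered. One design point to keep in mind: waiting on -1 also collects children that other code (for example a `subprocess.Popen` object) intends to wait on, and that code then no longer sees the real exit status.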

Testing

  • Verified zombie processes are eliminated in cloud environments
  • Confirmed normal git operations continue to work correctly
  • No observed performance impact on git command execution
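
One way the zombie check might be reproduced on Linux is to read the process state from /proc. This is a sketch under assumptions, not the test procedure used for this PR: it is Linux-specific, and the names `process_state`, `wait_for_state`, and `zombie_lifecycle_demo` are hypothetical.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may contain spaces or
    # parentheses, so split on the last ")" before reading the state.
    return data.rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the process reaches the wanted state or the timeout ends."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False


def zombie_lifecycle_demo():
    """Fork a child that exits at once; observe it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup handlers
    became_zombie = wait_for_state(pid, "Z")  # unreaped child shows state Z
    os.waitpid(pid, 0)                        # reap it
    gone = wait_for_state(pid, None)          # /proc entry disappears
    return became_zombie, gone
```

The same `process_state` check could be applied to git helper PIDs in a container, before and after the handler is installed.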

Binder 👈 Launch a Binder on branch nsingl00/jupyterlab-git/subprocess

@nsingl00
Author

Hey @krassowski can you help review the PR?

@ellisonbg
Contributor

Thanks for working on this @nsingl00 - wonder if this is related to #975

@nsingl00 changed the title from "Fix git subprocess termination issues to prevent process accumulation" to "Fix zombie process accumulation from git operations in cloud environments" on Jul 24, 2025
  Git commands spawn helper processes (git-credential-helper, git-remote-https, ssh)
  that become zombies when the main git process exits before children complete. This
  is problematic in cloud/container environments where the application runs as PID 1
  or under minimal init systems that don't reliably reap orphaned processes.

  Added SIGCHLD signal handler to automatically reap zombie processes system-wide
  using non-blocking waitpid(), preventing resource leaks without affecting normal
  git operations.
@nsingl00
Author

> Thanks for working on this @nsingl00 - wonder if this is related to #975

Yes, it is related. In enterprise deployments we run Jupyter in containers where PID 1 is not a reaping init process such as tini, so zombie process cleanup doesn't happen. Adding a SIGCHLD handler to reap the child processes works in those scenarios, and this PR covers those cases.

3 participants