Ticket #931 (closed enhancement: duplicate)
SIGKILL doesn't kill children processes
| Reported by: | acanis | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | 0.8.+ |
| Version: | 0.8.0 | Keywords: | kill |
| Cc: | | | |
Description
When Buildbot sends a SIGKILL to the buildslave's processes (either from the web page or after a timeout), the child processes are left orphaned.
I run into this problem when I run long 1h modelsim simulations. If you leave these children around they take hours to finish.
sh,16133 -c runtest\040../dejagnu/*.exp
  └─expect,16134 -- /usr/share/dejagnu/runtest.exp ../dejagnu/jpeg.exp
      ├─make,16185 v
      │   └─sh,16211 -c vsim\040-note\0402009\040-c\040-do\040"run\0407000000000000000ns;\040exit;"\040work.main_tb
      │       └─vish,16212 -- -vsim -note 2009 -c -do run\0407000000000000000ns;\040exit; work.main_tb
      │           ├─vlm,16220 652114806 814696814
      │           │   └─mgls_asynch,16221 -f6,10
      │           └─vsimk,16224 -port 40073 -stdoutfilename /tmp/VSOUTH6KLFK -note 2009 -c -do run\0407000000000000000ns;\040exit; work.main_tb
      └─{expect},16154
Change History
comment:2 Changed 3 years ago by dustin
- Type changed from undecided to enhancement
- Milestone changed from undecided to 0.8.+
This has remained a major, unsolved problem for a long time, because there really is no complete solution. Iterating over children requires some access to the kernel's process table, and even then will miss "daemonized" children (those whose parents have already exited). There's an ordering constraint, too - if Buildbot makes a list of all processes to kill before killing them, then new processes may be spawned in the interim.
The code is in slave/buildslave/runprocess.py, if you want to take a look.
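One common way to sidestep the process-table walk on POSIX systems is to start the build command in its own process group and signal the whole group at once. The sketch below is not taken from runprocess.py; the helper names `spawn_in_own_group` and `kill_group` are made up for illustration, and it assumes Python 3 on a POSIX platform. The group signal is delivered by the kernel in a single call, which avoids the list-then-kill race, but a child that calls `setsid()` or `setpgid()` itself (a "daemonized" child) leaves the group and is still missed.

```python
import os
import signal
import subprocess
import time


def spawn_in_own_group(argv, **popen_kwargs):
    """Start argv as the leader of a new session and process group.

    Hypothetical helper: start_new_session=True makes the child call
    setsid() before exec, so its process group id equals its pid and
    its descendants inherit that group unless they change it themselves.
    """
    return subprocess.Popen(argv, start_new_session=True, **popen_kwargs)


def kill_group(proc, sig=signal.SIGKILL):
    """Send sig to every process still in proc's group, then reap proc."""
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # every process in the group has already exited
    proc.wait()  # reap the direct child so it does not stay a zombie


if __name__ == "__main__":
    # A shell with a long-running background child, standing in for the
    # sh -> expect -> make -> vsim chain shown in the report.
    proc = spawn_in_own_group(["sh", "-c", "sleep 600 & wait"])
    time.sleep(0.5)  # give the shell time to start the grandchild
    kill_group(proc)
    print("shell exit status:", proc.returncode)
```

The design choice here is to let the kernel track membership instead of Buildbot: nothing has to enumerate the tree, so processes spawned between "list" and "kill" are covered as long as they stay in the group.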
Could you loop over all child processes and send each of them SIGKILL?
If you point me to the right place in the code, I can send you a patch.
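For reference, a loop of that kind could look roughly like the following on Linux. This is a sketch under stated assumptions, not Buildbot code: it assumes Python 3 and a mounted `/proc`, and the names `read_ppid`, `descendants`, and `kill_tree` are invented here. It has exactly the limits raised in the change history: it works from a snapshot, so processes forked after the scan are missed, and children that have been reparented (their original parent already exited) no longer appear under the root.

```python
import os
import signal


def read_ppid(pid):
    """Return the parent pid recorded in /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        stat = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so cut at the last ')' before splitting the fields.
    fields = stat[stat.rindex(")") + 2:].split()
    return int(fields[1])  # fields[0] is the state, fields[1] the ppid


def descendants(root_pid):
    """Snapshot the pids whose chain of parents leads back to root_pid."""
    children = {}
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            children.setdefault(read_ppid(int(name)), []).append(int(name))
        except OSError:
            continue  # the process exited while /proc was being scanned
    found, stack = [], [root_pid]
    while stack:
        for child in children.get(stack.pop(), []):
            found.append(child)
            stack.append(child)
    return found


def kill_tree(root_pid, sig=signal.SIGKILL):
    """Signal the snapshot of descendants first, then the root itself."""
    for pid in descendants(root_pid) + [root_pid]:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone
```

A caller would still need to reap its direct child afterwards, for example with `Popen.wait()`, since `kill_tree` only sends signals.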