Ticket #1792 (new enhancement)

Opened 2 years ago

Last modified 5 months ago

BuildStep timeout detection does not kill child processes

Reported by: cortana
Owned by:
Priority: major
Milestone: 0.8.+
Version: 0.8.2
Keywords: kill
Cc:

Description

I have noticed my buildslave machine becoming overloaded several times recently. I believe this is caused by the following sequence of events:

  1. 'make check' is run as part of a build
  2. buildbot sends SIGKILL to the build process because it takes too long
  3. only the top-level process is killed: child processes are not killed, so the test suite continues to run!
  4. buildbot kicks off another build...

The result is 8-9 copies of the test suite from improperly killed-off builds hanging around, until I SSH in and kill all buildslave processes by hand.

Possible solutions:

  • when killing a BuildStep, send it SIGINT instead of SIGKILL. In my case, this would have allowed make to kill off all child processes properly, as if I had hit Ctrl+C in a terminal.
  • to guard against buggy build systems, however, you probably want to send a SIGINT, then wait 10 seconds, then send a SIGKILL to the buildstep *and all its child processes*. This could be done either by hand, or by using POSIX sessions or process groups.
  • I believe that in modern Linux kernels, the same can be achieved with 'cgroups'. Each build would go into its own cgroup, and then the buildslave can kill all processes in a cgroup at once.
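
To illustrate the second option, here is a minimal sketch of the "SIGINT, wait, then SIGKILL the whole group" idea using POSIX process groups. It is not buildbot code and not how the buildslave currently runs commands; the helper names start_build_step and stop_build_step are hypothetical, and the sketch assumes a POSIX system and Python 3's subprocess module.

```python
import os
import signal
import subprocess


def start_build_step(command, **popen_kwargs):
    """Start a command as the leader of a new session (hypothetical helper).

    With start_new_session=True the child calls setsid(), so its process
    group ID equals its PID and its descendants stay in that group unless
    they deliberately leave it.
    """
    return subprocess.Popen(command, start_new_session=True, **popen_kwargs)


def stop_build_step(proc, grace_period=10.0):
    """Send SIGINT to the whole group, wait, then SIGKILL the whole group.

    Returns the exit status of the top-level process.
    """
    pgid = proc.pid  # valid only because the child was started via setsid()

    # Polite request first, delivered to every process in the group.
    try:
        os.killpg(pgid, signal.SIGINT)
    except ProcessLookupError:
        pass  # the group is already gone

    # Give the top-level process a chance to clean up on its own.
    try:
        proc.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        pass

    # Sweep up anything left, including children that ignored SIGINT.
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # nothing left in the group

    return proc.wait()
```

For example, stop_build_step(start_build_step(["make", "check"])) would signal make and everything it spawned. A child that moves itself into a different session or process group would escape this, which is one reason the cgroups approach may be more robust on Linux.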

Workaround: increase the 'timeout' property of the 'make check' BuildStep.

Change History

comment:1 Changed 2 years ago by dustin

  • Keywords kill added
  • Type changed from defect to enhancement
  • Milestone changed from undecided to 0.8.+

Yes, in general, killing is very difficult to get right, particularly across platforms. It's not very configurable right now, and that should be improved.

comment:2 Changed 7 months ago by tom.prince

  • Milestone changed from 0.8.+ to 0.8.8

comment:3 Changed 5 months ago by tom.prince

  • Milestone changed from 0.8.8 to 0.8.+