Ticket #751 (reopened enhancement)

Opened 3 years ago

Last modified 3 months ago

Sending SIGTERM before SIGKILL to a remote shell command that has timed out

Reported by: Fabrice
Owned by:
Priority: minor
Milestone: 0.8.+
Version: 0.7.12
Keywords: kill, sprint
Cc:

Description

I have a test step that produces no output for one hour (the test, or one of its subtests, hangs for some reason). My buildbot is configured to time out this step/command after 3600 seconds of inactivity on stdout or stderr. As expected, buildbot then sends SIGKILL(9) to the process and writes in the log:

command timed out: 3600 seconds without output, killing pid <PID>
process killed by signal 9

However, my problem is that SIGKILL(9) cannot be caught or trapped by my test step process running on the slave, so I lose the test logs. Is it possible to make buildbot send a couple of SIGTERM(15) signals before sending SIGKILL(9)?

Change History

comment:1 Changed 3 years ago by dustin

  • Status changed from new to closed
  • Resolution set to wontfix

The buildbot timeout is more of a system-stability thing than an expected-behavior thing, so it goes for the kill immediately. In other words, your testing regime should not depend on this behavior.

You could run your scripts in a wrapper that kills them "gently" after 1h, and bump the buildbot timeout up to 1.5h. You could also probably hack the buildslave to send the SIGTERMs, if you'd like.
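A minimal sketch of such a wrapper, in Python, might look like the following. The function name `run_with_gentle_timeout` and the `soft_timeout` and `grace` parameters are hypothetical names chosen for illustration; nothing here is part of buildbot. It assumes a POSIX slave, where `Popen.terminate()` sends SIGTERM and `Popen.kill()` sends SIGKILL.

```python
import subprocess


def run_with_gentle_timeout(cmd, soft_timeout, grace=10.0):
    """Run cmd; send SIGTERM after soft_timeout seconds, SIGKILL after grace more.

    Returns the child's return code (negative signal number if it was killed).
    """
    proc = subprocess.Popen(cmd)
    try:
        # Normal case: the command finishes before the soft limit.
        return proc.wait(timeout=soft_timeout)
    except subprocess.TimeoutExpired:
        # SIGTERM on POSIX: the child can trap it and flush its logs.
        proc.terminate()
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # SIGKILL on POSIX: cannot be trapped, used only as a last resort.
        proc.kill()
        return proc.wait()
```

The build step would then call something like `run_with_gentle_timeout(["./run_tests.sh"], soft_timeout=3600)` (the script name is a placeholder), with the buildbot timeout raised above one hour as suggested. Note that this soft limit is total wall-clock time, not inactivity on output as with buildbot's own timeout.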

comment:2 Changed 3 years ago by eric@…

I wanted this behavior today.

I'm attempting to debug why our test script hangs randomly on one builder. I guess I'll have to write the wrapper as suggested. It would be nice if buildbot would just send SIGTERM followed by SIGKILL, even in rapid succession. That would allow me to print the stack traces of my multi-threaded Python program at the time of termination instead of having it just die.
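For reference, here is a rough sketch of what such a SIGTERM handler could look like on the test-script side. The function names are hypothetical, and it relies on `sys._current_frames()`, which is a CPython-specific helper. It only helps if something actually delivers SIGTERM before SIGKILL.

```python
import signal
import sys
import threading
import traceback


def dump_all_stacks(out=None):
    """Write the current stack of every Python thread to out (default: stderr)."""
    out = out or sys.stderr
    names = {t.ident: t.name for t in threading.enumerate()}
    for ident, frame in sys._current_frames().items():
        out.write("Thread %s (%s):\n" % (ident, names.get(ident, "unknown")))
        traceback.print_stack(frame, file=out)
        out.write("\n")
    out.flush()


def handle_sigterm(signum, frame):
    # Python runs signal handlers in the main thread; SIGKILL never gets here.
    dump_all_stacks()
    sys.exit(128 + signum)  # conventional "terminated by signal" exit status


def install_sigterm_stack_dump():
    """Call once at program start-up, from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)
```

The test program would call `install_sigterm_stack_dump()` early in its main thread. The dump goes to stderr, so it should end up in the step's log as long as the process is given time to write it before any SIGKILL follows.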

comment:3 Changed 2 years ago by dustin

  • Keywords kill added
  • Status changed from closed to reopened
  • Resolution wontfix deleted

comment:4 Changed 2 years ago by ayust

  • Milestone changed from undecided to 0.8.+

comment:5 Changed 3 months ago by dustin

  • Keywords kill, sprint added; kill removed