Opened 8 years ago

Last modified 5 years ago

#2019 reopened defect

graceful shutdown and triggered builds don't play as expected (0.8.4p1)

Reported by: dberger
Owned by:
Priority: major
Milestone: 0.9.+
Version: 0.8.4p1
Keywords:


We have several builders that act as coordinators for a set of triggered builds (so that binaries across platforms can be submitted in a single changelist).

We noticed that if you request a graceful shutdown after the coordinator has started, but before it triggers the sub-builds, you're stuck: the sub-builders stay pending and the triggering builder never proceeds, so you never shut down, but you also stop doing work.

Attachments (1)

Selection_029.png (28.4 KB) - added by buck 6 years ago.
necessary behavior

Change History (11)

comment:1 Changed 8 years ago by dustin

  • Milestone changed from undecided to 0.8.+

Hm, how would we fix this? It's essentially a deadlock.

I think that the "right" answer is to run the triggered builds anyway, but that's pretty difficult.

comment:2 Changed 8 years ago by dberger

I agree that the right answer seems to be that any build that got in before the shutdown button was pressed should finish, but the "easier" answer may be to have the trigger recognize that it can't trigger builds and fail.
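
A minimal sketch of that "easier" option, for illustration only. The names `Result`, `run_trigger`, `shutting_down` and `start_sub_builds` are hypothetical and are not Buildbot's actual API; the sketch only shows a trigger that fails immediately instead of waiting on builds that can never start.

```python
from enum import Enum
from typing import Callable


class Result(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


def run_trigger(shutting_down: bool,
                start_sub_builds: Callable[[], None]) -> Result:
    """Hypothetical trigger step; not Buildbot's real Trigger class."""
    if shutting_down:
        # The sub-builds would stay pending for as long as the shutdown is
        # in progress, so fail right away. The coordinator build can then
        # finish, which lets the graceful shutdown complete.
        return Result.FAILURE
    start_sub_builds()
    return Result.SUCCESS
```

The trade-off is that the coordinator build is reported as failed even though nothing was wrong with the change itself.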

comment:3 Changed 8 years ago by dustin

I could implement that relatively quickly - maybe for 0.8.5, and then try to do the more correct option later.

comment:4 Changed 7 years ago by dustin

Actually, the fix for this is #1039 -- once that's in place, the triggering should be persistent, and can be picked up after restart.

comment:5 Changed 6 years ago by dustin

  • Resolution set to duplicate
  • Status changed from new to closed

And a triggering build *isn't* finished until its triggered builds are complete, so it shouldn't allow the clean shutdown to finish. So there's nothing to fix in this ticket itself; closing it in favor of #1039.

comment:6 Changed 6 years ago by buck

I've hacked up a solution that (apparently) fixes this issue for 0.8.3:

I wouldn't say this issue is a duplicate, but rather that it is blocked on #1039; even after #1039 is implemented, work will be needed to verify that this issue is in fact fixed.

comment:7 Changed 6 years ago by buck

I'll submit a well-tested patch for this issue versus buildbot master this week, if the maintainers will welcome it. I'm aware that you all would prefer a patch versus the "nine" branch, but I need to patch a branch to which we can actually upgrade; this is a production problem for us.

comment:8 Changed 6 years ago by buck

  • Resolution duplicate deleted
  • Status changed from closed to reopened

I plan to provide a patch for this, independent of the status of #1039.

Changed 6 years ago by buck

necessary behavior

comment:9 Changed 6 years ago by buck

In short, if a graceful-shutdown command is sent to a buildbot that then runs a trigger-and-wait step, the buildbot cluster will cease to do anything useful indefinitely.

The deadlock consists of three blocking parts:

  1. The graceful-shutdown command blocks, waiting for all builders to finish.
  2. The trigger-and-wait step blocks, waiting for its triggered builds to finish.
  3. The triggered build blocks, waiting for the graceful-shutdown to complete before *starting* itself.

My essential strategy is to edit part 3. We'll make sure that builds which have steps blocking on them are allowed to run, even if we're in the process of a graceful shutdown.
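
The change to part 3 can be sketched roughly as follows. This is not the actual patch: `BuildRequest`, `waited_on_by`, `may_start` and `startable` are hypothetical names used only to illustrate the rule that a request may start during a graceful shutdown if, and only if, some running step is blocked on it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BuildRequest:
    """Hypothetical stand-in for a pending build; not Buildbot's real class."""
    name: str
    # Name of the running build whose trigger-and-wait step is blocked on
    # this request, or None if nothing is waiting for it.
    waited_on_by: Optional[str] = None


def may_start(request: BuildRequest, shutting_down: bool) -> bool:
    """Decide whether a pending request is allowed to start right now."""
    if not shutting_down:
        return True
    # During a graceful shutdown, only requests that a running step is
    # blocked on are let through; everything else keeps pending.
    return request.waited_on_by is not None


def startable(requests: List[BuildRequest], shutting_down: bool) -> List[str]:
    """Return the names of the requests that may start, in queue order."""
    return [r.name for r in requests if may_start(r, shutting_down)]
```

With this rule, an ordinary queued build and a triggered build that nobody waits on both stay pending during shutdown, while a triggered build that a coordinator is waiting on runs, so the coordinator can finish and the shutdown can complete.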

This image demonstrates the necessary behavior. When the "clean shutdown" command comes in, queued builds and non-blocking triggered builds pend until after the master restarts, while blocking triggered builds are allowed to run.

[Image: necessary behavior]

This image was generated on buildbot 0.8.3 using this patch:

I plan to provide a similar patch (in spirit if not implementation) and integration tests for buildbot/master branch (0.8.8).

comment:10 Changed 5 years ago by dustin

  • Milestone changed from 0.8.+ to 0.9.+

Ticket retargeted after milestone closed
