Ticket #825 (closed defect: fixed)

Opened 21 months ago

Last modified 11 months ago

Zombie builds stuck in FileDownload

Reported by: catlee
Owned by:
Priority: critical
Milestone: 0.8.+
Version: 0.7.9
Keywords:
Cc: mook...moz+net.buildbot@…

Description

We have slaves running buildbot 0.7.9 attached to a master running 0.7.10. On occasion, a slave will get disconnected from the master while doing a FileDownload step.

When it reconnects to the master, the master notices a duplicate connection, and attempts to disconnect the old slave. It doesn't manage to stop the old build, however, so you can end up with one slave running old builds for weeks at a time until the master is finally restarted.

It is impossible to stop these old builds with Stop Build.

Change History

comment:1 Changed 21 months ago by dustin

Serendipitously, I saw a similar problem here today. Here's what I see in the web interface:

Traceback (most recent call last):
Failure: twisted.spread.pb.PBConnectionLost: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionDone'>: Connection was closed cleanly.
]

For unrelated reasons, our buildslave logs for this no longer exist, so that's all I've got.

comment:2 Changed 21 months ago by dustin

  • Milestone changed from undecided to 0.8.1

comment:3 Changed 21 months ago by dustin

  • Priority changed from minor to critical

We'll need more evidence to track this down, I think.

comment:4 Changed 18 months ago by mook

  • Cc mook...moz+net.buildbot@… added

comment:5 Changed 18 months ago by dustin

  • Milestone changed from 0.8.2 to 0.8.3

Hopefully, if this is a slave-side problem, my slave-side tests will tease it out.

comment:6 Changed 14 months ago by ayust

  • Milestone changed from 0.8.3 to 0.8.+

comment:7 Changed 11 months ago by Dustin J. Mitchell

  • Status changed from new to closed
  • Resolution set to fixed

Allow transfer steps to be interrupted

This also collects .finished and .interrupted into a parent class on the slave side. Fixes #825.

Changeset: c8d1ee63f6789d63a97ef39e62e7dd9d9a912562
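
For readers who have not looked at the changeset, the idea can be sketched roughly as follows. This is not the code from the commit: only the method names finished and interrupted come from the commit message, while the class names, the chunk-at-a-time step() method, and the result strings are hypothetical and chosen for illustration. The point is that a shared parent class records the outcome exactly once, so an interrupted transfer stops moving data instead of lingering as a zombie.

```python
class TransferCommandBase:
    """Hypothetical parent class sharing finished/interrupted handling."""

    def __init__(self):
        self.done = False            # True once an outcome has been recorded
        self.was_interrupted = False
        self.result = None

    def finished(self, result):
        # Record the outcome only once; any later call is ignored.
        if self.done:
            return
        self.done = True
        self.result = result

    def interrupted(self):
        # Called when the build is stopped or the connection is lost.
        if self.done:
            return
        self.was_interrupted = True
        self.finished("interrupted")


class FileDownloadSketch(TransferCommandBase):
    """Hypothetical download step that copies one chunk per step() call."""

    def __init__(self, read_chunk, sink):
        super().__init__()
        self.read_chunk = read_chunk  # callable returning bytes, b"" at EOF
        self.sink = sink              # list collecting the received chunks

    def step(self):
        # Return True while there is more to transfer, False once stopped.
        if self.done:
            return False
        chunk = self.read_chunk()
        if not chunk:
            self.finished("success")
            return False
        self.sink.append(chunk)
        return True
```

In this sketch, calling interrupted() between two step() calls makes every later step() return False without reading more data, and a late finished("success") cannot overwrite the "interrupted" result.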
