Ticket #118 (closed defect: fixed)

Opened 6 years ago

Last modified 6 years ago

Shutting down a buildmaster with a builder that's waiting for a slavelock corrupts the builder (css_classes[None])

Reported by: Ukyo81 Owned by: warner
Priority: major Milestone: 0.7.7
Version: 0.7.6 Keywords:
Cc: aaron_hsieh@…

Description

Conditions:

A buildmaster must be configured to allow for slavelocks with 2 builders configured to use the same slave and slavelock. Both builds should take a decent amount of time so that the buildmaster can be shutdown/restarted in time to reproduce this bug.

Steps to reproduce:

  1. Force a build on both builders so that one builder should be waiting for the slaveLock in order to start building.
  1. On the box hosting the buildmaster, invoke a 'buildbot restart <path to buildmaster directory>'.
  1. When the buildmaster comes back up, in a browser, navigate to the status page of the builder that was waiting for the slaveLock.

Results:

The webStatus page generates an error as follows:

/usr/local/lib/python2.4/site-packages/buildbot/status/web/base.py, line 316 in make_line
314      rev = "version is too-long"
315    root = self.path_to_root(req)
316    values = {'class': css_classes[results],
317         'builder_name': builder_name,
Locals
self	
buildbot.status.web.builder.StatusResourceBuilder instance @ -0x49dd2734 <buildbot.status.web.builder.StatusResourceBuilder instance at 0xb622d8cc>
req	
twisted.web.server.Request instance @ -0x494d2e74 <GET /builders/Sys HTTP/1.1>
rev	'??'
results	None
builder_name	'Sys'
root	'../'
Globals
path_to_root	
function path_to_root in file /usr/local/lib/python2.4/site-packages/buildbot/status/web/base.py at line 112
css_classes	
Dictionary instance @ -0x494fe9fc
0	'success'
1	'warnings'
2	'failure'
4	'exception'

Change History

comment:1 Changed 6 years ago by Ukyo81

Also, this problem can be fixed by going into the builder's log directory, finding the build number that has no logs to go with it (i.e., build number 69 has no stdio files in the directory to go along with it), and deleting the file that corresponds to the interrupted build.

comment:2 Changed 6 years ago by warner

  • Owner set to warner
  • Status changed from new to assigned
  • Milestone changed from undecided to 0.7.7

Ah, it looks like the 'results' was equal to 'None', since the build never finished, and that the web code is unable to handle that.

As a quick patch, try changing the line in buildbot/status/web/base.py which reads:

values = {'class': css_classes[results],

to instead say:

values = {'class': css_classes.get(results, "")],

I think that should clear up the problem, and it probably the best long-term solution. Let me know if that works, and I'll commit it to trunk.

comment:3 Changed 6 years ago by Ukyo81

That patch didn't seem to work at all. I'm still getting the same behavior from this bug.

comment:4 Changed 6 years ago by Ukyo81

  • Cc aaron_hsieh@… added

comment:5 Changed 6 years ago by warner

  • Summary changed from Shutting down a buildmaster with a builder that's waiting for a slavelock corrupts the builder to Shutting down a buildmaster with a builder that's waiting for a slavelock corrupts the builder (css_classes[None])

comment:6 Changed 6 years ago by gward

I just hit this bug too. My cheezy patch, which seems to work, is:

--- base.py.orig 2007-09-30 22:48:39.000000000 -0400 +++ base.py 2007-10-23 16:24:04.712556089 -0400 @@ -301,7 +301,7 @@

def make_line(self, req, build, include_builder=True):

builder_name = build.getBuilder().getName()

  • results = build.getResults()

+ results = build.getResults() or SUCCESS # is getResults() allowed to return None???

text = build.getText() try:

rev = build.getProperty("got_revision")

I'll try Brian's patch now.

comment:7 Changed 6 years ago by gward

Argh. Silly software mangled my patch. Here it is again, I hope:

--- base.py.orig        2007-09-30 22:48:39.000000000 -0400
+++ base.py     2007-10-23 16:24:04.712556089 -0400
@@ -301,7 +301,7 @@

     def make_line(self, req, build, include_builder=True):
         builder_name = build.getBuilder().getName()
-        results = build.getResults()
+        results = build.getResults() or SUCCESS # is getResults() allowed to return None???
         text = build.getText()
         try:
             rev = build.getProperty("got_revision")

comment:8 Changed 6 years ago by gward

Brian, your patch didn't work because there's another css_classes[result] a few lines later. Here's a patch that does the trick for me:

--- buildbot/status/web/base.py.orig    2007-09-30 22:48:39.000000000 -0400
+++ buildbot/status/web/base.py 2007-10-23 16:51:59.257816209 -0400
@@ -313,10 +313,11 @@
         if len(rev) > 20:
             rev = "version is too-long"
         root = self.path_to_root(req)
-        values = {'class': css_classes[results],
+        css_class = css_classes.get(results, "")
+        values = {'class': css_class,
                   'builder_name': builder_name,
                   'buildnum': build.getNumber(),
-                  'results': css_classes[results],
+                  'results': css_class,
                   'text': " ".join(build.getText()),
                   'buildurl': (root +
                                "builders/%s/builds/%d" % (builder_name,

This is better than the patch I posted above, since my attempt inaccurately makes it look like the aborted bug was a success: bad.

comment:9 Changed 6 years ago by Ukyo81

Hmm, seems like we have a winner in the last one. Confirmed.

comment:10 Changed 6 years ago by warner

  • Status changed from assigned to closed
  • Resolution set to fixed

Applied, in [e6baba2af2b016930274e828e7839a5bf7d5b4dc]. Thanks!

-Brian

Note: See TracTickets for help on using tickets.