Ticket #2122 (closed defect: fixed)
Race condition causes error describing step
| Reported by: | tmidd | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | 0.8.6 |
| Version: | 0.8.5 | Keywords: | |
| Cc: |
Description
Not sure on the details, but it seems that sometimes when the waterfall page is refreshed it causes the following exception in twistd.log (and one of the commands on the waterfall just shows '???'). It seems to be some kind of race condition, since most of the time the same command will display normally and not generate an exception. As suggested in the traceback, the command uses WithProperties?... I don't know how those are actually handled, but at a guess something is meant to parse it into a string before describe is called? Let me know if you need any further info.
2011-10-05 14:06:07+1300 [-] Error describing step
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/buildbot-0.8.5-py2.6.egg/buildbot/process/build.py", line 389, in _stepDone
return self.startNextStep()
File "/usr/local/lib/python2.6/dist-packages/buildbot-0.8.5-py2.6.egg/buildbot/process/build.py", line 378, in startNextStep
d = defer.maybeDeferred(s.startStep, self.remote)
File "/usr/local/lib/python2.6/dist-packages/Twisted-11.0.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py", line 133, in maybeDeferred
result = f(*args, **kw)
File "/usr/local/lib/python2.6/dist-packages/buildbot-0.8.5-py2.6.egg/buildbot/process/buildstep.py", line 514, in startStep
self.step_status.setText(self.describe(False))
--- <exception caught here> ---
File "/usr/local/lib/python2.6/dist-packages/buildbot-0.8.5-py2.6.egg/buildbot/steps/shell.py", line 173, in describe
if len(words) < 1:
exceptions.AttributeError: WithProperties instance has no attribute '__len__'
Change History
comment:2 Changed 17 months ago by seb_kuzminsky
I get this same traceback. It seems to happen when I run a SetProperty?() buildstep that has a WithProperties?() for a command.
I'm on 0.8.5, on Ubuntu Precise 12.04 Alpha, fwiw.
comment:3 follow-up: ↓ 4 Changed 17 months ago by dustin
Does that WithProperties? reference something that the SetProperty? should set?
comment:4 in reply to: ↑ 3 Changed 17 months ago by seb_kuzminsky
Replying to dustin:
Does that WithProperties? reference something that the SetProperty? should set?
I have two SetProperty?() calls that have trouble; one tries to use the property it sets and one doesn't.
Here's the one that uses its own property:
set_property_branch = shell.SetProperty(
command = WithProperties('if [ ! -z "%(branch)s" ]; then echo %(branch)s; else echo master; fi'),
property = 'branch',
haltOnFailure = 1
)
And here's the one that doesn't:
def get_version_extractor(rc, stdout, stderr):
s = stdout.strip()
if s == '':
return None # how do you report error from an extract_fn?
return { 'version': s }
set_property_version = shell.SetProperty(
command = WithProperties("scripts/get-version-from-git %(branch)s"),
extract_fn = get_version_extractor,
haltOnFailure = 1
)
comment:5 Changed 17 months ago by seb_kuzminsky
Here is my master config, and two relevant twistd.logs: http://highlab.com/~seb/buildbot-2122/
At 2012-01-13 09:00:47-0700 in twistd.log.2, there's a traceback from the first SetProperty?().
Then at 2012-01-13 09:01:44-0700 in the same log there's a traceback from the second SetProperty?().
comment:6 Changed 17 months ago by dustin
Sorry, my question in comment 3 was bogus -- it doesn't matter.
I think that the fix for this is to note when the step is rendered, and then use self.build.render in describe if the command hasn't been rendered yet. This may result in incorrect descriptions while the build is still running, but I think that's an acceptable problem -- what else could we do?
![[Buildbot Logo]](/chrome/site/header-text-transparent.png)
I'm guessing that this is happening before the step is run, so its properties haven't been rendered yet. It's pretty tricky to determine whether self.command needs to be rendered (since the renderables may be buried in a list).
This is relatively harmless, as you've seen - the waterfall renders, albiet with a traceback in the logfile.
Still, I bet we can find a fix -- describe() should be resilient.