Ticket #395 (closed defect: worksforme)
when i change the vcs executable, buildslave stops being able to invoke it until I restart buildslave
| Reported by: | zooko | Owned by: | zooko |
|---|---|---|---|
| Priority: | major | Milestone: | 0.7.12 |
| Version: | 0.7.9 | Keywords: | |
| Cc: |
Description
It seems like if I run:
cd /usr/local/stow stow -D darcs-2.1.0plus stow darcs-2.1.2
Then all my buildslaves subsequently become unable to execute darcs:
http://allmydata.org/buildbot/builders/zooko%20yukyuk%20hardy/builds/378
The next time I restart those buildslaves, they are able to find it again.
On a hunch I went and put back the original stowed darcs:
cd /usr/local/stow stow -D darcs-2.1.2 stow darcs-2.1.0plus
And forced a build, but the buildslave was still unable to find it, even though now now the exactly same executable is in the same location, being pointed at by a new symlink which is in the same location.
Anyway, I went ahead and restarted that buildslave and now it works again.
Change History
comment:2 Changed 4 years ago by zooko
- Status changed from closed to reopened
- Resolution invalid deleted
GNU stow is very simple -- it just makes a symlink from /usr/local/bin/darcs to /usr/local/stow/darcs-2.2.0/bin/darcs (and similarly for all other files in a package. So from buildbot's perspective, the PATH includes /usr/local/bin, and that dir had no entry named "darcs" or else had a symlink to /usr/local/stow/darcs-2.1.0/bin/darcs at one point, and then at a later point it had a symlink to /usr/local/stow/darcs-2.2.0/bin/darcs. I can't figure out how this can go wrong, either, and I've scouted around in the buildbot source code and can't see it, unless Twisted or Python is doing some bad caching when buildbot invokes the Twisted subprocess utility.
However, it just happened again: builds 36 and 37 couldn't find the "darcs" executable, but build 38 after the buildslave was restarted could. Unfortunately I don't remember what, if anything I changed in the host system before build 36, but I am sure that I didn't change anything in the host system between build 37 and build 38.
http://allmydata.org/buildbot-pycryptopp/builders/linux-amd64-ubuntu-hardy-yukyuk/builds/36 http://allmydata.org/buildbot-pycryptopp/builders/linux-amd64-ubuntu-hardy-yukyuk/builds/37 http://allmydata.org/buildbot-pycryptopp/builders/linux-amd64-ubuntu-hardy-yukyuk/builds/38
The twistd.log from builds #36 and #37 includes:
2009-03-02 17:13:01-0700 [Broker,client] <SlaveBuilder 'linux-amd64-ubuntu-hardy-yukyuk' at 16545480>.startBuild
2009-03-02 17:13:02-0700 [Broker,client] Peer will receive following PB traceback:
2009-03-02 17:13:02-0700 [Broker,client] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/banana.py", line 146, in gotItem
self.callExpressionReceived(item)
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/banana.py", line 111, in callExpressionReceived
self.expressionReceived(obj)
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/pb.py", line 514, in expressionReceived
method(*sexp[1:])
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/pb.py", line 826, in proto_message
self._recvMessage(self.localObjectForID, requestID, objectID, message, answerRequired, netArgs, netKw)
--- <exception caught here> ---
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/pb.py", line 840, in _recvMessage
netResult = object.remoteMessageReceived(self, message, netArgs, netKw)
File "/usr/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-linux-x86_64.egg/twisted/spread/flavors.py", line 114, in remoteMessageReceived
state = method(*args, **kw)
File "/usr/lib/python2.5/site-packages/buildbot-0.7.9-py2.5.egg/buildbot/slave/bot.py", line 168, in remote_startCommand
self.command = factory(self, stepId, args)
File "/usr/lib/python2.5/site-packages/buildbot-0.7.9-py2.5.egg/buildbot/slave/commands.py", line 667, in __init__
self.setup(args)
File "/usr/lib/python2.5/site-packages/buildbot-0.7.9-py2.5.egg/buildbot/slave/commands.py", line 1633, in setup
self.vcexe = getCommand("darcs")
File "/usr/lib/python2.5/site-packages/buildbot-0.7.9-py2.5.egg/buildbot/slave/commands.py", line 58, in getCommand
raise RuntimeError("Couldn't find executable for '%s'" % name)
exceptions.RuntimeError: Couldn't find executable for 'darcs'
and from build #38 I see:
2009-03-05 16:55:42-0700 [Broker,client] <SlaveBuilder 'linux-amd64-ubuntu-hardy-yukyuk' at 16578392>.startBuild
2009-03-05 16:55:42-0700 [Broker,client] startCommand:darcs [id 46]
2009-03-05 16:55:42-0700 [Broker,client] ShellCommand._startCommand
2009-03-05 16:55:42-0700 [Broker,client] rm -rf /home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp/linux-amd64-ubuntu-hardy-yukyuk/build
2009-03-05 16:55:42-0700 [Broker,client] in dir /home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp/linux-amd64-ubuntu-hardy-yukyuk (timeout 1200 secs)
2009-03-05 16:55:42-0700 [Broker,client] watching logfiles {}
2009-03-05 16:55:42-0700 [Broker,client] argv: ['rm', '-rf', '/home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp/linux-amd64-ubuntu-hardy-yukyuk/build']
2009-03-05 16:55:42-0700 [Broker,client] environment: {'LANG': 'en_US.UTF-8', 'TERM': 'screen', 'SHELL': '/bin/bash', 'LESSCLOSE': '/usr/bin/lesspipe %s %s', 'SHLVL': '1', 'OLDPWD': '/home/pycryptoppbuildslave', 'LESSOPEN': '| /usr/bin/lesspipe %s', 'HISTCONTROL': 'ignoreboth', 'PWD': '/home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp', 'LOGNAME': 'pycryptoppbuildslave', 'USER': 'pycryptoppbuildslave', 'MAIL': '/var/mail/pycryptoppbuildslave', 'PATH': '/usr/local/bin:/usr/bin:/bin:/usr/games', 'LS_COLORS': 'no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.svgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:', 'HOME': '/home/pycryptoppbuildslave', '_': '/usr/bin/buildbot'}
2009-03-05 16:55:42-0700 [Broker,client] closing stdin
2009-03-05 16:55:42-0700 [Broker,client] using PTY: False
2009-03-05 16:55:43-0700 [-] command finished with signal None, exit code 0, elapsedTime: 0.647294
2009-03-05 16:55:43-0700 [-] ShellCommand._startCommand
2009-03-05 16:55:43-0700 [-] /usr/local/bin/darcs get --verbose --partial --repo-name build http://allmydata.org/source/pycryptopp/trunk-hashedformat
2009-03-05 16:55:43-0700 [-] in dir /home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp/linux-amd64-ubuntu-hardy-yukyuk (timeout 1200 secs)
2009-03-05 16:55:43-0700 [-] watching logfiles {}
2009-03-05 16:55:43-0700 [-] argv: ['/usr/local/bin/darcs', 'get', '--verbose', '--partial', '--repo-name', 'build', 'http://allmydata.org/source/pycryptopp/trunk-hashedformat']
2009-03-05 16:55:43-0700 [-] environment: {'LANG': 'en_US.UTF-8', 'TERM': 'screen', 'SHELL': '/bin/bash', 'LESSCLOSE': '/usr/bin/lesspipe %s %s', 'SHLVL': '1', 'OLDPWD': '/home/pycryptoppbuildslave', 'LESSOPEN': '| /usr/bin/lesspipe %s', 'HISTCONTROL': 'ignoreboth', 'PWD': '/home/pycryptoppbuildslave/yukyuk-linux-hardy-amd64-pycryptopp', 'LOGNAME': 'pycryptoppbuildslave', 'USER': 'pycryptoppbuildslave', 'MAIL': '/var/mail/pycryptoppbuildslave', 'PATH': '/usr/local/bin:/usr/bin:/bin:/usr/games', 'LS_COLORS': 'no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.svgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:', 'HOME': '/home/pycryptoppbuildslave', '_': '/usr/bin/buildbot'}
2009-03-05 16:55:43-0700 [-] closing stdin
2009-03-05 16:55:43-0700 [-] using PTY: False
2009-03-05 16:55:47-0700 [-] command finished with signal None, exit code 0, elapsedTime: 4.313054
comment:3 Changed 4 years ago by dustin
- Owner set to zooko
- Status changed from reopened to new
- Milestone changed from undecided to 0.7.11
Ah, I wonder if this is a problem with twisted.python.procutils.which (although I don't see any kind of caching in Twisted-8.1.0), or with some kind of caching of stat results in Python (related to the old statcache module)?
Can you do a little experiement with twisted.python.procutils.which to see if, indeed, something is evaluating symlinks and caching the results?
comment:5 Changed 4 years ago by zooko
By the way, this still happening.
zooko@tahoe-slave-edgy:~$ buildbot --version Buildbot version: 0.7.4 Twisted version: 2.4.0
I installed /usr/local/bin/darcs which is a symlink to /usr/local/stow/blahblahblah/bin/darcs-blahblahblah , and then I removed /usr/bin/darcs, and then builds get this exception:
http://allmydata.org/buildbot/builders/edgy/builds/2283/steps/darcs/logs/stdio
File "/usr/lib/python2.4/site-packages/buildbot/slave/bot.py", line 164, in remote_startCommand
self.command = factory(self, stepId, args)
File "/usr/lib/python2.4/site-packages/buildbot/slave/commands.py", line 604, in __init__
self.setup(args)
File "/usr/lib/python2.4/site-packages/buildbot/slave/commands.py", line 1235, in setup
self.vcexe = getCommand("darcs")
File "/usr/lib/python2.4/site-packages/buildbot/slave/commands.py", line 53, in getCommand
raise RuntimeError("Couldn't find executable for '%s'" % name)
exceptions.RuntimeError: Couldn't find executable for 'darcs'
After restarting the buildslave, it worked fine. In answer to your question: well, sure I'd like to experiment with twisted.python.procutils.which sometime, but not right now ...
comment:7 Changed 3 years ago by warner
I see no code to perform caching in Twisted (twisted.python.procutils.which) or in buildbot.
Now, if darcs was the only thing in your stow environment, then /usr/local/bin itself would be a symlink to /usr/local/stow/blabblah/bin, and then you could get tripped up by some bit of code which abspath'ed everything in os.environ["PATH"] at process launch time. If that were happening, then unstowing darcs wouldn't be enough to get /usr/local/stow/blahblah/bin off your PATH, and if darcs were still there, it would try to run it (instead of trying to run /usr/bin/darcs or whatever).
But that neither explains the problem you're seeing nor sounds very likely.
comment:8 Changed 3 years ago by dustin
- Status changed from new to closed
- Resolution set to worksforme
OK, I just tried this out, moving svn from /usr/bin to /bin between builds. Works just fine. So this is something in the Python, Twisted, or Buildbot version you're using. I'm using buildbot master, twisted 8.2.0, and python 2.6.2.
![[Buildbot Logo]](/chrome/site/header-text-transparent.png)
I don't really know how Stow works, but I'm guessing this is some peculiarity of that application. Buildbot just runs the executable via PATH, after all. I'll be interested if you can give some more details -- show logfiles, add debug output, etc.