Ticket #2337 (closed enhancement: worksforme)

Opened 10 months ago

Last modified 6 months ago

gitpoller fails to fetch commit following twisted errors

Reported by: Clement
Owned by:
Priority: major
Milestone: 0.8.+
Version: 0.8.6p1
Keywords: git gitpoller
Cc: clement.demongeot@…

Description

In my environment, Twisted's getProcessOutput call sometimes fails with an OSError exception (Errno 11 or 513). The exception is caught by buildbot and the poll fails. The problem is that the git repository has already been updated by then, so the commits are ignored by subsequent buildbot polls. This is particularly annoying when several commits are pushed at the same time and the exception occurs in the middle, so that the second half of the commit set is never seen by buildbot.

Would it be possible to retry the calls to getProcessOutput when an exception occurs or, if gitpoller fails, to move the HEAD of the repository back to the HEAD of the last successful poll?

Change History

comment:1 Changed 10 months ago by tom.prince

  • Status changed from new to closed
  • Resolution set to fixed

The GitPoller in master has been rewritten to handle this properly.

comment:2 Changed 7 months ago by Clement

  • Cc clement.demongeot@… added
  • Status changed from closed to reopened
  • Resolution fixed deleted

The bug is still present in the 0.8.7 release.

NB: I cannot attach a logfile because it is rejected by the anti-spam.

comment:3 Changed 7 months ago by tom.prince

If you are having gitpoller issues with 0.8.7, it is going to be a different bug, since the code has been totally changed.

I don't have anywhere near enough information to diagnose, let alone fix this.

comment:4 Changed 7 months ago by dustin

Sorry about the uploading problems - for some reason Akismet thinks the content itself is spam. We have pretty bad spam problems here, so the dials are turned up pretty high. If you upload the logs somewhere (http://pastebin.mozilla.org is an option) and point me to them, I can attach them here. Sorry about that.

comment:5 Changed 7 months ago by Clement

The logfile is here (it is an excerpt from the twistd.log file): http://pastebin.mozilla.org/1862758

I don't know whether there is any way to make this log more verbose.

comment:6 Changed 7 months ago by dustin

Short enough to include here:

2012-10-08 16:00:43+0200 [-] gitpoller: processing 3 changes: ['01c238e049c71fbd212994af1158ae64d525953a', 'cc29fa99d708fbb7639a8b4a7bb2aa5fbaa8240f', 'f83bfb5226ad39d18599895ae35e9de60f437653'] from "git@git:my_project"
2012-10-08 16:00:43+0200 [-] checking for User Object from git Change for: Flo <flo>
2012-10-08 16:00:44+0200 [-] added change Change(revision=u'01c238e049c71fbd212994af1158ae64d525953a', who=u'Flo<flo>', branch=u'master', comments=u'commit1', when=1349704649, category=None, project=u'my_project', repository=u'git@git:my_project', codebase=u'') to database
2012-10-08 16:00:44+0200 [-] checking for User Object from git Change for: Flo <flo>
2012-10-08 16:00:44+0200 [-] added change Change(revision=u'cc29fa99d708fbb7639a8b4a7bb2aa5fbaa8240f', who=u'Flo <flo>', branch=u'master', comments=u'commit2', when=1349704650, category=None, project=u'my_project', repository=u'git@git:my_project', codebase=u'') to database
2012-10-08 16:00:44+0200 [-] trying to poll branch master of git@git:my_project
        Traceback (most recent call last):
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 551, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 1101, in gotResult
            _inlineCallbacks(r, g, deferred)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 1043, in _inlineCallbacks
            result = result.throwExceptionIntoGenerator(g)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/python/failure.py", line 382, in throwExceptionIntoGenerator
            return g.throw(self.type, self.value, self.tb)
        --- <exception caught here> ---
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/buildbot-0.8.7-py2.7.egg/buildbot/changes/gitpoller.py", line 116, in poll
            yield self._process_changes(rev, branch)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 1045, in _inlineCallbacks
            result = g.send(result)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/buildbot-0.8.7-py2.7.egg/buildbot/changes/gitpoller.py", line 205, in _process_changes
            self._get_commit_files(rev),
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/buildbot-0.8.7-py2.7.egg/buildbot/changes/gitpoller.py", line 154, in _get_commit_files
            d = self._dovccmd('log', args, path=self.workdir)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/buildbot-0.8.7-py2.7.egg/buildbot/changes/gitpoller.py", line 232, in _dovccmd
            [command] + args, path=path, env=os.environ)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/utils.py", line 169, in getProcessOutputAndValue
            reactor)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/utils.py", line 25, in _callProtocolWithDeferred
            reactor.spawnProcess(p, executable, (executable,)+tuple(args), env, path)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/posixbase.py", line 346, in spawnProcess
            processProtocol, uid, gid, childFDs)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 689, in __init__
            self._fork(path, uid, gid, executable, args, environment, fdmap=fdmap)
          File "/modules/buildbot/0.8.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 396, in _fork
            self.pid = os.fork()
        exceptions.OSError: [Errno 513] Unknown error 513

2012-10-08 16:00:45+0200 [-] gitpoller: processing 0 changes: [] from "git@git:my_project"
2012-10-08 16:00:45+0200 [-] gitpoller: processing 0 changes: [] from "git@git:my_project"
2012-10-08 16:00:45+0200 [-] gitpoller: processing 0 changes: [] from "git@git:my_project"
2012-10-08 16:00:58+0200 [-] Loading builder lint on master's build 167 from on-disk pickle

So, this is a different bug than the original, but we might as well handle it here.

This looks like an OS-level problem -- why can't Buildbot fork? selinux maybe? I don't have a 513 error code on my Linux system. You don't specify an operating system.
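One way to gather more information for this kind of question is to ask the affected machine what it knows about the two error numbers and about its process limit. The sketch below is only a diagnostic aid, not Buildbot code; the helper names `describe_errno` and `process_limit` are made up for illustration, and checking RLIMIT_NPROC is an assumption about a common cause of fork() failing with EAGAIN, not a confirmed diagnosis for this ticket. It is written for Python 3, while the traceback above comes from Python 2.7.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic_name, message) for an errno value.

    symbolic_name is None when this platform's errno table has no entry
    for the code; message is None if the C library rejects the value.
    """
    try:
        message = os.strerror(code)
    except ValueError:
        message = None
    return errno.errorcode.get(code), message


def process_limit():
    """Return the (soft, hard) per-user process limit, or None if unavailable."""
    try:
        import resource  # Unix-only module
        return resource.getrlimit(resource.RLIMIT_NPROC)
    except (ImportError, AttributeError):
        return None


if __name__ == "__main__":
    for code in (11, 513):  # the two values reported in this ticket
        print(code, describe_errno(code))
    print("RLIMIT_NPROC:", process_limit())
```

Running it as the same user and on the same host as the buildmaster would show whether the error numbers map to a known name there and how close the account might be to its process limit.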

comment:7 Changed 7 months ago by Clement

I'm using CentOS release 5.4 (Final), and the filesystem is distributed over NFS. In my view, the problem isn't really the exception itself (which could occur because of a network problem, for example) but the way buildbot handles it: I'd expect buildbot to retry polling the commit when it catches the exception (say, 5 retries with a few seconds of delay in between, to see whether the system is responding again).
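The retry behaviour requested here could look roughly like the following. This is a minimal synchronous sketch, not the GitPoller implementation: `run_with_retry`, `RETRY_ERRNOS`, and the default of 5 attempts with a 2 second delay are all hypothetical, and treating Errno 11 and 513 as transient is an assumption taken from this ticket. It is written for Python 3 using only the standard library; a poller running inside the Twisted reactor would need a non-blocking delay (for example `twisted.internet.task.deferLater`) rather than `time.sleep`.

```python
import errno
import subprocess
import time

# Errno values treated as transient (assumption). EAGAIN is 11 on Linux;
# 513 is the other value reported in this ticket.
RETRY_ERRNOS = (errno.EAGAIN, 513)


def run_with_retry(argv, attempts=5, delay=2.0,
                   run=subprocess.check_output, sleep=time.sleep):
    """Run argv, retrying when spawning the process fails transiently.

    `run` and `sleep` are injectable so the retry logic can be exercised
    without starting real processes or waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run(argv)
        except OSError as exc:
            # Give up at once on non-transient errors or on the last attempt.
            if exc.errno not in RETRY_ERRNOS or attempt == attempts:
                raise
            sleep(delay)
```

A call such as `run_with_retry(["git", "log", "-1"])` would then tolerate a few failed forks before reporting the error. A non-zero exit status from git raises `subprocess.CalledProcessError`, which is not an OSError and is therefore not retried.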

comment:8 Changed 7 months ago by tom.prince

Well, the code will retry on the next poll. It does appear that if some of the changes were already processed, they will be added again on the next poll as well (this should be fixed).
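One possible way to avoid both problems (changes lost after a failure, and changes added twice on the retry) is to record a checkpoint after each change that was added successfully, and to resume from it on the next poll. The sketch below only illustrates that idea: the class `ChangeCheckpoint` and its methods are hypothetical and do not correspond to anything in gitpoller.py, and it assumes the poller can produce the branch's revisions oldest-first.

```python
class ChangeCheckpoint:
    """Remember the last revision that was fully processed."""

    def __init__(self, last_processed=None):
        self.last_processed = last_processed

    def pending(self, revs):
        """Return the revisions after the checkpoint (revs is oldest-first)."""
        if self.last_processed in revs:
            return revs[revs.index(self.last_processed) + 1:]
        return list(revs)

    def process(self, revs, add_change):
        """Call add_change for each pending revision, oldest first.

        If add_change raises, the exception propagates and the checkpoint
        stays at the last revision that succeeded, so the next poll resumes
        there instead of skipping or repeating changes.
        """
        for rev in self.pending(revs):
            add_change(rev)
            self.last_processed = rev
```

With revisions A, B, C and a failure while adding C, the checkpoint stays at B, and the next call to `process` with the same list only adds C.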

comment:9 Changed 7 months ago by tom.prince

  • Type changed from undecided to enhancement
  • Milestone changed from undecided to 0.8.+

Clement: Can you confirm that buildbot is polling at the next scheduled interval, and getting all the commits then? (Or demonstrate that it isn't?)

comment:10 Changed 6 months ago by tom.prince

  • Status changed from reopened to closed
  • Resolution set to worksforme