Ticket #2183 (closed defect: worksforme)

Opened 16 months ago

Last modified 15 months ago

SVNPoller stops after failure

Reported by: fgsch Owned by:
Priority: critical Milestone: 0.8.6
Version: 0.8.5 Keywords:
Cc:

Description (last modified by dustin) (diff)

I arrived today at work and buildbot was not working. Checking the logs I found:

2012-01-20 02:30:28+0000 [-] SVNPoller: Error in  while polling
        Traceback (most recent call last):
        Failure: twisted.internet.utils._UnexpectedErrorOutput: got stderr: "svn: REPORT request failed on '/xxx/!svn/bc/25979/packages/yyy/trunk'
svn: REPORT of '/xxx/!svn/bc/25979/packages/yyy/trunk': Could not read chunk size: Secure connection truncated (https://server)
"

And:

2012-01-20 02:32:25+0000 [-] Unhandled error in Deferred:
2012-01-20 02:32:25+0000 [-] Unhandled Error
        Traceback (most recent call last):
        Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.

And another ocurrence for the second SVNPoller (my change_source polls 2 different locations within the same repo).

After the second error the logs were showing only one polling. Before:

2012-01-20 02:26:23+0000 [-] SVNPoller: polling
2012-01-20 02:26:23+0000 [-] SVNPoller: polling
2012-01-20 02:26:25+0000 [-] SVNPoller: no changes
2012-01-20 02:26:25+0000 [-] SVNPoller: _process_changes 25898 .. 25898
2012-01-20 02:26:25+0000 [-] SVNPoller: finished polling None
2012-01-20 02:26:30+0000 [-] SVNPoller: no changes
2012-01-20 02:26:30+0000 [-] SVNPoller: _process_changes 25979 .. 25979
2012-01-20 02:26:30+0000 [-] SVNPoller: finished polling None

After:

2012-01-23 09:44:23+0000 [-] SVNPoller: polling
2012-01-23 09:44:25+0000 [-] SVNPoller: no changes
2012-01-23 09:44:25+0000 [-] SVNPoller: _process_changes 25898 .. 25898
2012-01-23 09:44:25+0000 [-] SVNPoller: finished polling None

Instance details:

  • twistd 11.1.0
  • buildbot.version: 0.8.5

Attachments

debugging.patch Download (2.9 KB) - added by dustin 16 months ago.
debugging.patch

Change History

comment:1 Changed 16 months ago by dustin

  • Description modified (diff)
  • Milestone changed from undecided to 0.8.6

comment:2 Changed 16 months ago by dustin

I may need some more information here.

The first error is expected -- it's reporting a failure -- but I don't see how that would stop the poller from doing its polling thing. The string "Error in while polling" is from

#! python
        d.addErrback(log.err, 'SVNPoller: Error while polling:') # eat errors
        return d

in poll(). That's an unconditional "if an error occurs, log it and resume normal processing of the deferred". So that deferred's next value is a successful None.

Back in base.py, poll() is called in

#! python
        def do_poll():
            d = defer.maybeDeferred(self.poll)
            d.addErrback(log.err, 'while polling for changes')
            return d

which is *another* unconditional catch-all. And while your log snippets are abbreviated, I assume you would have included this output if it existed. That it doesn't makes sense - that deferred was successful. So the LoopingCall? has no reason to stop looping.

Now, the second error, almost two minutes later:

2012-01-20 02:32:25+0000 [-] Unhandled error in Deferred:

That's clearly not related to the *same* failure as two minutes earlier, and in fact it's not obvious that's related to svn at all. Looking at the invocations of getProcessOutput in the poller, all feed their deferreds back into the main poll deferred, which as we've seen is protected against unhandled errors.

I think the next step is to figure out what that second, unhandled error is about, as that will point us to the bug. Can you tell what was around that in the logs? If not, we may need to add some debugging code.

comment:3 Changed 16 months ago by fgsch

Unfortunately there was nothing else between those 2 logs, all the information I had is in this report. Let me know what debugging code I need to add should this happen again.

comment:4 Changed 16 months ago by dustin

How often does your poller run?

Try this:

  • master/buildbot/changes/svnpoller.py

    diff --git a/master/buildbot/changes/svnpoller.py b/master/buildbot/changes/svnpoller.py
    index 93c14bb..70845b2 100644
    a b class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    162162 
    163163    def getProcessOutput(self, args): 
    164164        # this exists so we can override it during the unit tests 
     165        log.msg("svnpoller %s GPO %s" % (self.svnurl, args)) 
    165166        d = utils.getProcessOutput(self.svnbin, args, self.environ) 
     167        @d.addBoth 
     168        def report(x): 
     169            log.msg("svnpoller %s GPO %s -> %s" % (self.svnurl, args, x)) 
     170            return x 
    166171        return d 
    167172 
    168173    def get_prefix(self): 
     174        log.msg("svnpoller %s get_prefix" % (self.svnurl,)) 
    169175        args = ["info", "--xml", "--non-interactive", self.svnurl] 
    170176        if self.svnuser: 
    171177            args.extend(["--username=%s" % self.svnuser]) 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    201207        return d 
    202208 
    203209    def get_logs(self, _): 
     210        log.msg("svnpoller %s get_logs" % (self.svnurl,)) 
    204211        args = [] 
    205212        args.extend(["log", "--xml", "--verbose", "--non-interactive"]) 
    206213        if self.svnuser: 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    212219        return d 
    213220 
    214221    def parse_logs(self, output): 
     222        log.msg("svnpoller %s parse_logs" % (self.svnurl,)) 
    215223        # parse the XML output, return a list of <logentry> nodes 
    216224        try: 
    217225            doc = xml.dom.minidom.parseString(output) 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    223231 
    224232 
    225233    def get_new_logentries(self, logentries): 
     234        log.msg("svnpoller %s get_new_logentries" % (self.svnurl,)) 
    226235        last_change = old_last_change = self.last_change 
    227236 
    228237        # given a list of logentries, calculate new_last_change, and 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    275284        return where 
    276285 
    277286    def create_changes(self, new_logentries): 
     287        log.msg("svnpoller %s create_changes" % (self.svnurl,)) 
    278288        changes = [] 
    279289 
    280290        for el in new_logentries: 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    348358 
    349359    @defer.deferredGenerator 
    350360    def submit_changes(self, changes): 
     361        log.msg("svnpoller %s submit_changes" % (self.svnurl,)) 
    351362        for chdict in changes: 
    352363            wfd = defer.waitForDeferred(self.master.addChange(src='svn', 
    353364                                                              **chdict)) 
    class SVNPoller(base.PollingChangeSource, util.ComparableMixin): 
    355366            wfd.getResult() 
    356367 
    357368    def finished_ok(self, res): 
     369        log.msg("svnpoller %s finished_ok" % (self.svnurl,)) 
    358370        if self.cachepath: 
    359371            f = open(self.cachepath, "w") 
    360372            f.write(str(self.last_change)) 

Changed 16 months ago by dustin

debugging.patch

comment:5 Changed 16 months ago by fgsch

Currently it runs every minute but I will apply your diff and change it to 5 minutes on Monday.

comment:6 Changed 16 months ago by dustin

Any info?

comment:7 Changed 15 months ago by fgsch

I checked the logs today and it has not crashed since this report. This can be closed now. I will open a new ticket should it happen again.

comment:8 Changed 15 months ago by dustin

  • Status changed from new to closed
  • Resolution set to fixed

Thanks!

comment:9 Changed 15 months ago by dustin

  • Status changed from closed to reopened
  • Resolution fixed deleted

comment:10 Changed 15 months ago by dustin

  • Status changed from reopened to closed
  • Resolution set to worksforme
Note: See TracTickets for help on using tickets.