Opened 7 years ago

Last modified 5 years ago

#2451 new defect

multiple events_ already exist

Reported by: virgilg Owned by:
Priority: major Milestone: 0.8.x
Version: 0.8.7p1 Keywords:
Cc:

Description (last modified by dustin)

After upgrading from 0.8.5 to 0.8.7p1 we see the following during a normal run. I can't paste the contents here because Akismet says the content is spam (adjust spam rules on the server?)

EDIT: pasted by dustin

2013-02-15 14:51:02-0800 [Broker,36,17.202.80.187] Unhandled Error
        Traceback (most recent call last):
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/selectreactor.py", line 155, in _doReadOrWrite
            self._disconnectSelectable(selectable, why, method=="doRead")
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/posixbase.py", line 260, in _disconnectSelectable
            selectable.readConnectionLost(f)
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/tcp.py", line 257, in readConnectionLost
            self.connectionLost(reason)
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/tcp.py", line 277, in connectionLost
            protocol.connectionLost(reason)
        --- <exception caught here> ---
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/spread/pb.py", line 645, in connectionLost
            notifier()
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/spread/pb.py", line 1341, in maybeLogout
            fn()
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/pbmanager.py", line 160, in <lambda>
            return (pb.IPerspective, persp, lambda: persp.detached(mind))
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/buildslave.py", line 736, in detached
            AbstractBuildSlave.detached(self, mind)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/buildslave.py", line 475, in detached
            self.botmaster.master.status.slaveDisconnected(self.slavename)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/status/master.py", line 377, in slaveDisconnected
            t.slaveDisconnected(name)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/status/status_push.py", line 323, in slaveDisconnected
            self.push('slaveDisconnected', slavename=slavename)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/status/status_push.py", line 234, in push
            self.queue.pushItem(packet)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/status/persistent_queue.py", line 284, in pushItem
            item = self.secondaryQueue.pushItem(item)
          File "/Library/Python/2.7/site-packages/buildbot-0.8.7p1-py2.7.egg/buildbot/status/persistent_queue.py", line 170, in pushItem
            raise IOError('%s already exists.' % path)
        exceptions.IOError: events_myserver.com/154 already exists.

Change History (11)

comment:1 Changed 7 years ago by virgilg

Can't attach the file with the traceback, Akismet still says it's spam. Great!

comment:2 Changed 7 years ago by dustin

Yeah, Trac either allows all spam or no attachments. Hopefully we'll be upgrading Trac soon to handle this better.

Can you put it in a pastebin and link it? I'll copy the contents here.

comment:4 Changed 7 years ago by dustin

  • Description modified (diff)

comment:5 Changed 7 years ago by dustin

A file named '154' already exists in that directory. From what I see, this can only happen if pushItem races with itself or with _loadFromDisk. But I may be missing something.

Do you, by chance, have two masters aimed at the same queue directory?
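
To make the suspected failure mode concrete, here is a minimal sketch of how two queue objects sharing one directory could collide on an item id. It is not the actual DiskQueue code: TinyDiskQueue and demo_collision are hypothetical names, and the class only mimics the lastItemId handling and the "already exists" check visible in the traceback.

```python
import os
import tempfile


class TinyDiskQueue(object):
    """Hypothetical, simplified stand-in for a disk-backed queue."""

    def __init__(self, path):
        self.path = path
        self.lastItemId = 0
        self._loadFromDisk()

    def _loadFromDisk(self):
        # Pick up the highest numbered file already present, if any.
        files = sorted(int(f) for f in os.listdir(self.path) if f.isdigit())
        if files:
            self.lastItemId = files[-1]

    def pushItem(self, item):
        # The id comes from in-memory state only; the directory is not re-read.
        self.lastItemId += 1
        path = os.path.join(self.path, str(self.lastItemId))
        if os.path.exists(path):
            raise IOError('%s already exists.' % path)
        with open(path, 'w') as f:
            f.write(item)


def demo_collision():
    """Return the error message raised when two queues share a directory."""
    directory = tempfile.mkdtemp()
    a = TinyDiskQueue(directory)
    b = TinyDiskQueue(directory)  # loaded before 'a' wrote anything
    a.pushItem('first')           # creates file '1'
    try:
        b.pushItem('second')      # b still believes lastItemId is 0
    except IOError as e:
        return str(e)
    return None
```

In this sketch the second push fails because b computed its next id from stale state. Whether the real error comes from a second writer or from some other path that leaves lastItemId behind the files on disk is exactly what remains to be determined.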

comment:6 Changed 7 years ago by virgilg

Do you, by chance, have two masters aimed at the same queue directory?

Nope.

A file named '154' already exists in that directory.

Is this file even created on the fly and then deleted? I only have "state" in that directory.

comment:7 Changed 7 years ago by dustin

Honestly, I know very little about this code. The PersistentQueue is used as a backing store for data that is being pushed via HTTP, presumably to allow events to persist over a master restart if they cannot be sent to the HTTP server. It looks like persistent_queue.py deletes items as they are popped, so yes, I think they are created on the fly.

Does this error happen immediately at startup, or during runtime? Is it perfectly repeatable? If so, adding some prints to the handling of lastItemId in persistent_queue.py might help figure out what's going on.

comment:8 Changed 7 years ago by virgilg

Does this error happen immediately at startup, or during runtime?

During runtime

Is it perfectly repeatable?

No

If so, adding some prints to the handling of lastItemId in persistent_queue.py might help figure out what's going on.

I'll see about that. For now, it looks like the fix in #2450 has stopped most of the tracebacks (including this one).

comment:9 Changed 7 years ago by dustin

Please close as invalid if you can't repeat. I'm not sure how #2450 would cause this, but I can't rule it out.

comment:10 Changed 7 years ago by dustin

The patch in #2450 seems to fix this - or at least make it not occur. virgilg's going to add

#! patch
diff --git a/master/buildbot/status/persistent_queue.py b/master/buildbot/status/persistent_queue.py
index 0106a21..a4e5c11 100644
--- a/master/buildbot/status/persistent_queue.py
+++ b/master/buildbot/status/persistent_queue.py
@@ -151,6 +151,7 @@ class DiskQueue(object):
         self._nbItems = 0
         # The actual items id start at one.
         self.firstItemId = 0
+        print "DQ %d - init - lastItemId = 0" % (id(self),)
         self.lastItemId = 0
         self._loadFromDisk()

@@ -164,6 +165,7 @@ class DiskQueue(object):
             self.firstItemId = id + 1
         else:
             self._nbItems += 1
+        print "DQ %d - pushItem - lastItemId += 1 -> %d" % (id(self), self.lastItemId+1)
         self.lastItemId += 1
         path = os.path.join(self.path, str(self.lastItemId))
         if os.path.exists(path):
@@ -245,6 +247,7 @@ class DiskQueue(object):
         if self._nbItems:
             self.firstItemId = files[0]
             self.lastItemId = files[-1]
+            print "DQ %d - _loadFromDisk - lastItemId = %d" % (id(self), self.lastItemId)


 class PersistentQueue(object):

and reproduce.

comment:11 Changed 5 years ago by dustin

  • Description modified (diff)
  • Milestone changed from undecided to 0.8.x
  • Type changed from undecided to defect

This code is gone in master, anyway.
