Ticket #557 (closed defect: fixed)

Opened 4 years ago

Last modified 4 years ago

DirectoryUpload needs to divide its files into blocks

Reported by: stefan Owned by: stefan
Priority: critical Milestone: 0.7.11
Version: 0.7.10 Keywords:
Cc: dusin

Description

I'm running DirectoryUpload? to upload a directory full of generated documentation. In the middle of that the build step aboards with twisted.spread.banana.BananaError?: string is too long to send (2408624)

Attachments

patch Download (9.6 KB) - added by stefan 4 years ago.
a patch…
patch.2 Download (4.8 KB) - added by stefan 4 years ago.
Let users decide compression type

Change History

comment:1 Changed 4 years ago by stefan

...and attempting to attach a log file resulted in an internal error:

Trac detected an internal error:

OSError: [Errno 13] Permission denied: '/home/buildbot/trac-buildbot/attachments/ticket/557'

FWIW

comment:2 Changed 4 years ago by stefan

Here is the log again, this time inline:

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/flavors.py", line 117, in remoteMessageReceived

state = method(*args, kw)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/buildbot/slave/bot.py", line 173, in remote_startCommand

d = self.command.doStart()

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/buildbot/slave/commands.py", line 732, in doStart

d = defer.maybeDeferred(self.start)

--- <exception caught here> ---

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/internet/defer.py", line 106, in maybeDeferred

result = f(*args, kw)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/buildbot/slave/commands.py", line 972, in start

self._writeFile(filename)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/buildbot/slave/commands.py", line 982, in _writeFile

self.writer.callRemote('write', data)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/pb.py", line 340, in callRemote

_name, args, kw)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/pb.py", line 833, in _sendMessage

self.sendCall(prefix+"message", requestID, objectID, message, answerRequired, netArgs, netKw)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/pb.py", line 547, in sendCall

self.sendEncoded(exp)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/banana.py", line 277, in sendEncoded

self._encode(obj, io.write)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/banana.py", line 289, in _encode

self._encode(elem, write)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/banana.py", line 289, in _encode

self._encode(elem, write)

File "/usr/local/tools/hpc-2.0/lib/python2.6/site-packages/twisted/spread/banana.py", line 318, in _encode

"string is too long to send (%d)" % (len(obj),))

twisted.spread.banana.BananaError?: string is too long to send (2408624)

comment:3 Changed 4 years ago by marcusl

FWIW, I just enabled DirectoryUpload? on our build and I get this too. (on Python 2.5, 0.7.10p1, XP)

comment:4 Changed 4 years ago by marcusl

It seems as if twisted.spread.banana only allows 640kb files, looking at  the source.

comment:5 Changed 4 years ago by dustin

  • Cc dusin added
  • Owner set to dustin
  • Status changed from new to accepted

Hmm, so we need to shrink the write size?

I'll also poke at the Trac permission issues.

comment:6 Changed 4 years ago by dustin

  • Milestone set to undecided

comment:7 Changed 4 years ago by marcusl

Yup. Don't send chunks larger than twisted.spread.banana.SIZE_LIMIT.

comment:8 Changed 4 years ago by dustin

Woah -- this code makes a lot of callRemote calls, but doesn't handle the deferreds for any of them. I guess this works because they're delivered in order, but it's pretty yucky..

comment:9 Changed 4 years ago by dustin

  • Summary changed from Using DirectoryUpload with a big directory leads to error to DirectoryUpload needs to divide its files into blocks
  • Milestone changed from undecided to 0.7.11

In fact, I'm tempted to use the tarfile module to serialize the files and directories, and then just chunk its output up into blocks. The FileUpload? and FileDownload? classes correctly chunk their transfers.

comment:10 Changed 4 years ago by marcusl

Sounds like a plan. That would fix #559 as well.

comment:11 Changed 4 years ago by stefan

Using the tarfile module sounds like an elegant solution, and an efficient one. May be even an optional compression (in case the master/slave connection is slow) would be useful. Let me know if there is code ready to be tried out.

comment:12 Changed 4 years ago by stefan

Has anything been done to fix this ? If not, I could attempt to look into it. Just trying to avoid duplicating work. Thanks.

comment:13 Changed 4 years ago by dustin

  • Owner changed from dustin to stefan
  • Status changed from accepted to assigned

No - I've been avoiding it, because it looks like a decent amount of coding is required.

Changed 4 years ago by stefan

a patch...

comment:14 Changed 4 years ago by stefan

I have added a new test, hoping that I could start by reproducing the problem from within the current buildbot tree, but failed. (I'm not sure whether that is because the issue has been solved in some other way, or what else.)

In any case, as it seemed a good idea to rewrite DirectoryUpload? to use tarfile anyhow, I moved forward with that idea. Here is a patch that contains that change, as well as the added test case.

All seems to work fine (read: all transfer tests still pass), though I would be more satisfied if the added test would have failed without the change.

As I'm not very familiar with twisted, I'd appreciate if someone could carefully review my code.

Thanks !

comment:15 Changed 4 years ago by dustin

  • Status changed from assigned to closed
  • Resolution set to fixed

The only twisted adjustment I made was to change addBoth to addCallback in:

994         d.addCallback(unpack)

because we don't want to call unpack() in the event of an upload exception.

This solution looks great, and I'm going to commit it. I wish tarfile were a bit more flexible, so that we could avoid writing the tarfile to disk on the slave and just stream the files directly into tar format (more of a "tar -jcf - $path | master" instead of "tar -jcf tmpfile $path; cat tmpfile | master"). By my read of tarfile, though, this isn't possible, or at least not easy.

commit [77ca3ff84ffc8a2f40d060ef628984e1b684cffe]

Author: Stefan Seefeld <seefeld@sympatico.ca>
Date:   Sun May 24 22:09:00 2009 -0400

    (refs #557, #559) use tarfile to upload multiple files from the master

comment:16 Changed 4 years ago by dustin

Thank you for writing this, by the way!

comment:17 Changed 4 years ago by stefan

You are welcome. I'm glad I could contribute to a very useful tool !

comment:18 Changed 4 years ago by stefan

  • Status changed from closed to reopened
  • Resolution fixed deleted

I just attempted to use my patch in a production environment, and ran into a problem: The original patch uses bz2 compression, but my environment doesn't have it. Thus I propose a slight change, where users can add a 'compress' attribute (either None, 'gz', or 'bz2') to the DirectoryUpload? step.

Changed 4 years ago by stefan

Let users decide compression type

comment:19 Changed 4 years ago by marcusl

  • Status changed from reopened to closed
  • Resolution set to fixed

Fine, but that should be a new ticket, i.e. #577. :) (sorry about the confusing numbers)

comment:20 Changed 4 years ago by stefan

  • Status changed from closed to reopened
  • Resolution fixed deleted

Sorry, I really don't see the need for that. I consider my previous patch as buggy (and thus this issue as not-quite-fixed-yet, so I add another patch. What's wrong with that ? I have not enough time to get stuck in formalities, sorry.

comment:21 Changed 4 years ago by marcusl

Ok. It doens't really matter, except that the first patch has already been merged. Thus I consider this a different thing.

But you're right. It doens't matter.

comment:22 Changed 4 years ago by stefan

ping ?

comment:23 Changed 4 years ago by dustin

  • Status changed from reopened to closed
  • Resolution set to fixed

Thanks for the ping -- merged in [6f7ea917fc5cc3e2d2bdf7aaabdc1c8bf7e0ba8f]

Note: See TracTickets for help on using tickets.