Opened 8 years ago

Last modified 4 years ago

#739 new defect

HTML logs are included in pickles

Reported by: marcusl
Owned by:
Priority: critical
Milestone: 0.9.+
Version: 0.7.12
Keywords: sprint

Description (last modified by dustin)

I recently started adding HTML logs from my build system to Buildbot, and these are between 1 and 5 MB per build step.

They do not seem to be compressed at all, so these log files take up quite a lot of memory and disk space. (One of my masters is using 450 MB.)

Having the logs gzipped would be neat, since they could then be sent raw to clients that accept gzipped HTML (most browsers nowadays).

Either that, or we fold the HTML-log and text-log into a generic logging store (with mime-type or something).
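
As a rough illustration of the gzip idea, here is a minimal sketch in plain Python. It is not Buildbot code: `store_log` and `serve_log` are hypothetical names, and the `Accept-Encoding` handling is deliberately simplified (it ignores q-values). The stored bytes are passed through untouched when the client accepts gzip, and decompressed otherwise.

```python
import gzip


def store_log(html_text):
    """Compress an HTML log once, at the time it is stored."""
    return gzip.compress(html_text.encode("utf-8"))


def serve_log(stored, accept_encoding):
    """Return (body, headers) for a stored log.

    Hypothetical helper: sends the stored bytes as-is when the client
    lists gzip in Accept-Encoding, and decompresses them otherwise.
    """
    headers = {"Content-Type": "text/html; charset=utf-8"}
    # Simplified parsing: strip any ";q=..." parameters and compare names.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in accepted:
        headers["Content-Encoding"] = "gzip"
        return stored, headers
    return gzip.decompress(stored), headers
```

Repetitive build output usually compresses well, but the actual saving depends on the log contents.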

Change History (16)

comment:1 Changed 8 years ago by marcusl

Also, the build pickle seems to contain all the logs as well. The 555 file is about the size of the sum of all the logs below, and other builds show similar results:

-rw------- 1 root root 10524631 Mar 11 12:57 555
-rw------- 1 root root  1054001 Mar 11 12:26 555-log-build-buildlog
-rw------- 1 root root     9897 Mar 11 12:26 555-log-build-stdio.bz2
-rw------- 1 root root      891 Mar 11 12:26 555-log-build-warnings
-rw------- 1 root root   158168 Mar 11 12:07 555-log-clean-buildlog
-rw------- 1 root root     1915 Mar 11 12:07 555-log-clean-stdio.bz2
-rw------- 1 root root     1797 Mar 11 12:01 555-log-hg-stdio.bz2
-rw------- 1 root root  6237923 Mar 11 12:57 555-log-install-buildlog
-rw------- 1 root root     5587 Mar 11 12:57 555-log-install-stdio.bz2
-rw------- 1 root root       58 Mar 11 12:06 555-log-setup-property_changes
-rw------- 1 root root     5763 Mar 11 12:06 555-log-setup-stdio.bz2
-rw------- 1 root root  3080263 Mar 11 12:35 555-log-test-buildlog
-rw------- 1 root root     7500 Mar 11 12:35 555-log-test-stdio.bz2
-rw------- 1 root root      219 Mar 11 12:35 555-log-test-warnings
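
The sizes in the listing can be checked with a few lines of Python. This is only arithmetic on the numbers shown above. Whether the `-buildlog` files are the HTML logs in question is an assumption, since the listing does not say so explicitly.

```python
# Sizes in bytes, copied from the directory listing for build 555.
pickle_size = 10524631
log_sizes = {
    "555-log-build-buildlog": 1054001,
    "555-log-build-stdio.bz2": 9897,
    "555-log-build-warnings": 891,
    "555-log-clean-buildlog": 158168,
    "555-log-clean-stdio.bz2": 1915,
    "555-log-hg-stdio.bz2": 1797,
    "555-log-install-buildlog": 6237923,
    "555-log-install-stdio.bz2": 5587,
    "555-log-setup-property_changes": 58,
    "555-log-setup-stdio.bz2": 5763,
    "555-log-test-buildlog": 3080263,
    "555-log-test-stdio.bz2": 7500,
    "555-log-test-warnings": 219,
}

# The uncompressed "-buildlog" files dominate the total.
buildlog_total = sum(
    size for name, size in log_sizes.items() if name.endswith("-buildlog")
)
all_logs_total = sum(log_sizes.values())

print("pickle:         %d" % pickle_size)
print("buildlog files: %d" % buildlog_total)
print("all log files:  %d" % all_logs_total)
```

The pickle and the combined `-buildlog` files differ by only a few kilobytes, which is consistent with the pickle carrying a copy of that log content.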

I'll try to get some time to test the 0.8 beta and see if it helps, but some initial advice would be nice.

comment:2 Changed 8 years ago by dustin

  • Milestone changed from undecided to 0.8.1

If you can decode one of those pickles and find where the logfile is stuck, that should be easy to fix. I'll be happy to backport that to the 0.8.0 release branch.

Compressing them is harder, but a good project.
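
One way to find where the log data is stuck is to walk the unpickled object and report every large string. The sketch below is a generic helper, not part of Buildbot. `find_large_strings` is a hypothetical name, and the demo uses `types.SimpleNamespace` objects as stand-ins for the real build status classes.

```python
import pickle
from types import SimpleNamespace


def find_large_strings(obj, threshold=1024, path="o", seen=None):
    """Yield (path, length) for each str/bytes of at least `threshold` length."""
    if seen is None:
        seen = set()
    if isinstance(obj, (str, bytes)):
        if len(obj) >= threshold:
            yield path, len(obj)
        return
    if id(obj) in seen:  # guard against reference cycles
        return
    seen.add(id(obj))
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_large_strings(
                value, threshold, "%s[%r]" % (path, key), seen)
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_large_strings(
                value, threshold, "%s[%d]" % (path, index), seen)
    elif hasattr(obj, "__dict__"):
        for name, value in vars(obj).items():
            yield from find_large_strings(
                value, threshold, "%s.%s" % (path, name), seen)


# Stand-in build object (assumption: real pickles hold Buildbot classes).
build = SimpleNamespace(steps=[
    SimpleNamespace(logs=[
        SimpleNamespace(filename="3-log-step-err.text"),
        SimpleNamespace(filename="3-log-step-err.html",
                        html="<div>" + "x" * 5000 + "</div>"),
    ]),
])

o = pickle.loads(pickle.dumps(build))
hits = list(find_large_strings(o))
for hit_path, length in hits:
    print(hit_path, length)
```

Against a real build pickle, the same walk would be run on the result of `pickle.load`, with the Buildbot version that wrote the pickle importable.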

comment:3 Changed 8 years ago by dustin

  • Priority changed from major to critical

marcus, are you still having this problem? Can you send me a build pickle?

comment:4 Changed 8 years ago by dustin

  • Milestone changed from 0.8.2 to 0.8.3

comment:5 Changed 8 years ago by marcusl

Uh. Sorry for the quietness. I didn't see your comments.

I'll get back to this once we update our buildbot install to 0.8.x at work (within a few months).

comment:6 Changed 8 years ago by marcusl

Or rather, on Monday, I can try to disassemble the pickle or mail it to you. My vacation for this summer is now over, and I have to work for my salary again. ;)

comment:7 Changed 7 years ago by ayust

  • Milestone changed from 0.8.3 to 0.8.+

comment:8 Changed 6 years ago by dustin

  • Keywords sprint added

This code hasn't changed much in the intervening two years, so this bug is likely still present. Let's try to reproduce, then take apart the resulting pickle and write a fix.

comment:9 Changed 5 years ago by dustin

  • Summary changed from HTML logs do not get compressed to HTML logs are included in pickles

comment:10 Changed 5 years ago by dcoshea

I'm running from git master commit ID 9cd7f9a (around 6 weeks ago). I have a build where the web status shows the following due to some error in my config (a doStepIf lambda that took the wrong number of parameters, I think). I assume this is basically a reproduction of this bug, unless there is some threshold below which the log content is meant to be stored inside the pickle:

[...] Build #3
Steps and Logfiles:
4. <step name> <step name> exception ( 0 secs )

In the builder's directory on the master, the file 3-log-<step name>-err.text exists but there is no corresponding .html file.

>>> o = pickle.load(open("3"))
>>> o.steps[3].logs
[<buildbot.status.logfile.LogFile instance at 0x22626c8>, <buildbot.status.logfile.HTMLLogFile instance at 0x2262710>]
>>> o.steps[3].logs[1].filename   
'3-log-<step name>-err.html'

No such file exists, though. There seem to be two ways to get the content of the log from the object:

>>> o.steps[3].logs[1].html
'<div>\n  <style type="text/css">\n[...]\n</div>'
>>> o.steps[3].logs[1].getText()
'<div>\n  <style type="text/css">\n[...]\n</div>'

For the err.text log, though, I can't get the content from the pickle, which I assume is to be expected since it isn't stored in there:

>>> o.steps[3].logs[0].getText()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[...]/buildbot/master/buildbot/status/", line 321, in getText
    return "".join(self.getChunks([STDOUT, STDERR], onlyText=True))
  File "[...]/buildbot/master/buildbot/status/", line 339, in getChunks
    f = self.getFile()
  File "[...]/buildbot/master/buildbot/status/", line 310, in getFile
    return BZ2File(self.getFilename() + ".bz2", "r")
  File "[...]/buildbot/master/buildbot/status/", line 241, in getFilename
    return os.path.join(, self.filename)
AttributeError: LogFile instance has no attribute 'step'

comment:11 Changed 5 years ago by dustin

  • Description modified

That looks like an accurate reproduction of the bug! I'm surprised that the HTML file doesn't exist, though - that makes it a little more complex to fix. The patch would need to look in both locations for the HTML data, write the logfile out if it didn't already exist, and remove the HTML data from the stream written to the pickle.

I'm in the process of removing this code for 'nine' anyway, so personally I don't feel motivated to work on a fix, but certainly others are welcome to do so!
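
For anyone picking this up, the three steps described above (look in both locations, write the file if it is missing, keep the HTML out of the pickled state) could be sketched roughly as follows. This is a hypothetical stand-in class, not Buildbot's `HTMLLogFile`. The names `SketchHTMLLog`, `basedir` and `get_html` are assumptions made for illustration.

```python
import os
import tempfile


class SketchHTMLLog:
    """Hypothetical stand-in for an HTML log; not Buildbot's HTMLLogFile."""

    def __init__(self, basedir, filename, html):
        self.basedir = basedir
        self.filename = filename
        self.html = html

    def _path(self):
        return os.path.join(self.basedir, self.filename)

    def get_html(self):
        # Look in both locations: the in-memory attribute (as found in
        # old pickles) first, then the file on disk.
        html = self.__dict__.get("html")
        if html is not None:
            return html
        with open(self._path(), encoding="utf-8") as f:
            return f.read()

    def __getstate__(self):
        # Called by pickle: drop the HTML from the pickled state and make
        # sure it exists on disk instead.
        state = self.__dict__.copy()
        html = state.pop("html", None)
        if html is not None and not os.path.exists(self._path()):
            with open(self._path(), "w", encoding="utf-8") as f:
                f.write(html)
        return state


def roundtrip_demo():
    """Return (pickled state keys, HTML read back without the attribute)."""
    with tempfile.TemporaryDirectory() as basedir:
        log = SketchHTMLLog(basedir, "3-log-step-err.html", "<div>boom</div>")
        state = log.__getstate__()
        # Rebuild an instance from the reduced state, as unpickling would.
        restored = SketchHTMLLog.__new__(SketchHTMLLog)
        restored.__dict__.update(state)
        return sorted(state), restored.get_html()
```

Doing the write inside `__getstate__` keeps the change local to the log class, at the cost of file I/O during pickling. A real patch might prefer to write the file when the log is added.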

comment:12 Changed 4 years ago by jpommerening

Hey! I'm working on a fix:

The main problem with HTML logs (in my opinion) is that they cannot (yet) be streamed from the slaves. So if you have a 5 MB log, you first have to collect the whole contents into an enormous string and pass it to BuildStepStatus.addHTMLLog. The fact that it then stays in memory for as long as the whole BuildStatus instance lives only makes matters worse.

OTOH, for small stack traces like the err.html, keeping them in memory may be the way to go…

Another thing I'm not really comfortable with is the way the HTMLLogFile class serves as a kind of hint for the web status to handle it differently. Maybe a LogFile should know its content type…?
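
To make the content-type idea concrete, here is a minimal sketch under the assumption that a log simply carries a MIME type string. `SketchLog` and `render_for_web` are hypothetical names and do not correspond to existing Buildbot classes.

```python
import html


class SketchLog:
    """Hypothetical log record that knows its own content type."""

    def __init__(self, name, content_type, text):
        self.name = name
        self.content_type = content_type
        self.text = text


def render_for_web(log):
    """Pick the rendering from the log's content type, not its class."""
    if log.content_type == "text/html":
        # HTML logs are passed through unchanged.
        return log.text
    # Anything else is treated as plain text and escaped.
    return "<pre>%s</pre>" % html.escape(log.text)
```

With this shape the web status needs no subclass check, and other content types could be added by extending the dispatch.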

Last edited 4 years ago by jpommerening

comment:13 Changed 4 years ago by dustin

The issue of treating HTML as a single string is already fixed in nine, where HTML logs are treated just like any other log -- streamed in chunks.

Fixing the problem of storing the HTML in a pickle is a good improvement to the 0.8.x series, which I'm sure people will still be running for a good while after nine is released, so this is definitely worthwhile.

That said, I really hate to see you mucking about in the awfulness that is status plugins, pickles, LogFile and HTMLLogFile, and so on. They're all going to die in a fire in the nine branch, and be replaced with much less awful things. You're correctly identifying lots of ways the old stuff is awful.

Come to the nine side! We have cookies!

comment:14 Changed 4 years ago by jpommerening

Haha, I will! Sooner rather than later.

I admit that I wasn't really aware of the sheer amount of work and thought that already went into nine. The developer docs are a great read and the new APIs look wonderful!

I'll put just a little more effort into giving 0.8.x a graceful way to die (and really, I just want to see these "critical defect" tickets disappear). But you're right – I should have a closer look at nine first, so I don't try to reinvent its concepts by myself :)

comment:15 Changed 4 years ago by dustin

Awesome, thanks - I agree with your priorities!

comment:16 Changed 4 years ago by dustin

  • Milestone changed from 0.8.+ to 0.9.+

Ticket retargeted after milestone closed
