Opened 10 years ago

Last modified 3 years ago

#371 new enhancement

ShellCommand argument "logfiles" does not properly work for HTML log files

Reported by: cli
Owned by:
Priority: minor
Milestone: 0.9.+
Version:
Keywords: web
Cc:

Description (last modified by dustin)

The logfiles argument can be used to simply transmit logfiles from a slave to the master. This works well for text/plain files.

But if we want to view a text/html file, the HTML source is escaped as text and is therefore useless to a viewer. Perhaps a simple file type detection could be added (for example, based on the file extension). I think someone has already done a lot of work on logfile handling (HTMLLog and HTMLLogFile are already there), but the final glue needed to get it working is missing.
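As a rough illustration of the extension-based detection suggested above, here is a minimal sketch using only the Python standard library. The function name guess_log_content_type is hypothetical and is not part of Buildbot; it only shows the idea of treating HTML files specially and falling back to text/plain for everything else.

```python
import mimetypes


def guess_log_content_type(filename, default="text/plain"):
    """Guess a log's content type from its file extension alone.

    Hypothetical helper, not a Buildbot API. Only HTML is special-cased;
    every other extension falls back to the default type.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    if content_type == "text/html":
        return content_type
    return default
```

Detection by extension is cheap but fragile: a file with HTML content and an unrelated extension would still be treated as plain text.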

Change History (15)

comment:1 Changed 9 years ago by dustin

  • Milestone changed from undecided to 0.7.+

comment:2 Changed 8 years ago by willw

I'm very interested in getting this working, as our testing outputs HTML files over 2MB in size. The issue with BuildStep.addHTMLLog() is that it requires you to pass all the log data in memory. The other big issue is that the waterfall seems to forget the HTML log data occasionally.

What would be the easiest approach I could take to simply stream the generated HTML log file using the existing system, but avoid escaping the data?

comment:3 Changed 8 years ago by dustin

Streaming isn't simple :)

That would have to be a custom step, with custom code on the slave, too.

comment:4 Changed 8 years ago by dustin

  • Keywords logfile html removed

comment:5 Changed 7 years ago by lantictac

Here's the hack we're using locally. Be warned: it's ugly as hell and not ready for prime time, but it does the job for us. Far preferable would be a MIME type associated with the log file.

In logs.py...

# /builders/$builder/builds/$buildnum/steps/$stepname/logs/$logname
class TextLog(Resource):
    # a new instance of this Resource is created for each client who views
    # it, so we can afford to track the request in the Resource.
    implements(IHTMLLog)

    asText = False
    asHTML = False
    subscribed = False

    def __init__(self, original):
        Resource.__init__(self)
        self.original = original

        # WPW: Awful, awful hack to detect HTML contents.
        # original is a status.builder.LogFile instance.
        if original.getText().startswith('<html'):
            self.asHTML = True

    def getChild(self, path, req):
        if path == "text":
            self.asText = True
            return self
        elif path == "html":
            self.asHTML = True
            return self
        # TextLog derives from Resource, so delegate to Resource here
        return Resource.getChild(self, path, req)

    def content(self, entries):
        html_entries = []
        text_data = ''
        for type, entry in entries:
            if type >= len(builder.ChunkTypes) or type < 0:
                # non-std channel, don't display
                continue
            
            is_header = type == builder.HEADER

            if not self.asText and not self.asHTML:

                # jinja only works with unicode, or pure ascii, so assume utf-8 in logs
                if not isinstance(entry, unicode):
                    entry = unicode(entry, 'utf-8', 'replace')
                html_entries.append(dict(type = builder.ChunkTypes[type], 
                                         text = entry,
                                         is_header = is_header))
            elif not is_header:
                text_data += entry

        if self.asText or self.asHTML:
            return text_data
        else:
            return self.template.module.chunks(html_entries)

    def render_HEAD(self, req):
        self._setContentType(req)

        # vague approximation, ignores markup
        req.setHeader("content-length", self.original.length)
        return ''

    def render_GET(self, req):
        self._setContentType(req)
        self.req = req

        if not self.asText and not self.asHTML:
            self.template = req.site.buildbot_service.templates.get_template("logs.html")                
            
            data = self.template.module.page_header(
                    title = "Log File contents",
                    texturl = req.childLink("text"),
                    path_to_root = path_to_root(req))
            data = data.encode('utf-8')                   
            req.write(data)

        self.original.subscribeConsumer(ChunkConsumer(req, self))
        return server.NOT_DONE_YET

    def _setContentType(self, req):
        if self.asText:
            req.setHeader("content-type", "text/plain; charset=utf-8")
        else:
            req.setHeader("content-type", "text/html; charset=utf-8")
        
    def finished(self):
        if not self.req:
            return
        try:
            if not self.asText and not self.asHTML:

                data = self.template.module.page_footer()
                data = data.encode('utf-8')
                self.req.write(data)
            self.req.finish()
        except pb.DeadReferenceError:
            pass
        # break the cycle, the Request's .notifications list includes the
        # Deferred (from req.notifyFinish) that's pointing at us.
        self.req = None
        
        # release template
        self.template = None

components.registerAdapter(TextLog, interfaces.IStatusLog, IHTMLLog)

comment:6 Changed 7 years ago by dustin

Indeed, that's a pretty awful hack. There's a good bit of room for extra configuration options when specifying logfiles, so perhaps the format of the logfile could be specified there? MIME type is probably a good way to specify it.
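To make the idea concrete, here is a minimal sketch of how a per-logfile MIME type could be carried in the logfiles argument. The existing dictionary form with "filename" and "follow" keys is kept; the "mimetype" key and the normalize_logfiles helper are assumptions for illustration only and do not exist in Buildbot.

```python
def normalize_logfiles(logfiles):
    """Expand each logfiles entry to a dict with explicit defaults.

    Hypothetical helper: the "mimetype" key is an assumed extension,
    not an option Buildbot currently understands.
    """
    normalized = {}
    for logname, spec in logfiles.items():
        # A bare string is shorthand for {"filename": <string>}
        if isinstance(spec, str):
            spec = {"filename": spec}
        normalized[logname] = {
            "filename": spec["filename"],
            "follow": spec.get("follow", False),
            "mimetype": spec.get("mimetype", "text/plain"),
        }
    return normalized


# Example configuration: one plain log, one log declared as HTML
example = normalize_logfiles({
    "testlog": "build/test.log",
    "report": {"filename": "build/report.html", "mimetype": "text/html"},
})
```

Declaring the type statically in the step avoids any guessing on the master and keeps plain text as the default for existing configurations.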

comment:7 Changed 6 years ago by dustin

  • Keywords web added

comment:8 Changed 6 years ago by materialdreams

We would also be very interested in this feature, and at least for us it would be feasible to simply specify the log file type statically in the build step.

comment:9 Changed 5 years ago by dustin

  • Milestone changed from 0.8.+ to 0.9.0

The log handling in 'nine' will handle HTML logs in chunks, just like text logs. So the buildmaster will not need to load the entire multi-GB log into memory at one time. It will still get loaded into the browser, of course, and I'm not entirely sure how to handle chunks of HTML in JS.
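For illustration only, chunked handling of a large log can be sketched with a plain generator, so that no more than one chunk is held in memory at a time. The name iter_chunks and the 64 KiB default are assumptions and do not reflect the actual 'nine' implementation.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive chunks from a binary stream until it is exhausted.

    Hypothetical helper, not Buildbot code: only one chunk is in memory
    at a time, regardless of the total size of the log.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Example: split a small in-memory HTML log into 4-byte chunks
demo_chunks = list(iter_chunks(io.BytesIO(b"<html>" + b"x" * 10), chunk_size=4))
```

Note that chunk boundaries fall at arbitrary byte offsets, so a chunk may end in the middle of a tag. This is the difficulty with handling chunks of HTML on the browser side.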

comment:10 follow-up: ↓ 14 Changed 4 years ago by dustin

  • Description modified (diff)
  • Version 0.7.9 deleted

This is possible now - this would just be a logfile of type 'h'.
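As a sketch of how a declared content type could be mapped to such a type code: the 'h' code is the one named above, while the 't' code for plain text and the helper name log_type_for are assumptions made for this example.

```python
LOG_TYPE_HTML = "h"  # type code named in the comment above
LOG_TYPE_TEXT = "t"  # assumed code for plain text logs


def log_type_for(content_type):
    """Map a MIME content type to a single-letter log type code.

    Hypothetical helper: parameters such as "; charset=utf-8" are ignored
    and anything other than text/html is treated as plain text.
    """
    base = content_type.split(";", 1)[0].strip().lower()
    return LOG_TYPE_HTML if base == "text/html" else LOG_TYPE_TEXT
```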

comment:11 Changed 4 years ago by tardyp

Looking at the code, the whole logfiles functionality is probably broken.

                # tell the BuildStepStatus to add a LogFile
                newlog = self.addLog(logname)
                # and tell the RemoteCommand to feed it
                cmd.useLog(newlog, True)

This is old style addLog, without deferred. I don't think cmd.useLog will manage asynclog proxies.

comment:12 Changed 4 years ago by dustin

The code you snipped is in LoggingBuildStep, which is always old-style, so it doesn't need to handle a Deferred. And while useLog will accept whatever you give it, the command will unwrap any SyncLogFileWrapper instances that it receives before actually using them. So I think that this is quite possible, even for old-style steps, although I'd have no problem seeing it only implemented for new-style steps.

comment:13 Changed 3 years ago by dustin

  • Milestone changed from 0.9.0 to 0.9.+
  • Priority changed from major to minor

comment:14 in reply to: ↑ 10 Changed 3 years ago by materialdreams

Replying to dustin:

This is possible now - this would just be a logfile of type 'h'.

Could you please elaborate on what you mean? Is it already possible to log an HTML file?

comment:15 Changed 3 years ago by gtmacdonald

I would also like to know if this is possible now.
