Ticket #342 (closed defect: fixed)
BuildBot crashes if a non-ascii is sent with 'force build'
| Reported by: | marcusl | Owned by: | dustin |
|---|---|---|---|
| Priority: | critical | Milestone: | 0.7.11 |
| Version: | 0.7.10 | Keywords: | |
| Cc: | Pike, dustin |
Description
Entering a name with non-ascii chars when doing a 'force build' makes the waterfall view crash horribly with 'exceptions.UnicodeDecodeError?: 'ascii' codec can't decode byte 0xXX in position YYYY: ordinal not in range(128)'.
A fellow named Jonas Byström did this at our place. I had to delete the offending build-pickle files to make waterfall work again.
Attachments
Change History
comment:2 Changed 5 years ago by tjensen
We are also seeing this error with the waterfall whenever someone uses locale specific characters in a commit message. (ie. æøå)
comment:3 Changed 4 years ago by marcusl
- Priority changed from major to critical
That's not good. I'm upping this to critical again.
comment:4 Changed 4 years ago by dustin
- Milestone changed from undecided to 0.7.10
Looks like I need to do a thorough analysis of charset handling. This is always fun in the world of web services :(
comment:7 Changed 4 years ago by dustin
- Status changed from assigned to closed
- Resolution set to worksforme
- Milestone changed from 0.7.10 to undecided
OK, I really can't replicate this. I'm going to close it for the moment, but please re-open with a full traceback if you see it again in 0.7.10.
comment:8 Changed 4 years ago by marcusl
I just tested, and no, it doesn't seem to reproduce on 0.7.9 either.
Good to know if I get more buildbot questions from my old company. :)
Changed 4 years ago by etienne
-
attachment
waterfall_crash.zip
added
Crash of the waterfall display with version 0.7.10p1
comment:9 Changed 4 years ago by etienne
- Status changed from closed to reopened
- Resolution worksforme deleted
I had the same problem with version 0.7.10p1 and I can attach a full traceback, I hope it's not too big.
I have (taken from about) :
- Buildbot 0.7.10p1
- Twisted 8.2.0
- Python 2.4.3
- Buildmaster platform: linux2
comment:10 Changed 4 years ago by dustin
OK, I don't have any clear idea how to fix this, but the problem is that non-ASCII characters are getting into the HTML output (data). When those are combined with the Unicode output from b.td(), then Python tries to upconvert data into Unicode by *decoding* it from the "ascii" encoding. Which doesn't work.
So the problem is actually that invalid ASCII symbols are getting injected on *input* (via the "stop" button, in this case), and getting stored in the pickles. It's always tricky to figure out which encoding input from a browser is using, so that's the part I don't know how to solve. I can try to solve the crashes, at least.
See if this patch helps? http://github.com/djmitche/buildbot/commit/852871732326cf4d62322dd70040dddabd6edf42
comment:11 Changed 4 years ago by dustin
- Cc dustin added
- Version changed from 0.7.7 to 0.7.10
- Milestone changed from undecided to 0.7.11
comment:12 Changed 4 years ago by Pike
At least you found what crashed, that stack left me helpless. Not that I totally understand what's going on so far anyway.
The encoding of the data we receive is the encoding of the webpage we serve, btw, so it's not really all that hard to figure out, misconfigured servers withstanding.
I'd probably enforce utf-8, I would hope that at least the plain buildbot is setup to use that as encoding.
comment:13 Changed 4 years ago by dustin
Yeah, the fix here is clearly to get the encodings right on input and output, and assuming utf-8 on input is probably workable. However the mash-strings-together school of HTML generation makes this kind of conversion *really* difficult.
If someone wants to rejigger the HTML generation to be all unicode, all the time, and fix the places where form input enters the database, I'd be happy to commit it.
comment:14 Changed 4 years ago by marcusl
Vaguely related and to dig up an old corpse, I've seen some people use Twisted and Django together. I hope to get time to do some experiements for one-oh in that area.
We could then at least use Django for HTML generation, if nothing else.
comment:15 follow-up: ↓ 16 Changed 4 years ago by dustin
I committed the partial fix referenced above in [e3abd0739af2408d676817f6d5b091ee5a4d49dd]
comment:16 in reply to: ↑ 15 Changed 4 years ago by catlee
Replying to dustin:
I committed the partial fix referenced above in [e3abd0739af2408d676817f6d5b091ee5a4d49dd]
Please change how this is done. Decoding a unicode string is probably not what you want to do, and decoding an ever growing string is definitely not good. It's O(n2) in the number of grid elements if I'm not mistaken.
Maybe something like: diff --git a/buildbot/status/web/waterfall.py b/buildbot/status/web/waterfall.py index c3bd5c6..0e70f28 100644 --- a/buildbot/status/web/waterfall.py +++ b/buildbot/status/web/waterfall.py @@ -949,12 +949,11 @@ class WaterfallStatusResource?(HtmlResource?):
# third pass: render the HTML table for i in range(gridlen):
data += " <tr>
";
- # convert data to a unicode string, whacking any non-ASCII characters it might contain
- data = data.decode("ascii", "replace")
for strip in grid:
b = strip[i] if b:
- data += b.td()
+ # convert data to a unicode string, whacking any non-ASCII characters it might contain + data += b.td().encode("utf-8", "replace")
else:
if noBubble:
data += td([])
comment:17 Changed 4 years ago by dustin
- Status changed from reopened to closed
- Resolution set to fixed
Chris took care of this in [6b184718ee9576c0577c4fd9fb8dda7a2edf8754]. Thanks!
![[Buildbot Logo]](/chrome/site/header-text-transparent.png)