Opened 6 years ago

Last modified 6 years ago

#3070 new support-request

buildbot try hangs under Windows

Reported by: VZ
Owned by:
Priority: minor
Milestone: ongoing
Version: 0.8.9
Keywords:
Cc:

Description

I have buildbot 0.8.9-2 from Debian testing running under Linux and am trying to submit a patch for testing to it using buildbot try from Windows 7. Initially I used Cygwin Python and installed buildbot using pip there, but this didn't work at all:

# the options are specified in a config file and are presumably good
# as they work when using buildbot try under Linux
buildbot --verbose try
2014-11-25 23:47:11+0100 [-] Log opened.
2014-11-25 23:47:11+0100 [-] using 'ssh' connect method
... nothing any more ...

Even with --dryrun, nothing happens. A bit of debugging shows that it runs git branch but there is nothing after this, i.e. no file/network activity at all.

Next I tried with the native Python 2.7.8 (32-bit version), installing buildbot with easy_install. Things were slightly better: with --dryrun everything seems to work as expected, i.e. I see the "job created" message, followed by the job description including the diff, and "job has been delivered" at the end. Without --dryrun, however, it hangs again. This time I see that it runs ssh and, if I hack tryclient.py to use the ssh -v command, I can even see that it connects to the master successfully:

2014-11-26 00:14:24+0100 [-] Log opened.
2014-11-26 00:14:24+0100 [-] using 'ssh' connect method
2014-11-26 00:14:25+0100 [-] job created
...ssh output snipped ...
2014-11-26 00:14:25+0100 [-] debug1: Sending command: buildbot tryserver --jobdir /var/lib/buildbot/master/jobdir

but nothing happens afterwards: there is no network traffic, and the Python process just loops inside select, as far as I can see under the debugger.

Finally, I tried the sources from git, but they exhibit the same behaviour.

I have no idea what to do next. I tried following the code but got completely lost in the twisted callback logic (I love irony as much as anybody else, but I do have to wonder if it's really such a great idea for a framework...). I would be very grateful for any pointers, TIA!

Change History (10)

comment:1 Changed 6 years ago by dustin

  • Milestone changed from undecided to ongoing
  • Type changed from defect to support-request

What's running on the far side of the SSH connection?

comment:2 Changed 6 years ago by VZ

If I use netstat -p I see the expected output, i.e. something like /usr/bin/python /usr/bin/buildbot tryserver --jobdir /var/lib/buildbot/master/jobdir.

comment:3 Changed 6 years ago by dustin

And what is *that* process doing?

comment:4 Changed 6 years ago by VZ

Running strace on it (note that I omit the two parent ssh processes) shows that it just blocks reading stdin:

# strace -p 22692
Process 22692 attached - interrupt to quit
read(0,

comment:5 Changed 6 years ago by dustin

Hmm. I wonder if there's an un-flushed buffer somewhere. I'm not sure how to go about debugging that on the Windows side..
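For illustration only, here is a minimal sketch of how a process that reads stdin until EOF behaves, which is one way the blocked read(0, ...) above could arise. This is not buildbot's code: the child one-liner merely stands in for the remote tryserver, and send_job and CHILD_CODE are hypothetical names. The reader only returns once the writer has flushed its data and closed its end of the pipe.

```python
import subprocess
import sys

# Stand-in for the remote process: read stdin until EOF, then report
# how many characters arrived. (Hypothetical, not buildbot's tryserver.)
CHILD_CODE = "import sys; data = sys.stdin.read(); sys.stdout.write(str(len(data)))"


def send_job(job_text):
    """Write job_text to a child's stdin and return what the child prints."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    child.stdin.write(job_text.encode("utf-8"))
    # Push any buffered bytes into the pipe.
    child.stdin.flush()
    # Without this close() the child never sees EOF and stays in read().
    child.stdin.close()
    output = child.stdout.read().decode("ascii")
    child.stdout.close()
    child.wait()
    return output


if __name__ == "__main__":
    print(send_job("hypothetical job payload"))
```

If the close() call is removed, the child stays blocked in read() in the same way the strace output shows. Whether that is what actually happens with Cygwin ssh is not established in this ticket.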

comment:6 Changed 6 years ago by VZ

Could you please tell me where in the code this buffer is supposed to be written?

comment:7 Changed 6 years ago by VZ

  • Priority changed from major to minor

Good news: your reply gave me the idea to try plink (the PuTTY command-line ssh client) instead of Cygwin ssh, and it did indeed work. I'll submit a patch adding a try_ssh_command option to make it possible to use this without changes to the buildbot source code.
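As a rough sketch of what such an option could look like (this is not the submitted patch; the function name build_try_argv and all of its parameters are hypothetical), the client could build the command line from a configurable ssh program instead of a hard-coded "ssh". The remote command mirrors the one visible in the ssh -v output above, and the -l flag is assumed to be accepted by the chosen client.

```python
def build_try_argv(ssh_command, user, host, jobdir, buildbot_bin="buildbot"):
    """Return the argv used to start the remote tryserver.

    ssh_command is either a program name/path (e.g. "plink") or a list
    that also carries extra flags. All names here are hypothetical.
    """
    if isinstance(ssh_command, str):
        prefix = [ssh_command]
    else:
        prefix = list(ssh_command)
    # "-l user" is assumed to be understood by the selected ssh client.
    return prefix + ["-l", user, host, buildbot_bin, "tryserver", "--jobdir", jobdir]


if __name__ == "__main__":
    # Host and user are placeholders, not values from this ticket.
    print(build_try_argv("plink", "someuser", "master.example.org",
                         "/var/lib/buildbot/master/jobdir"))
```

Accepting a list as well as a single name would allow extra client flags to be passed through without further changes.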

But it would still be nice if it could work with Cygwin ssh too (which is the default), and I still have no idea why it doesn't...

comment:8 Changed 6 years ago by dustin

That pretty much sums up every experience I've ever had with Cygwin. It works about 60% of the time. And the other 40% it just doesn't.

Patch sounds great :)

comment:9 Changed 6 years ago by VZ

Submitted PR.

I use Cygwin daily (and could hardly use Windows without it) and usually don't have any problems with it, so I still think it would be great to fix this. It still seems more likely to be a problem with buildbot (or twisted) than with Cygwin itself. But as long as I have a workaround, it's not critical for me personally.

BTW, ssh doesn't account for the behaviour of the Cygwin version itself at all...

comment:10 Changed 6 years ago by Ben

Back when I had a lot to do with Windows, I switched from Cygwin to MinGW because of this kind of trouble ...
