Opened 3 years ago

Last modified 3 years ago

#3048 new defect

ConnectionError when the Docker host is not available

Reported by: Ben Owned by:
Priority: major Milestone: 0.8.x
Version: master Keywords: docker, slave, latent
Cc:

Description

2014-11-18 12:34:20+0100 [-] starting build <Build Trunk_ospd> using slave <LatentSlaveBuilder builder=u'Trunk_ospd' slave=u'ospd-python-slave'>
2014-11-18 12:34:20+0100 [-] acquireLocks(slave <DockerLatentBuildSlave u'ospd-python-slave'>, locks [])
2014-11-18 12:34:20+0100 [-] substantiating slave <LatentSlaveBuilder builder=u'Trunk_ospd' slave=u'ospd-python-slave'>
2014-11-18 12:34:20+0100 [-] Buildslave ospd-python-slave detached from Trunk_ospd
2014-11-18 12:34:20+0100 [-] while preparing slavebuilder:
        Traceback (most recent call last):
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1155, in gotResult
            _inlineCallbacks(r, g, deferred)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
            result = result.throwExceptionIntoGenerator(g)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
            return g.throw(self.type, self.value, self.tb)
        --- <exception caught here> ---
          File "/home/benoit/buildbot/buildbot/master/buildbot/process/builder.py", line 349, in _startBuildFor
            ready = yield slavebuilder.prepare(self.builder_status, build)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
            result = context.call(ctx, function, *args, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
            return self.currentContext().callWithContext(ctx, func, *args, **kw)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
            return func(*args,**kw)
          File "dockerslave.py", line 74, in _start_instance
            found = self._image_exists(docker_client)
          File "dockerslave.py", line 65, in _image_exists
            for container in client.images():
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/docker/client.py", line 634, in images
            res = self._result(self._get(self._url("/images/json"), params=params),
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/docker/client.py", line 76, in _get
            return self.get(url, **self._set_request_timeout(kwargs))
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 469, in get
            return self.request('GET', url, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 457, in request
            resp = self.send(prep, **send_kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 569, in send
            r = adapter.send(request, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 407, in send
            raise ConnectionError(err, request=request)
        requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))

2014-11-18 12:34:20+0100 [-] slave <Build Trunk_ospd> can't build <LatentSlaveBuilder builder=u'Trunk_ospd'> after all; re-queueing the request
2014-11-18 12:34:20+0100 [-] releaseLocks(<DockerLatentBuildSlave u'ospd-python-slave'>): []
2014-11-18 12:34:20+0100 [-] while substantiating
        Traceback (most recent call last):
          File "/usr/lib/python2.7/threading.py", line 525, in __bootstrap
            self.__bootstrap_inner()
          File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
            self.run()
          File "/usr/lib/python2.7/threading.py", line 505, in run
            self.__target(*self.__args, **self.__kwargs)
        --- <exception caught here> ---
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
            result = context.call(ctx, function, *args, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
            return self.currentContext().callWithContext(ctx, func, *args, **kw)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
            return func(*args,**kw)
          File "dockerslave.py", line 74, in _start_instance
            found = self._image_exists(docker_client)
          File "dockerslave.py", line 65, in _image_exists
            for container in client.images():
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/docker/client.py", line 634, in images
            res = self._result(self._get(self._url("/images/json"), params=params),
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/docker/client.py", line 76, in _get
            return self.get(url, **self._set_request_timeout(kwargs))
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 469, in get
            return self.request('GET', url, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 457, in request
            resp = self.send(prep, **send_kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 569, in send
            r = adapter.send(request, **kwargs)
          File "/home/benoit/buildbot/openvas/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 407, in send
            raise ConnectionError(err, request=request)
        requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))

We should at least catch the exception.

Change History (9)

comment:1 Changed 3 years ago by Ben

I tried catching the request.session.ConnectionError, and raising a interfaces.LatentBuildSlaveFailedToSubstantiate instead ...

It doesn't get better, the traceback also get reported.

@djmitche, any idea how to indicate to the master that the slave failed to substantiate ?

comment:2 Changed 3 years ago by sa2ajj

raise interfaces.LatentBuildSlaveFailedToSubstantiate?

comment:3 Changed 3 years ago by sa2ajj

I should have read the first part :(

Maybe, you are raising it at the wrong moment?...

comment:4 Changed 3 years ago by Ben

Copying the (new) traceback here to prevent ambiguity:

2014-11-18 21:46:12+0100 [-] starting build <Build GH-1288> using slave <LatentSlaveBuilder builder=u'GH-1288' slave=u'docker'>
2014-11-18 21:46:12+0100 [-] acquireLocks(slave <DockerLatentBuildSlave u'docker'>, locks [])
2014-11-18 21:46:12+0100 [-] substantiating slave <LatentSlaveBuilder builder=u'GH-1288' slave=u'docker'>
2014-11-18 21:46:12+0100 [-] Unable to connect to Docker: ('Connection aborted.', error(2, 'No such file or directory'))
2014-11-18 21:46:12+0100 [-] Buildslave docker detached from GH-1288
2014-11-18 21:46:12+0100 [-] while preparing slavebuilder:
	Traceback (most recent call last):
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
	    current.result = callback(current.result, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1155, in gotResult
	    _inlineCallbacks(r, g, deferred)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
	    result = result.throwExceptionIntoGenerator(g)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
	    return g.throw(self.type, self.value, self.tb)
	--- <exception caught here> ---
	  File "/Users/benoit/Python/buildbot/master/buildbot/process/builder.py", line 349, in _startBuildFor
	    ready = yield slavebuilder.prepare(self.builder_status, build)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
	    result = context.call(ctx, function, *args, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
	    return func(*args,**kw)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 88, in _thd_start_instance
	    'Unable to connect to Docker'
	buildbot.interfaces.LatentBuildSlaveFailedToSubstantiate: Unable to connect to Docker
	
2014-11-18 21:46:12+0100 [-] slave <Build GH-1288> can't build <LatentSlaveBuilder builder=u'GH-1288'> after all; re-queueing the request
2014-11-18 21:46:12+0100 [-] releaseLocks(<DockerLatentBuildSlave u'docker'>): []
2014-11-18 21:46:12+0100 [-] while substantiating
	Traceback (most recent call last):
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 783, in __bootstrap
	    self.__bootstrap_inner()
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
	    self.run()
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
	    self.__target(*self.__args, **self.__kwargs)
	--- <exception caught here> ---
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
	    result = context.call(ctx, function, *args, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
	    return func(*args,**kw)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 88, in _thd_start_instance
	    'Unable to connect to Docker'
	buildbot.interfaces.LatentBuildSlaveFailedToSubstantiate: Unable to connect to Docker
	

comment:5 Changed 3 years ago by sa2ajj

I think the traceback is a bit confusing: the while substantiating part actually took place earlier than the while preparing one, but it was reported in a wrong order.

(This does not answer the question and does not shed the light.)

Last edited 3 years ago by sa2ajj (previous) (diff)

comment:6 Changed 3 years ago by Ben

While we're at it, just got that traceback (probably because the server require TLS):

2014-11-18 23:06:56+0100 [-] starting build <Build GH-1288> using slave <LatentSlaveBuilder builder=u'GH-1288' slave=u'docker'>
2014-11-18 23:06:56+0100 [-] acquireLocks(slave <DockerLatentBuildSlave u'docker'>, locks [])
2014-11-18 23:06:56+0100 [-] substantiating slave <LatentSlaveBuilder builder=u'GH-1288' slave=u'docker'>
2014-11-18 23:06:56+0100 [-] Buildslave docker detached from GH-1288
2014-11-18 23:06:56+0100 [-] while preparing slavebuilder:
	Traceback (most recent call last):
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
	    current.result = callback(current.result, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1155, in gotResult
	    _inlineCallbacks(r, g, deferred)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
	    result = result.throwExceptionIntoGenerator(g)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
	    return g.throw(self.type, self.value, self.tb)
	--- <exception caught here> ---
	  File "/Users/benoit/Python/buildbot/master/buildbot/process/builder.py", line 349, in _startBuildFor
	    ready = yield slavebuilder.prepare(self.builder_status, build)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
	    result = context.call(ctx, function, *args, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
	    return func(*args,**kw)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 101, in _thd_start_instance
	    found = self._image_exists(docker_client)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 87, in _image_exists
	    for image in client.images():
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 635, in images
	    True)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 96, in _result
	    return response.json()
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/models.py", line 793, in json
	    return json.loads(self.text, **kwargs)
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
	    return _default_decoder.decode(s)
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
	    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 383, in raw_decode
	    raise ValueError("No JSON object could be decoded")
	exceptions.ValueError: No JSON object could be decoded
	
2014-11-18 23:06:56+0100 [-] slave <Build GH-1288> can't build <LatentSlaveBuilder builder=u'GH-1288'> after all; re-queueing the request
2014-11-18 23:06:56+0100 [-] releaseLocks(<DockerLatentBuildSlave u'docker'>): []
2014-11-18 23:06:56+0100 [-] while substantiating
	Traceback (most recent call last):
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 783, in __bootstrap
	    self.__bootstrap_inner()
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
	    self.run()
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
	    self.__target(*self.__args, **self.__kwargs)
	--- <exception caught here> ---
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
	    result = context.call(ctx, function, *args, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
	    return func(*args,**kw)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 101, in _thd_start_instance
	    found = self._image_exists(docker_client)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 87, in _image_exists
	    for image in client.images():
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 635, in images
	    True)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 96, in _result
	    return response.json()
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/models.py", line 793, in json
	    return json.loads(self.text, **kwargs)
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
	    return _default_decoder.decode(s)
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
	    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
	  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 383, in raw_decode
	    raise ValueError("No JSON object could be decoded")
	exceptions.ValueError: No JSON object could be decoded
	

comment:7 Changed 3 years ago by Ben

As a final note for today, when the stop_instance goes wrong, the consequences are pretty bad !

2014-11-18 23:22:27+0100 [-] Unhandled error in Deferred:
2014-11-18 23:22:27+0100 [-] Unhandled Error
	Traceback (most recent call last):
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 423, in errback
	    self._startRunCallbacks(fail)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
	    self._runCallbacks()
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
	    current.result = callback(current.result, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1155, in gotResult
	    _inlineCallbacks(r, g, deferred)
	--- <exception caught here> ---
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
	    result = result.throwExceptionIntoGenerator(g)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
	    return g.throw(self.type, self.value, self.tb)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/base.py", line 783, in insubstantiate
	    yield d
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/threadpool.py", line 196, in _worker
	    result = context.call(ctx, function, *args, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
	    return func(*args,**kw)
	  File "/Users/benoit/Python/buildbot/master/buildbot/buildslave/docker.py", line 150, in _thd_stop_instance
	    docker_client.stop(instance['Id'])
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 992, in stop
	    timeout=(timeout + self._timeout))
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/docker/client.py", line 73, in _post
	    return self.post(url, **self._set_request_timeout(kwargs))
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/sessions.py", line 500, in post
	    return self.request('POST', url, data=data, json=json, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/sessions.py", line 457, in request
	    resp = self.send(prep, **send_kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/sessions.py", line 569, in send
	    r = adapter.send(request, **kwargs)
	  File "/Users/benoit/Python/buildbotEnv/lib/python2.7/site-packages/requests/adapters.py", line 420, in send
	    raise SSLError(e, request=request)
	requests.exceptions.SSLError: [Errno 1] _ssl.c:507: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
	

The slave keeps running, but doesn't accept any new build ...

comment:8 Changed 3 years ago by dustin

  • Milestone changed from undecided to 0.9.0

It looks like we're suffering from some ill-defined boundaries here. How can we encapsulate things with some fail-safe mechanisms to ensure that all exceptions are caught and that substantiation either completes or is rolled back (even in the case of master shutdown)?

comment:9 Changed 3 years ago by dustin

  • Milestone changed from 0.9.0 to 0.8.x
Note: See TracTickets for help on using tickets.