We also have bug 2919, but it seems to be a bit different.
From Issue6885, we can also see this on vsmserver:
Traceback (most recent call last):
  File "/opt/thinlinc/sbin/vsmserver", line 22, in ?
    VSMServer ( sys . argv )
  File "/opt/thinlinc/modules/thinlinc/vsm/vsmserver.py", line 141, in __init__
    self . loop ( )
  File "/opt/thinlinc/modules/thinlinc/vsm/async.py", line 430, in loop
    o0o0oOOOo0oo = self . run_delayed_calls ( )
  File "/opt/thinlinc/modules/thinlinc/vsm/async.py", line 388, in run_delayed_calls
    I1i1i1 . func ( * I1i1i1 . args , ** I1i1i1 . kw )
  File "/opt/thinlinc/modules/thinlinc/vsm/xmlrpc.py", line 163, in handle_timeout
    self . parent . log . warning ( "Timeout reading socket %s, closing" % str ( self . socket . getpeername ( ) ) )
  File "<string>", line 1, in getpeername
socket.error: (107, 'Transport endpoint is not connected')
Difficult problem that happens seldom. Postpone.
I'm having some trouble reproducing this bug easily. However, it looks like the problem lies in the logging done by handle_timeout() in xmlrpc.py:
    def handle_timeout(self, dolog=True):
        """Handle read or write timeout on socket"""
        if self.writable() and dolog:
            self.parent.log.warning("Timeout writing socket %s, closing" % str(self.socket.getpeername()))
        self.parent.log.warning("Timeout reading socket %s, closing" % str(self.socket.getpeername()))
If the timeout is due to the socket not being connected, socket.getpeername() will fail. I suppose we should either handle this case more elegantly within the method, or not bother with writing the peername in the logging message.
Perhaps we could log socket.getsockname() instead of socket.getpeername()
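For reference, a minimal sketch of the failure mode, written for Python 3 (where socket.error is an alias of OSError). The helper name peername_or_errno is made up for illustration and is not part of xmlrpc.py. It shows that getpeername() raises ENOTCONN on a socket that never connected, while getsockname() still works on a bound socket:

```python
import errno
import socket


def peername_or_errno(sock):
    """Return (peer, None) on success, or (None, errno code) on failure.

    Hypothetical helper, for illustration only.
    """
    try:
        return sock.getpeername(), None
    except OSError as exc:  # socket.error in older Python versions
        return None, exc.errno


# A TCP socket that is bound locally but never connected to anything.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.bind(("127.0.0.1", 0))
    peer, err = peername_or_errno(sock)
    local = sock.getsockname()  # succeeds: only needs the local binding
finally:
    sock.close()
```

On Linux the errno here is 107, matching the traceback above.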
Fixed in r25565.
Autotest updated in r25572
getsockname() is a rather useless piece of information in most cases as it will just give you a local address of the machine, and the listening port of the server (i.e. things that are static and already well known).
The code should probably report both getsockname() and getpeername(), and be changed to handle the fact that they can raise exceptions. Then we'll get useful information when possible.
(side note: to safely get the peer address in all cases, we would need to store it from the accept() call, but that's a lot more work)
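A rough sketch of what that could look like, assuming Python 3. The names safe_name, describe_socket and stored_peer are invented for this example and do not exist in our code. It reports both addresses, tolerates exceptions from either call, and prefers an address saved from accept() when one is available:

```python
import socket


def safe_name(getter):
    """Call getsockname/getpeername and never let the exception escape."""
    try:
        return str(getter())
    except OSError as exc:
        return "<unavailable: errno %s>" % exc.errno


def describe_socket(sock, stored_peer=None):
    """Build a log-friendly description of a socket.

    stored_peer is the address returned by accept(), if the caller kept it.
    """
    local = safe_name(sock.getsockname)
    if stored_peer is not None:
        peer = str(stored_peer)
    else:
        peer = safe_name(sock.getpeername)
    return "local=%s peer=%s" % (local, peer)


# Connected case: keep the address that accept() hands back.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
conn, addr = listener.accept()
connected_desc = describe_socket(conn, stored_peer=addr)

# Unconnected case: getpeername() fails, but we still get a message.
unconnected = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
unconnected_desc = describe_socket(unconnected)

for s in (conn, client, listener, unconnected):
    s.close()
```

Storing the accept() address is the only variant that still names the peer after the connection has gone away, which is why it is more work: the address has to be carried along with every connection object.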
There are 2 problems here:
1) The function asynchat.writable() can return True even if nothing is connected to the socket. Our xmlrpc.handle_timeout() function assumed that if asynchat.writable() is True, then there must be something connected to it. If nothing is connected to the socket, then socket.getpeername() will raise an exception, which is what was causing the initial problem.
2) Our xmlrpc.handle_timeout() function assumed that if the socket is writable, then it must be a write timeout, otherwise it must be a read timeout. This isn't necessarily true; in fact, using the current handle_timeout() implementation, it doesn't seem possible to tell exactly *what* timed out. The log text has therefore been changed to "communication" timeout rather than "read/write" timeout.
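To illustrate both points, here is a minimal sketch of a timeout handler along these lines, assuming Python 3. It is not the code committed for this bug: TimeoutHandlerSketch, ListHandler and the exact log wording are assumptions made for the example. It makes no claim about whether the timeout was on read or write, and it does not assume a peer exists:

```python
import logging
import socket


class ListHandler(logging.Handler):
    """Collects formatted log messages in a list (for the demo only)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


class TimeoutHandlerSketch:
    """Hypothetical stand-in for the channel class in xmlrpc.py."""

    def __init__(self, sock, log):
        self.socket = sock
        self.log = log
        self.closed = False

    def handle_timeout(self, dolog=True):
        """Handle a communication timeout on the socket."""
        if dolog:
            try:
                peer = str(self.socket.getpeername())
            except OSError:
                # writable() being True does not imply a connected peer.
                peer = "<not connected>"
            self.log.warning(
                "Communication timeout on socket %s, closing", peer)
        self.socket.close()
        self.closed = True


handler = ListHandler()
logger = logging.getLogger("timeout_sketch")
logger.propagate = False
logger.addHandler(handler)

# An unconnected socket used to trigger the traceback; here it is logged.
sketch = TimeoutHandlerSketch(
    socket.socket(socket.AF_INET, socket.SOCK_STREAM), logger)
sketch.handle_timeout()
```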
Fixed in r25956.
Autotests updated in r25959
(In reply to comment #11)
> Autotests updated in r25959
Fixed again in r25960.
Code review. Looks ok now.