Bug 7089 - too large XML-RPC messages hang Windows client
Summary: too large XML-RPC messages hang Windows client
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Client
Version: 1.3.1
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: 4.9.0
Assignee: Pierre Ossman
Keywords: relnotes, samuel_tester
Reported: 2017-12-14 16:39 CET by Pierre Ossman
Modified: 2020-08-31 13:41 CEST



Description Pierre Ossman cendio 2017-12-14 16:39:22 CET
We have gotten a report that the Windows client will lock up if the user has too many sessions running. The last line of the log is:

> 2017-12-14T16:06:55: Calling XML-RPC method 'get_user_sessions'

The problem does not occur on other platforms.

The issue seems to be with the IPC between ssh and tlclient. Some more debug logging from XML-RPC shows this on Windows:

> 2017-12-14T16:06:55: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T16:06:55: XmlRpcClient::readHeader: client has read 4095 bytes
> 2017-12-14T16:06:55: client read content length: 11205

Whilst on Linux it doesn't hang there:

> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T15:20:24: XmlRpcClient::readHeader: client has read 4095 bytes
> 2017-12-14T15:20:24: client read content length: 11205
> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 3057.
> 2017-12-14T15:20:24: XmlRpcClient::readResponse (read 11205 bytes)
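
The Linux log shows the expected pattern: the first read returns the header plus part of the body, and the client keeps reading until it has the announced content length. A minimal Python sketch of such a loop follows, for illustration only. The names `read_response` and `read_chunk` are hypothetical, and this is not the XML-RPC library code used by tlclient.

```python
import re

def read_response(read_chunk):
    """Read one HTTP-style response: header block, blank line, then
    Content-Length bytes of body.

    read_chunk is a hypothetical callable that returns the next bytes
    from the transport, or b"" at end of stream.
    """
    buf = b""
    # Keep reading until the end of the header block has arrived.
    while b"\r\n\r\n" not in buf:
        chunk = read_chunk()
        if not chunk:
            raise EOFError("stream closed before the header was complete")
        buf += chunk
    header, body = buf.split(b"\r\n\r\n", 1)
    match = re.search(rb"content-length:\s*(\d+)", header, re.I)
    if match is None:
        raise ValueError("no Content-Length header")
    length = int(match.group(1))
    # The first read usually holds only part of the body; keep going.
    while len(body) < length:
        chunk = read_chunk()
        if not chunk:
            raise EOFError("stream closed after %d of %d bytes"
                           % (len(body), length))
        body += chunk
    return body[:length]
```

If the transport never delivers the remaining bytes, the second loop is where such a client would keep waiting, which matches the point where the Windows log stops.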

The IPC consists of pipes, and increasing the buffer of the pipes makes everything start working. So the issue seems to be that we aren't handling full pipe buffers properly.

The tlclient side of things is very simple, so I don't think the issue is there. So it's either in ssh, or the data gets dropped by Windows somewhere.
Comment 2 Pierre Ossman cendio 2017-12-15 16:09:29 CET
After more debugging, the issue turns out to be in ssh. There is no way to check if a pipe is writeable, so we simply claim it always is. Microsoft's documentation claims that a pipe should be blocking by default, so the expected behaviour is intermittent hangs in ssh until tlclient empties the pipe buffer. However, in practice it is non-blocking and write() returns ENOSPC.
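
To illustrate what handling a full pipe buffer could look like, here is a minimal Python sketch of a writer that retries instead of dropping data. It is not the ssh code. `write_all`, `RETRY_ERRNOS` and the `wait` parameter are hypothetical names, and the default wait uses select(), which is an assumption that only holds for POSIX pipes. On Windows, where writeability cannot be checked this way, a caller would have to pass some other wait strategy.

```python
import errno
import os
import select

# errno values treated as "pipe buffer full, try again later".
# EAGAIN/EWOULDBLOCK are what POSIX reports; ENOSPC is the value
# this comment reports from write() on Windows.
RETRY_ERRNOS = {errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOSPC}

def write_all(fd, data, wait=None):
    """Write all of data to a non-blocking fd, retrying when it is full."""
    if wait is None:
        # Assumption: fd can be polled with select() (true for POSIX pipes).
        def wait():
            select.select([], [fd], [])
    view = memoryview(data)
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except OSError as exc:
            if exc.errno not in RETRY_ERRNOS:
                raise
            wait()  # give the reader a chance to drain the pipe
            continue
        view = view[written:]  # partial writes are normal; send the rest
```

The important part is that a "buffer full" error leads to a retry of the same bytes, never to the bytes being discarded.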

Need to check if the documentation is wrong or if write() is misbehaving.
Comment 3 Pierre Ossman cendio 2017-12-15 16:44:13 CET
The documentation was wrong. The pipes are non-blocking by default. And setting them to blocking solves the issue.
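
For comparison, with a blocking pipe the writer simply stalls while the buffer is full and resumes once the reader has drained it. The Python sketch below shows that behaviour with a POSIX pipe and a reader thread. It is an illustration under those assumptions, not the change made to ssh, and `send_blocking` is a hypothetical name.

```python
import os
import threading

def send_blocking(payload, chunk_size=4095):
    """Push payload through a blocking pipe while a reader thread drains it."""
    read_fd, write_fd = os.pipe()
    # Explicit for clarity; POSIX pipes are already blocking by default.
    os.set_blocking(write_fd, True)
    received = []

    def reader():
        while True:
            chunk = os.read(read_fd, chunk_size)
            if not chunk:  # the writer closed its end
                break
            received.append(chunk)

    thread = threading.Thread(target=reader)
    thread.start()
    view = memoryview(payload)
    while len(view) > 0:
        # Blocks while the pipe is full instead of failing.
        view = view[os.write(write_fd, view):]
    os.close(write_fd)
    thread.join()
    os.close(read_fd)
    return b"".join(received)
```

Here no data is lost even when the payload is much larger than the pipe buffer, because the writer waits instead of receiving an error.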
Comment 4 Pierre Ossman cendio 2017-12-15 16:46:38 CET
Or maybe not... I found some code in ssh that sets things to non-blocking (haven't checked yet whether it is called). However, that code has also figured out some way to check the outgoing buffer. There might be room to improve things.
Comment 7 Pierre Ossman cendio 2017-12-20 10:36:08 CET
Seems to work well now.

Tester should check that the Windows client can connect to a server where the user already has many (5+) sessions.
Comment 8 Samuel Mannehed cendio 2017-12-29 16:26:48 CET
I could reproduce the issue on Windows 10 with client build 5621 and can verify that it is fixed in build 5656. I could start 10 sessions with the same user and the same server without any problem.
