When there are many sessions for the same user, the native client sometimes fails to connect on Windows. The error we get is: > The SSH connection succeeded, but the ThinLinc server connection failed. Perhaps this server doesn’t run a ThinLinc server But in tlclient.log we can see that the SSH connection to the master succeeded, the error message is misguiding.
The more active sessions for the user I have, the easier it is to reproduce the issue. Note however that it seems quite racy and doesn't consistently happen. Even with ~50 active sessions, I don't see it every time. With more debug logging added to tlclient I see the issue more frequently. For example, I added printing of the data we read in XmlRpcSocket::nbRead().
The problem seems to be that MinGW's implementation for write() handles full pipes differently from how SSH expects it to. On Windows, we get ENOSPC instead of EAGAIN when we send too much data on a pipe. See this thread: https://lore.kernel.org/git/Yv3QcK4PPP9yJkKR@coredump.intra.peff.net/
Attachment 1172 [details] is the same as attachment 721 [details] on bug 5936. Attachment 1171 [details] fixes bug 7670.
> MUST: > > * Client should succeed connecting to new sessions, even when there are many concurrent sessions, for that user. I tested with more than 100 concurrent sessions for the same user on Windows 10, did not get any errors.
I tested the issue with a slightly different approach. Using a modified client, I filled client_params with lots and lots of junk. This meant the response to get_user_sessions would be very large, just like with many sessions. I was able to reproduce the issue most of the time on two different Windows 10 machines using 4.15.0. The issue was not reproducible using an updated client. Commits and release notes look mostly good. There is unfortunately a bug where the return value from write() isn't checked, so it might be reacting on a stale errno value.
(In reply to Pierre Ossman from comment #14) > Commits and release notes look mostly good. There is unfortunately a bug > where the return value from write() isn't checked, so it might be reacting > on a stale errno value. Fixed now.
I was unfortunately unable to reproduce the issue on 4.15.0 with 100 sessions. However, I could still reproduce it with a modified client_params. With that test, I can confirm that the new client still works where 4.15.0 would not.