Bug 8261 - Windows client can fail to connect if user has many sessions
Summary: Windows client can fail to connect if user has many sessions
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Client (show other bugs)
Version: trunk
Hardware: PC Windows
: P2 Normal
Target Milestone: 4.16.0
Assignee: Samuel Mannehed
URL:
Keywords: ossman_tester, relnotes
Depends on:
Blocks:
 
Reported: 2023-11-28 09:34 CET by Samuel Mannehed
Modified: 2024-01-08 16:05 CET (History)
0 users

See Also:
Acceptance Criteria:
MUST: * Client should succeed connecting to new sessions, even when there are many concurrent sessions, for that user.


Attachments

Description Samuel Mannehed cendio 2023-11-28 09:34:02 CET
When there are many sessions for the same user, the native client sometimes fails to connect on Windows. The error we get is:

> The SSH connection succeeded, but the ThinLinc server connection failed. Perhaps this server doesn’t run a ThinLinc server

But in tlclient.log we can see that the SSH connection to the master succeeded, the error message is misguiding.
Comment 2 Samuel Mannehed cendio 2023-11-28 09:41:36 CET
The more active sessions for the user I have, the easier it is to reproduce the issue. Note however that it seems quite racy and doesn't consistently happen. Even with ~50 active sessions, I don't see it every time.

With more debug logging added to tlclient I see the issue more frequently. For example, I added printing of the data we read in XmlRpcSocket::nbRead().
Comment 3 Samuel Mannehed cendio 2023-11-28 16:27:34 CET
The problem seems to be that MinGW's implementation for write() handles full pipes differently from how SSH expects it to. On Windows, we get ENOSPC instead of EAGAIN when we send too much data on a pipe.

See this thread:

https://lore.kernel.org/git/Yv3QcK4PPP9yJkKR@coredump.intra.peff.net/
Comment 8 Samuel Mannehed cendio 2023-11-29 08:28:54 CET
Attachment 1172 [details] is the same as attachment 721 [details] on bug 5936.

Attachment 1171 [details] fixes bug 7670.
Comment 13 Samuel Mannehed cendio 2024-01-03 14:46:16 CET
> MUST:
> 
> * Client should succeed connecting to new sessions, even when there are many concurrent sessions, for that user.

I tested with more than 100 concurrent sessions for the same user on Windows 10, did not get any errors.
Comment 14 Pierre Ossman cendio 2024-01-08 13:19:57 CET
I tested the issue with a slightly different approach. Using a modified client, I filled client_params with lots and lots of junk. This meant the response to get_user_sessions would be very large, just like with many sessions.

I was able to reproduce the issue most of the time on two different Windows 10 machines using 4.15.0. The issue was not reproducible using an updated client.

Commits and release notes look mostly good. There is unfortunately a bug where the return value from write() isn't checked, so it might be reacting on a stale errno value.
Comment 16 Samuel Mannehed cendio 2024-01-08 14:11:44 CET
(In reply to Pierre Ossman from comment #14)
> Commits and release notes look mostly good. There is unfortunately a bug
> where the return value from write() isn't checked, so it might be reacting
> on a stale errno value.

Fixed now.
Comment 17 Pierre Ossman cendio 2024-01-08 16:05:39 CET
I was unfortunately unable to reproduce the issue on 4.15.0 with 100 sessions. However, I could still reproduce it with a modified client_params.

With that test, I can confirm that the new client still works where 4.15.0 would not.

Note You need to log in before you can comment on or make changes to this bug.