We should evaluate an entirely new architecture for VSM. The problem with the current one is that we need to authenticate the user three times: first, when the first SSH tunnel is created; second, when the VSM server verifies that the correct user is requesting a new session; third, when the second SSH tunnel is created to the terminal server. Authenticating three times causes serious problems with security products like SecurID, RADIUS etc. Because two SSH connections are made, the login process also takes a bit more time than it has to. We should try to come up with a better approach, which only requires one authentication. Here is one idea:

1. The client makes an XML-RPC call to the VSM server (without using SSH encryption). The call is get_server(username). Since only the username and not the password is sent, we do not need encryption. The VSM server returns the terminal server to connect to. If the user already has a session on some server, this server is returned. Otherwise, the server with the lightest load is returned.

2. The client makes an SSH connection to the terminal server (from step 1). It creates a tunnel for XML-RPC to the VSM agent. The client makes an XML-RPC call over this tunnel. The call is get_session(). This call creates a new session if no session exists, otherwise it returns an existing session. Note that no arguments are needed. VSM agent "authenticates" the user by looking at the local TCP connection (between sshd and VSM agent) and checking which user owns the endpoint. VSM agent needs to communicate with VSM server at this point; if a new session is created, this must be reported to VSM server. VSM agent returns session information, like the VNC password.

3. The client keeps the same SSH connection but modifies the tunnels dynamically. This is actually possible in recent versions of SSH (though we need to add some mechanism to control this from tlclient; we could use a pipe or something like that). The client keeps the tunnel for VSM agent, just in case. It also adds tunnels for VNC, sound, serial ports etc. The client can then start vncviewer and start the session.

I think this model has several advantages, besides needing only one authentication point. Since the call get_server(username) to the VSM server is so lightweight, the risk of the master machine becoming a performance bottleneck is small.

One important thing to consider is that we probably need to be backwards-compatible with the current architecture. I think this can be done. We could add the new XML-RPC calls while retaining the old ones. The VSM server still "owns" the session information. (With an implementation supporting both the old and new architecture, VSM server and VSM agent will each be both client and server to the other. But I don't think that is a problem.)
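To make step 1 concrete, here is a rough sketch of the selection rule behind get_server(username): return the server that already holds the user's session, otherwise the one with the lightest load. Only the call name get_server comes from the proposal. The helper names pick_server and make_vsm_server, the sessions and loads dictionaries, and the use of Python's standard xmlrpc.server module are assumptions made for illustration, not how VSM actually stores or serves this data.

```python
from typing import Dict, Tuple
from xmlrpc.server import SimpleXMLRPCServer


def pick_server(username: str,
                sessions: Dict[str, str],
                loads: Dict[str, float]) -> str:
    """Choose the terminal server for a user.

    sessions maps username -> server holding that user's session.
    loads maps server name -> current load (lower is lighter).
    Both tables are hypothetical stand-ins for VSM server state.
    """
    existing = sessions.get(username)
    if existing is not None:
        # The user already has a session: send them back to it.
        return existing
    if not loads:
        raise LookupError("no terminal servers available")
    # No session yet: pick the server with the lightest load.
    return min(loads, key=lambda name: loads[name])


def make_vsm_server(address: Tuple[str, int],
                    sessions: Dict[str, str],
                    loads: Dict[str, float]) -> SimpleXMLRPCServer:
    """Expose the rule as an unencrypted XML-RPC call named get_server."""
    server = SimpleXMLRPCServer(address, logRequests=False)
    server.register_function(
        lambda username: pick_server(username, sessions, loads),
        "get_server",
    )
    return server
```

A caller would run make_vsm_server(...).serve_forever() on the master machine. The lookup touches only two in-memory tables, which is why the call should stay cheap.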
By implementing 1), we would send information in cleartext that lets an attacker find out whether a specific username is valid, which will be an issue at some installations.
Moved all 2.X bugs to 2.0.
Perhaps one alternative, rather than coming up with a whole new architecture, would be to handle single-authentication connections for "clusters" which only consist of one machine. While cluster configurations still have their advantages, it is feasible these days to use a single virtual machine instead, since virtualisation platforms take care of a lot of the redundancy and resource allocation tasks traditionally handled by our load balancer. Scaling in a virtual environment can also be done without having to add/remove machines from the cluster. Modern architectures are capable of addressing large amounts of resources in a single system. As mentioned in the bug description above, it is possible to dynamically allocate SSH tunnels without reconnecting. So in theory at least, we shouldn't have to connect twice to the same machine. This would give users of one-time authentication mechanisms like SecurID a way to use ThinLinc without having to install a caching RADIUS server, and remove an "unnecessary" step when connecting to a single machine. This may be more appropriate as a separate bug.
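As an illustration of adding tunnels to an existing connection without authenticating again, here is one possible sketch that uses OpenSSH connection multiplexing (a master connection started with -M and -S, then "-O forward" requests against its control socket). This is an assumption about one way to do it, not necessarily the mechanism tlclient would use. The helper names, the socket path, the host name and the port numbers are all hypothetical, and the functions only build the command lines rather than running them.

```python
from typing import List


def master_command(control_path: str, host: str) -> List[str]:
    """Command that opens the single authenticated master connection.

    -M puts ssh in master mode, -S names the control socket and
    -N requests no remote command (forwarding only).
    """
    return ["ssh", "-M", "-S", control_path, "-N", host]


def add_forward_command(control_path: str, host: str,
                        local_port: int, remote_port: int,
                        remote_host: str = "localhost") -> List[str]:
    """Command that asks the running master to add a local forward.

    No new login happens: the request goes over the control socket.
    """
    spec = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-S", control_path, "-O", "forward", "-L", spec, host]


# Example: add a VNC tunnel to an already authenticated connection.
vnc_cmd = add_forward_command("/tmp/tl-ctl.sock", "ts1.example.com", 5901, 5901)
```

Each returned list could be passed to subprocess.run(). Only the master command would prompt for credentials, so a one-time password is consumed once.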
Bug 2545 has a better discussion as to the actual problem, so let's close this as a duplicate of that bug. *** This bug has been marked as a duplicate of bug 2545 ***