Bug 7824 - VNC server may disconnect when changing the system's monitor configuration
Summary: VNC server may disconnect when changing the system's monitor configuration
Status: NEW
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VNC (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: LowPrio
Assignee: Bugzilla mail exporter
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-01-17 15:31 CET by William Sjöblom
Modified: 2022-01-25 12:36 CET (History)
0 users

See Also:
Acceptance Criteria:


Attachments

Description William Sjöblom cendio 2022-01-17 15:31:56 CET
I encountered a server-side disconnect while running the ThinLinc client on
Windows 10 with the following monitor layout and full screen enabled on all
monitors (happens regardless if the client is configured to use "Full screen
over all monitors" or "Full screen over selected monitors" with all monitors
selected):

                   +------------------+ +------------------+
 +---------------+ |                  | |                  |
 |               | |   1920 x 1200    | |   1920 x 1200    |
 |  1920 x 1080  | |                  | |                  |
 |               | |                  | |                  |
 +---------------+ +------------------+ +------------------+

Disconnecting the middle monitor results in the client exiting and reporting
"CConn: End of stream". Below is an excerpt from the xinit.log with verbose 
logging enabled:

Before disconnecting the center display:
> XserverDesktop: Got request for framebuffer resize to 5760x1200
>  XserverDesktop: 3 screen(s)
>  XserverDesktop:           5413 (0x00001525): 1920x1200+3840+0 (flags
>               0x00000000)
>  XserverDesktop:          10367 (0x0000287f): 1920x1080+0+0 (flags 0x00000000)
>  XserverDesktop:          27604 (0x00006bd4): 1920x1200+1920+0 (flags
>               0x00000000)
>  XserverDesktop:
>  RandR:       Insufficient screens. Need to create 2 more.
>  RandR:       Reconfiguring new output 'VNC-0' to 1920x1200+3840+0
>  RandR:       Reconfiguring new output 'VNC-1' to 1920x1080+0+0
>  RandR:       Reconfiguring new output 'VNC-2' to 1920x1200+1920+0
After disconnecting the center display:
>  XserverDesktop: Got request for framebuffer resize to 3840x1200
>  XserverDesktop: 2 screen(s)
>  XserverDesktop:          23449 (0x00005b99): 1920x1200+0+0 (flags 0x00000000)
>  XserverDesktop:           2347 (0x0000092b): 1920x1080+1920+0 (flags
>               0x00000000)
>  XserverDesktop:
>  RandR:       Resizing screen framebuffer to 3840x1200
>  RandR:       Temporarily disabling output 'VNC-0'
>  ComparingUpdateTracker: 774.33 kpixels in / 378.208 kpixels out
>  ComparingUpdateTracker: (1:2.04737 ratio)
>  RandR:       Reconfiguring new output 'VNC-1' to 1920x1200+0+0
>  RandR:       Reconfiguring new output 'VNC-0' to 1920x1080+1920+0
>  XserverDesktop: client gone, sock 14
>  VNCSConnST:  closing 127.0.0.1::33626: Desktop configured a different screen
>               layout than requested
Please note that this only disconnects the client but keeps the session alive.
I have attempted to reproduce the issue on other macOS and Linux without 
success.
Comment 1 William Sjöblom cendio 2022-01-17 17:22:53 CET
A small note about the log dump in my previous comment. As the logs hint about, the two remaining displays switch places after the center one is disconnected:

 +------------------+                  
 |                  | +---------------+
 |   1920 x 1200    | |               |
 |                  | |  1920 x 1080  |
 |                  | |               |
 +------------------+ +---------------+

This is expected as Windows remembers this layout from the last time it was configured with only these two monitors.
Comment 2 William Sjöblom cendio 2022-01-19 15:13:40 CET
I have now investigated this a bit further and it seems more severe than initially thought as it seems like mirroring also seems to provoke this exact VNC server disconnect:

                   +------------------+ +------------------+
 +---------------+ |                  | |                  |
 |               | |   1920 x 1200    | |   1920 x 1200    |
 |  1920 x 1080  | |                  | |                  |
 |               | |                  | |                  |
 +-+--+-----+----+ +---+--------+--+--+ +---------+-----+--+
   |  |     |          |        |  |              |     |
   |  |     +----------+        |  |              |     |
   |  |  mirror => no crash     |  |              |     |
   |  |                         |  |              |     |
   |  +-------------------------+  +--------------+     |
   |          mirror => crash       mirror => crash     |
   |                                                    |
   +----------------------------------------------------+
                           mirror => crash

I have also tried rearranging the resulting two logical monitor setup such that those do not switch places after a monitor pair has been mirrored. This did not make an impact. Similarly, changing the resolution of the 1200p monitors to 1080p (giving us three identical displays in terms of resolution) did not have an impact either.

Worth noting is that I couldn't provoke the issue when only using two monitors, which is why this was not caught when testing bug 7007. Using mirroring with three monitors is likely not that common, so the impact of this bug may not be especially significant.
Comment 3 William Sjöblom cendio 2022-01-24 09:05:32 CET
Overview
════════

  So after a fair bit of hair-pulling we've figured out the root cause
  of the bug. It basically boils down to outputs not getting disabled in
  certain scenarios which lead to these unintentionally enabled
  monitors ending up in the final screen layout which in turn triggers
  the "Desktop configured a different screen layout than requested"
  assertion to fail.

  On a protocol level, we have a couple of criteria for this to happen:
  1. At least two outputs must be set to be removed
  2. At least two of the disabled outputs must fit inside the new frame
     buffer
  3. At least one screen has to require reconfiguration


  As stated earlier, this only results in the user being kicked out
  which makes the issue less critical.

  Note that bug 7821 has a tendency to produce scenarios like the above
  even though it should be possible to provoke on all native client
  platforms in theory.


Concrete description of the issue
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  Programatically, this all mainly happens in `_setScreenLayout' in
  tigervnc/unix/common/randr.cxx which takes in the requested frame
  buffer dimensions, our desired layout as well as a reference to
  mapping between output ids and screens in our layout.

  What happens next is that the frame buffer is resized to the requested
  dimensions and all outputs outside this area get disabled. Note
  criteria #2 in the above list (and that not all of the outputs
  requested to be disabled will be disabled in this step).

  After this, monitors will be reconfigured. Calling
  `vncRandRReconfigureOutput' will result in a vnc hook being called
  which ends up calling `XserverDesktop::refreshScreenLayout' which, in
  turn, overwrite both the output id mapping reference above as well as
  the server's screen layout by calling `computeScreenLayout' in
  tigervnc/unix/common/randr.cxx. Since the one or more outputs was not
  disabled by when resizing the frame buffer, `computeScreenLayout'
  replace the output id mapping with a new one that additionally
  contains the wrongfully enabled output.

  This will lead to the last step that disables unused outputs in
  `_setScreenLayout' not disabling the wrongfully enabled output since
  it is present in the output id mapping, which in turn leads to the
  resulting layout differing from the requested layout and the "Desktop
  configured a different screen layout than requested" error.
Comment 4 Pierre Ossman cendio 2022-01-24 13:28:03 CET
We've managed to trigger another user visible behaviour stemming from the same underlying bug.

We start with this configuration:

> XserverDesktop: Got request for framebuffer resize to 3120x1920
> XserverDesktop: 2 screen(s)
> XserverDesktop:          16807 (0x000041a7): 1920x1080+0+0 (flags 0x00000000)
> XserverDesktop:      282475249 (0x10d63af1): 1200x1920+1920+0 (flags
>              0x00000000)
> XserverDesktop: 

And we reconfigure things to this:

> XserverDesktop: Got request for framebuffer resize to 5040x1920
> XserverDesktop: 3 screen(s)
> XserverDesktop:     1404280278 (0x53b39dd6): 1920x1080+1920+840 (flags
>              0x00000000)
> XserverDesktop:      893351816 (0x353f7788): 1920x1080+0+840 (flags
>              0x00000000)
> XserverDesktop:     1505795335 (0x59c09d07): 1200x1920+3840+0 (flags
>              0x00000000)
> XserverDesktop: 

At this point it configures the first two screens properly, but fails to reconfigure the third one.

The reason is similar to the previously named cases. When it reconfigures the first screen, it accidentally re-enables the second (randr) one, even though it hasn't been reconfigured properly.

Unlike the previous example the client isn't disconnected. The reason is that it detects that something is wrong here as it fails to find a free randr screen when it comes to configuring the third vnc screen. It then bails with an "invalid" response to the client. Which we also see in the logs:

> 2022-01-24T12:48:57: vncviewer[E]:  CConn:       SetDesktopSize misslyckades: 3

Unfortunately we don't log this properly in the server, which we probably should.

The end result is that the session gets a screen configuration that is partly in the old configuration, partly in the new. A user can easily recover by exiting full screen and re-enabling it again.

Note You need to log in before you can comment on or make changes to this bug.