A customer has reported a crash in Xvnc using ThinLinc 4.7.0 with CentOS 7.3, gnome-shell and Google Chrome / Chromium. The customer described the steps to reproduce as:
1. Start the <browser>
2. Move the browser window so that it’s close to the top edge of the screen (you’ll understand why …)
3. Double click on the browsers title bar to maximize the window
4. Double click on the browsers title bar again to restore the normal window
5. Keep repeating steps 3 + 4, and the XVnc session will eventually terminate. It seems to take me about 15-20 seconds of vigorous clicking before it crashes.
The crash produces the following backtrace:
(EE)
(EE) Backtrace:
(EE) 0: /opt/thinlinc/libexec/Xvnc (xorg_backtrace+0x3f) [0x5d7c3f]
(EE) 1: /opt/thinlinc/libexec/Xvnc (0x400000+0x1db0c9) [0x5db0c9]
(EE) 2: /usr/lib64/libpthread.so.0 (0x7fe06ae97000+0xf370) [0x7fe06aea6370]
(EE) 3: /opt/thinlinc/libexec/Xvnc (ProcXFixesGetCursorImageAndName+0x96) [0x4bb316]
(EE) 4: /opt/thinlinc/libexec/Xvnc (Dispatch+0x28f) [0x58911f]
(EE) 5: /opt/thinlinc/libexec/Xvnc (main+0x3ae) [0x49d51e]
(EE) 6: /usr/lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fe06aaf7b35]
(EE) 7: /opt/thinlinc/libexec/Xvnc (0x400000+0x9eec3) [0x49eec3]
(EE)
(EE) Segmentation fault at address 0x14
I have tested with a freshly installed CentOS 7 netinstall with the "Gnome Desktop" group and ThinLinc 4.7.0. Added the Google Chrome repo http://dl.google.com/linux/chrome/rpm/stable/x86_64 and installed google-chrome-stable. Logged into a gnome-shell session and launched Google Chrome, placed the title bar a few pixels below the gnome-shell top panel and double-clicked repeatedly to make Chrome switch between maximized and windowed for >60 secs, over several tries and Chrome window positions. I also tried with Firefox and then switched to Chrome, but that failed to trigger the crash as well. I can't reproduce the problem.
I got a core file from the customer and got the following info:
Core was generated by `/opt/thinlinc/libexec/Xvnc'.
Program terminated with signal 11, Segmentation fault.
#0  ProcXFixesGetCursorImageAndName (client=0x2d583b0) at cursor.c:517
517         width = pCursor->bits->width;
Missing separate debuginfos, use: debuginfo-install thinlinc-vnc-server-4.7.0-5280.x86_64
(gdb) bt
#0  ProcXFixesGetCursorImageAndName (client=0x2d583b0) at cursor.c:517
#1  0x000000000058911f in Dispatch () at dispatch.c:432
#2  0x000000000049d51e in main (argc=22, argv=0x7ffd11055588, envp=<optimized out>) at main.c:295
(gdb) print *pCursor
$1 = {bits = 0x0, foreRed = 0, foreGreen = 0, foreBlue = 0, backRed = 65535, backGreen = 65535, backBlue = 65535, refcnt = 0, devPrivates = 0x3656d00, id = 16778541, serialNumber = 526, name = 492}
Notice that the bits pointer is NULL, which is why it crashes. Also notice that the reference counter is 0.
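The reported fault address 0x14 looks consistent with reading width through a NULL bits pointer. A minimal sketch of that arithmetic follows. It assumes a 64-bit build and a struct laid out like the server's cursor bits (two pointers, an int-sized flag, then the unsigned short metrics). The struct name is hypothetical and the field order is an assumption, not copied from the ThinLinc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cursor bits struct. The field order is an
 * assumption modeled on the X server's CursorBits, not the real header. */
struct sketch_cursor_bits {
    unsigned char *source;   /* 8 bytes on a 64-bit build */
    unsigned char *mask;     /* 8 bytes on a 64-bit build */
    int emptyMask;           /* 4 bytes */
    unsigned short width, height, xhot, yhot;
    int refcnt;
};

/* Address that would be read for bits->width when bits is NULL:
 * simply the offset of width inside the struct. */
size_t null_width_fault_address(void)
{
    return offsetof(struct sketch_cursor_bits, width);
}
```

Under those assumptions the offset of width is 8 + 8 + 4 = 20 bytes, i.e. 0x14, which matches the address in the customer's log.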
Digging into the reference counter problem, it seems that XFixes neither checks refcnt nor increments the reference counter itself when storing a reference in the CursorCurrent[] array in CursorDisplayCursor().
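To illustrate the kind of fix this points at, here is a simplified sketch of holding a reference while a pointer is cached in a global slot. All names (SketchCursor, sketch_set_current and so on) are hypothetical; this is not the X server code and not the upstream patch, only the general reference counting pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified cursor; not the X server's cursor record. */
typedef struct {
    int refcnt;
    int *bits;               /* stands in for the cursor image data */
} SketchCursor;

/* Plays the role of one slot in a CursorCurrent[]-style cache. */
static SketchCursor *current_cursor;

SketchCursor *sketch_cursor_new(void)
{
    SketchCursor *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->bits = calloc(1, sizeof(int));
    if (!c->bits) {
        free(c);
        return NULL;
    }
    c->refcnt = 1;           /* the creator owns the first reference */
    return c;
}

void sketch_cursor_unref(SketchCursor *c)
{
    if (!c)
        return;
    assert(c->refcnt > 0);
    if (--c->refcnt == 0) {  /* last reference gone: release everything */
        free(c->bits);
        free(c);
    }
}

/* Cache a cursor: take our own reference before dropping the old one,
 * so the cached pointer cannot be freed behind our back. */
void sketch_set_current(SketchCursor *c)
{
    if (c)
        c->refcnt++;
    sketch_cursor_unref(current_cursor);
    current_cursor = c;
}

SketchCursor *sketch_get_current(void)
{
    return current_cursor;
}
```

With this pattern the creator can drop its reference right after caching the cursor, and a later read through sketch_get_current() still sees valid bits. Without the extra reference, the cached pointer would be left pointing at a freed cursor, which is the kind of state the core dump shows (refcnt = 0, bits = 0x0).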
Found this bug report [1] about a reference counting problem in XFixes related to cursors. I'm building a test Xvnc binary with an appropriate reference counter fix for the customer to try out. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1357694
Also reported it upstream so we can get some more feedback: https://bugs.freedesktop.org/show_bug.cgi?id=100721
I still cannot reproduce this issue with 4.9.0. :/ I have confirmed we now have the fixes referenced upstream though. I guess that will have to be enough.