Bug 7777 - Upgrade FLTK to latest release
Summary: Upgrade FLTK to latest release
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Build system (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: 4.14.0
Assignee: William Sjöblom
URL:
Keywords: frifl_tester, linma_tester, prosaic
Depends on:
Blocks: 7778
  Show dependency treegraph
 
Reported: 2021-10-14 16:09 CEST by William Sjöblom
Modified: 2022-01-24 09:31 CET (History)
3 users (show)

See Also:
Acceptance Criteria:


Attachments
crash dump (1.41 MB, text/plain)
2022-01-19 12:18 CET, Pierre Ossman
Details

Description William Sjöblom cendio 2021-10-14 16:09:51 CEST
We are currently running FLTK 1.3.5 with the latest release being 1.3.7.
Comment 1 William Sjöblom cendio 2021-10-15 16:38:01 CEST
CMake 3.2.3 or higher is required to build FLTK 1.3.7. We are currently
running version 2.8.12. Below is a table of several distributions that
we (more or less) want to stay compatible with and their version of
CMake:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Distribution                 CMake version 
────────────────────────────────────────────
 Ubuntu 18.04                        3.10.2 
 RHEL7         2.8.12 (3.17.5 through EPEL) 
 RHEL8                               3.18.2 
 SLES12                               3.5.2 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Comment 3 William Sjöblom cendio 2021-10-20 17:27:15 CEST
Since the last update, FLTK has introduced one occurrence of the __block storage modifier in the macOS code. This is not supported by our version of GCC and seems to be very much a work in progress still (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352). The one usage found seems like it can be patched away with relative ease.
Comment 5 William Sjöblom cendio 2021-10-22 16:58:06 CEST
We apparently have a compatibility issue between CMake version 2.8.12 and
3.10.2. When we set `cmake_minimum_required(VERSION 2.8.12)' the cross-compiler
prefix detection (e.g. `x86_64-w64-mingw32') fails due to the format-string
changes in CMake 3.1.0.

There are the possible solutions that I've identified:
• The simplest one is bumping the minimum required CMake version to something
  above 3.1.0. This will be needed anyways as setting
  `cmake_minimum_required(VERSION 2.8.12)' is deprecated in current versions of
  CMake and will soon be dropped entirely.
• Hard-code the compiler prefixes in our toolchain file is also possible, but
  these variables seem to be pretty fragile to override as we would have to
  disable compiler detection entirely (see
  https://github.com/android/ndk/issues/222 where they perform similar hacks)
• Patch the top-most CMakeLists.txt to to enable `cmake_policy(SET CMP0053 NEW)'
  at the point the `project(...)' call is made. The policy is then to be
  restored.

The last two has the benefit of working both with our current version of CMake
and the new one, but in the long-term, we should still aim to raise the minimum
required CMake version as motivated by the first alternative above.
Comment 6 Samuel Mannehed cendio 2021-10-27 17:26:08 CEST
I have tested a server-bundle and a client-bundle built using new versions of the following cenbuild packages:

cendio-build-cmake-armhf-3.10.2-1.noarch.rpm
cendio-build-cmake-i386-3.10.2-1.noarch.rpm
cendio-build-cmake-osx64-3.10.2-1.noarch.rpm
cendio-build-cmake-win32-3.10.2-1.noarch.rpm
cendio-build-cmake-win64-3.10.2-1.noarch.rpm
cendio-build-cmake-x86_64-3.10.2-1.noarch.rpm
cendio-build-fltk-armhf-1.3.7-1.noarch.rpm
cendio-build-fltk-i386-1.3.7-1.noarch.rpm
cendio-build-fltk-osx64-1.3.7-1.noarch.rpm
cendio-build-fltk-win32-1.3.7-1.noarch.rpm
cendio-build-fltk-win64-1.3.7-1.noarch.rpm
cendio-build-fltk-x86_64-1.3.7-1.noarch.rpm
cendio-build-libarchive-i386-3.5.2-1.noarch.rpm
cendio-build-libarchive-native-armhf-3.5.2-1.noarch.rpm
cendio-build-libarchive-native-osx64-3.5.2-1.noarch.rpm
cendio-build-libarchive-native-win32-3.5.2-1.noarch.rpm
cendio-build-libarchive-native-win64-3.5.2-1.noarch.rpm
cendio-build-libarchive-x86_64-3.5.2-1.noarch.rpm
cendio-build-libjpeg-armhf-2.1.1-1.noarch.rpm
cendio-build-libjpeg-i386-2.1.1-1.noarch.rpm
cendio-build-libjpeg-osx64-2.1.1-1.noarch.rpm
cendio-build-libjpeg-win32-2.1.1-1.noarch.rpm
cendio-build-libjpeg-win64-2.1.1-1.noarch.rpm
cendio-build-libjpeg-x86_64-2.1.1-1.noarch.rpm
cendio-build-librhash-i386-1.4.2-1.noarch.rpm
cendio-build-librhash-native-armhf-1.4.2-1.noarch.rpm
cendio-build-librhash-native-osx64-1.4.2-1.noarch.rpm
cendio-build-librhash-native-win32-1.4.2-1.noarch.rpm
cendio-build-librhash-native-win64-1.4.2-1.noarch.rpm
cendio-build-librhash-x86_64-1.4.2-1.noarch.rpm
cendio-build-libuv-i386-1.42.0-1.noarch.rpm
cendio-build-libuv-native-armhf-1.42.0-1.noarch.rpm
cendio-build-libuv-native-osx64-1.42.0-1.noarch.rpm
cendio-build-libuv-native-win32-1.42.0-1.noarch.rpm
cendio-build-libuv-native-win64-1.42.0-1.noarch.rpm
cendio-build-libuv-x86_64-1.42.0-1.noarch.rpm

I ran the ThinLinc server on a CentOS 7 machine. But the main target here was the upgraded FLTK package, so I focused my testing on the ThinLinc Client:

━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━
  Grab & Focus  │ F34  Win10  mac11
────────────────┼───────────────────
 Fullscreen all │ ✓     ✓      ✓
────────────────┼───────────────────
 Fullscreen one │ ✓     ✓      ✓
────────────────┼───────────────────
 Minimize       │ ✓     ✓      ✓
────────────────┼───────────────────
 Click outside  │ ✓     ✓      ✗*
────────────────┼───────────────────
 Disabled grab  │ ✓     ✓      ✓
━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━

* Created bug 7782 (Not a regression)

━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━
    Gestures    │ F34  Win10  mac11
────────────────┼───────────────────
 Longpress      │ ✓     ✓      -
────────────────┼───────────────────
 Two-finger tap │ ✓     ✓      -
────────────────┼───────────────────
 Scroll         │ ✓     ✓      -
────────────────┼───────────────────
 Zoom           │ ✓     ✓      -
────────────────┼───────────────────
 Drag and drop  │ ✓     ✓      -
━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━


━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━
      Other stuff       │ F34  Win10  mac11
────────────────────────┼───────────────────
 Options window look OK │ ✓     ✓      ✓
────────────────────────┼───────────────────
 Middle mouse emulation │ ✓     ✓      ✓
━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━
Comment 7 William Sjöblom cendio 2021-11-01 09:01:26 CET
During the upgrade I made changes to a couple of FLTK patches:
• `fluid_as_example' - Only concerns which FLTK components to
  build. Errors caught at build-time.
• `option_cmake_support' - Only concerns which FLTK components to
  install. Errors caught at rpm build-time.
• `macos_gc' - Accepted by upstream in a slightly changed form. Requires
  retesting of bug 7539.
• `macos_multihead' - Accepted by upstream. Upstream is very much
  equivalent to our patch and require no additional testing.
• `text_plain' - Accepted by upstream with heavy refactoring. Requires
  retesting of bug 7627.
• `macos_blocks' - Newly introduced patch. Requires shallow testing of
  the file dialogs.

Apart from the above, a number of CMake-related patches were
added/changed. These should require no additional testing since these
should all fail at compile time. Apart from this, the tests presented in
comment 6 should suffice.
Comment 8 William Sjöblom cendio 2021-11-01 09:15:31 CET
I have retested bug 7539 (on both macOS 11 and 12), bug 7627 (on Fedora
34) and file dialogs for public key and drive redirection (on macOS
12). This testing was done using the same client- and server-bundle as
in comment 6 against a server running the Fedora 35 Workstation beta.
Comment 9 Pierre Ossman cendio 2021-11-03 14:08:18 CET
We've found two issues in FLTK that we need to patch around:

https://github.com/fltk/fltk/issues/287
https://github.com/fltk/fltk/issues/288

There is also a third that currently isn't a problem, but will be once we upgrade TigerVNC:

https://github.com/fltk/fltk/issues/286
Comment 10 William Sjöblom cendio 2021-11-03 16:52:30 CET
I have applied the patches linked in comment 9.

I tested a local macOS client build on macOS 12.0.1 with the FLTK patches applied and the fullscreen behavior is now back to normal.
Comment 13 William Sjöblom cendio 2021-11-04 09:13:35 CET
FLTK has now been upgraded and all dependencies have now been upgraded. Awaiting Jenkins client build #2249 and server build #2230 before marking as resolved. Those two build will be the candidates used for testing.
Comment 14 William Sjöblom cendio 2021-11-04 11:34:25 CET
The builds in comment 13 have now been completed and are ready for testing. The bug(s) identified in comment 6 remains to be tested. Marking as resolved.
Comment 15 William Sjöblom cendio 2021-11-05 11:15:03 CET
Upstream FLTK decided to backport some macOS fixes from the master
branch instead of applying the patches supplied by us, thus we need to
do some retesting:

Firstly I reproduced the bugs mentioned in comment 9 on macOS 12.0.1
with vncviewer in upstream TigerVNC and with a locally built ThinLinc
client without any of our new patches applied:

━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Component │ fltk/fltk/#287  fltk/fltk/#288  fltk/fltk/#286      
───────────┼────────────────────────────────────────────────────
 TigerVNC  │ reproduced      reproduced      reproduced         
 ThinLinc  │ reproduced [1]  reproduced      not reproduced [2] 
━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[1] Does not exhibit exactly the same behaviour as TigerVNC as the image
    only gets offset upwards when leaving fullscreen. I have verified
    that the #287 patch alone resolves this issue, so we can be fairly
    certain that this is just another manifestation of the same bug.

[2] See comment 9.

I then verified that these bugs were fixed when applying the patches
proposed for FLTK 1.3.8 and was unable to reproduce any of the previous
bugs:

━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Component │ fltk/fltk/#287  fltk/fltk/#288  fltk/fltk/#286 
───────────┼───────────────────────────────────────────────
 TigerVNC  │ fixed           fixed           fixed         
 ThinLinc  │ fixed           fixed           n/a           
━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Comment 17 William Sjöblom cendio 2021-11-05 11:26:14 CET
I have now upgraded the `cendio-build-fltk-*' packages to use the upstream FLTK patches instead of our own. Once Jenkins client build #2250 succeeds the macOS client is ready for testing!
Comment 18 William Sjöblom cendio 2021-11-05 13:14:49 CET
Builds were all successful. Marking as resolved.

A small note for the tester: since our current build system build has failed, we cannot verify that the build system changes made for this bug bootstraps correctly. So let's await a successful build system rebuild before marking this bug as closed after testing is done.
Comment 19 Frida Flodin cendio 2021-11-05 13:31:18 CET
Tested Client build 2249 on RHEL 7

 ✓ Client GUI looks and feels like before.
 ✓ Click around options, checkboxes, selection lists, file dialogs.
 ✓ Focus, dragging around options window.
 ✓ Click outside.
 ✓ Minimize.
 ✓ Middle mouse emulation.
 ✓ Server side shadowing pop up question, ThinLinc server on Fedora 34.
Comment 21 Frida Flodin cendio 2021-11-09 09:38:42 CET
Tested client build 2250 on macOS 12.0.1.

 ✓ Client GUI looks and feels like before.
 ✓ Click around options, checkboxes, selection lists, file dialogs.
 ✓ Focus, dragging around options window.
 ✗ Click outside [1]
 ✓ Minimize.
 ✓ Middle mouse emulation.

[1] bug 7782


Also made sure that the FLTK patches for [2] works. Going in and out of fullscreen  does not mess up the coordinates. The other issues mentioned in comment #9 can not  be tested now. [3] needs to modify the code and do some logging to see the issue and [4] we can only see with upgraded tigerVNC.

[2] https://github.com/fltk/fltk/issues/287
[3] https://github.com/fltk/fltk/issues/288
[4] https://github.com/fltk/fltk/issues/286
Comment 22 Frida Flodin cendio 2021-11-09 10:34:31 CET
Tested touch gestures on Windows 10 and Fedora 34 with client build 2250. Everything works just fine. 

 ✓ Longpress
 ✓ Two-finger tap
 ✓ Scroll
 ✓ Zoom
 ✓ Drag and drop

Also tested to copy non ASCII from a Wayland application into the session on Fedora 34. Works as expected. Everything should now be tested: 

 ✓ FLTK has been upgraded.
 ✓ This does not seem to cause any regressions.
 ✓ The upgrade does not cause any new bugs.

Waiting for the build system to successfully build before closing this bug.
Comment 23 Pierre Ossman cendio 2022-01-10 08:28:11 CET
We've gotten a crash report from the beta. Unclear why, but FLTK seems to be the only non-system thing in the stack:

https://community.thinlinc.com/t/thinlinc-4-14-0-beta/288/2

> 23  tlclient                      	       0x1000c3cc3 fl_pie(int, int, int, int, double, double) + 51587
Comment 24 William Sjöblom cendio 2022-01-13 08:40:47 CET
Regarding comment 23, please not that the fl_pie symbol in the stack has offset of about 50k. Since FLTK seemingly has debug symbols, I would expect a smaller offset if the error was related to FLTK.
Comment 25 Pierre Ossman cendio 2022-01-13 16:18:21 CET
It seems most symbol are lost when stripped, so that's just the last remaining one found. If I use an unstripped binary I can see that this address is in fl_mac_flush_and_wait().

The next call on the stack is [NSApplication _nextEventMatchingEventMask:]. That gets called from do_queued_events(), which is a static function so it would seem it got inlined in to fl_mac_flush_and_wait().

Very odd place to crash though. That call includes no objects as parameters, so nothing obvious that could be invalid.

So either some general memory corruption, or the NSApp object itself is bogus.

There are some notes about blocks in the stack as well. But those should not involve any of our code as our compiler cannot compile blocks.
Comment 26 Pierre Ossman cendio 2022-01-19 12:18:23 CET
Created attachment 1019 [details]
crash dump

The issue seems to happen way more frequently on M1. But it does seem to be possible on x86 as well, as shown by this crash dump.
Comment 27 Samuel Mannehed cendio 2022-01-19 14:54:51 CET
Upstream issue:

https://github.com/fltk/fltk/issues/373
Comment 29 Samuel Mannehed cendio 2022-01-19 15:49:18 CET
This should be fixed now.

Out of 15 tries using the beta client it crashed 13 times.
Out of 15 tries using the fixed client, it worked every try.
Comment 30 Samuel Mannehed cendio 2022-01-19 15:50:19 CET
Note that the fix is in cenbuild.
Comment 31 Linn cendio 2022-01-24 09:31:31 CET
Reproduced the issue mentioned in comment 23 with client build 2318, and tested the fix with client build 2321. The platform was macOS 12, M1.

I did 10 test runs for each version, and as can be seen from the table, I had no crashes after the fix.

 M1:
 Build | Crash | No crash |
 ------+-------+----------|
 2318  |   5   |     5    |
 2321  |   0   |    10    |

Note that when reproducing the issue, it crashed more often when quickly moving the session window, and I didn't see any crashes when moving it slowly. This is probably why I had less frequent crashes than in comment 29.


I also tested on macOS 12, x86. I managed to reproduce the crash once, but as mentioned it seems to happen much less frequent.

 x86:
 Build | Crash | No crash |
 ------+-------+----------|
 2318  |   1   |     9    |
 2321  |   0   |    10    |


Also checked the code, and it seems to be a minimal change. Closing.

Note You need to log in before you can comment on or make changes to this bug.