To reproduce:

Install 4.8.1 and create a session, then upgrade ThinLinc to 4.9.0 and reconnect to the session. The reconnect fails and a new session is created instead.

# tail /var/log/vsmserver.log
2018-04-23 18:16:33 INFO vsmserver.license: Updating license data from disk to memory
2018-04-23 18:16:33 INFO vsmserver.license: License summary: 5 concurrent users. Hard limit of 6 concurrent users.
2018-04-23 18:16:33 INFO vsmserver.session: Loaded 1 sessions for 1 users from file
2018-04-23 18:16:56 INFO vsmserver.session: Session 127.0.0.1:1 for cendio has terminated. Removing.
2018-04-23 18:16:56 INFO vsmserver.session: User with uid 1000 (cendio) requested a new session
2018-04-23 18:16:57 INFO vsmserver: VSM Agent 127.0.0.1 successfully created a new session for cendio
2018-04-23 18:18:44 INFO vsmserver.session: Session 127.0.0.1:10 for cendio has terminated. Removing.
2018-04-23 18:18:44 INFO vsmserver.session: User with uid 1000 (cendio) requested a new session
2018-04-23 18:18:46 INFO vsmserver: VSM Agent 127.0.0.1 successfully created a new session for cendio
2018-04-23 18:18:46 WARNING vsmserver.session: Failed to get client ip for the session

# tail /var/log/vsmagent.log
2018-04-23 18:13:04 INFO vsmagent: My public hostname is 10.47.253.181
2018-04-23 18:13:57 INFO vsmagent.session: Verified connectivity to newly started Xvnc for cendio
2018-04-23 18:16:24 INFO vsmagent: Got SIGTERM, signaling process to quit
2018-04-23 18:16:24 INFO vsmagent: Terminating. Have a nice day!
2018-04-23 18:16:27 INFO vsmagent: VSM Agent version 4.9.0 build 5758 started
2018-04-23 18:16:27 INFO vsmagent: My public hostname is 10.47.253.181
2018-04-23 18:16:56 WARNING vsmagent.session: Broken session for user cendio, tl-session pid 6729 is not tl-session
2018-04-23 18:16:57 INFO vsmagent.session: Verified connectivity to newly started Xvnc for cendio
2018-04-23 18:18:44 WARNING vsmagent.sessions: Broken session for user cendio, tl-session process 26653 does not exist
2018-04-23 18:18:45 INFO vsmagent.session: Verified connectivity to newly started Xvnc for cendio
tl-session with pid 6729 is running:

# ps -ax | grep tl-session
 4841 ? S 0:00 tl-session: cendio
 6729 ? S 0:00 tl-session: cendio
# ls -la /proc/6729/exe
lrwxrwxrwx. 1 root root 0 Apr 23 18:19 /proc/6729/exe -> /opt/thinlinc/libexec/tl-session;5ade066b (deleted)
The code in handler_verifysessions.py does not handle the additional string ";5ade066b" which we have not seen before...
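For discussion, here is a minimal sketch of how such a link target could be normalized before comparing it with the expected binary. This is not the actual code in handler_verifysessions.py; the names `normalize_exe_target` and `exe_matches` and the regular expression are assumptions made up for illustration. It only assumes the two suffixes seen in this report: a ";" followed by eight lowercase hex digits, and the kernel's " (deleted)" marker.

```python
import re

# Hypothetical helper, NOT the real handler_verifysessions.py code.
# Matches an optional ";" + 8 hex digits (as seen in this report),
# followed by an optional " (deleted)" marker, at the end of the string.
_SUFFIX_RE = re.compile(r"(;[0-9a-f]{8})?( \(deleted\))?$")


def normalize_exe_target(target):
    """Return a /proc/<pid>/exe target with both suffixes stripped."""
    return _SUFFIX_RE.sub("", target, count=1)


def exe_matches(target, expected):
    """True if the link target refers to the expected binary path."""
    return normalize_exe_target(target) == expected
```

With this, `exe_matches("/opt/thinlinc/libexec/tl-session;5ade066b (deleted)", "/opt/thinlinc/libexec/tl-session")` is true, while a target naming a different file still fails the comparison.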
(In reply to comment #0)
> Reproduce;
>
> Installing 4.8.1 and create a session, upgrade ThinLinc to 4.9.0 an reconnect
> to the session and it will fail. Creating a new session.

> Red Hat Enterprise Linux Server release 7.5 (Maipo)
(In reply to comment #3)
> (In reply to comment #0)
> > Reproduce;
> >
> > Installing 4.8.1 and create a session, upgrade ThinLinc to 4.9.0 an reconnect
> > to the session and it will fail. Creating a new session.
> >
> > Red Hat Enterprise Linux Server release 7.5 (Maipo)

Same problem on RHEL 7.4 after upgrading 4.8.0 to 4.9.0.

> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.4 (Maipo)
> # readlink /proc/1942/exe
> /opt/thinlinc/libexec/tl-session;5adeeecd (deleted)
So that suffix seems to come from rpm:

lib/fsm.c:
    rasprintf(&tid, ";%08x", (unsigned)rpmtsGetTid(ts));

However this is used for the new file, not the existing one. So the sequence is:

1. Unpack tl-session;12345678
2. mv tl-session;12345678 tl-session

So I'm not sure how we ended up with a running process pointing to that file.
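To make the two-step sequence concrete, it could be sketched like this in Python. This is not rpm's code: `staged_install` and its arguments are made up for illustration, and only the ";%08x" suffix format is taken from the fsm.c line quoted in this comment.

```python
import os


def staged_install(data, dest, tid):
    """Write data under a temporary name, then rename it over dest.

    Illustrative only: mimics the ";%08x" naming from lib/fsm.c,
    not rpm's actual implementation.
    """
    staged = "%s;%08x" % (dest, tid & 0xFFFFFFFF)
    with open(staged, "wb") as f:
        # Step 1: unpack the new file under the temporary name.
        f.write(data)
    # Step 2: move it over the final name, replacing any existing file.
    os.rename(staged, dest)
    return staged
```

The temporary name only ever belongs to the new file, which is why a process started from the old file would not be expected to show it in /proc/pid/exe.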
Looks like a kernel bug in RHEL, present in at least 7.4 and on.

First:

> # cp /usr/bin/python /usr/bin/mypython
> # cp /usr/bin/python /usr/bin/mypython2
> # mypython

In another terminal:

> # pid=$(pidof mypython)
> # readlink /proc/$pid/exe
> /usr/bin/mypython
> # mv /usr/bin/mypython2 /usr/bin/mypython
> # readlink /proc/$pid/exe
> /usr/bin/mypython2 (deleted)
error:     ^^^^^^^^^

On Fedora 27, I get the expected:

> # readlink /proc/$pid/exe
> /usr/bin/mypython
> # mv /usr/bin/mypython2 /usr/bin/mypython
> # readlink /proc/$pid/exe
> /usr/bin/mypython (deleted)
(In reply to comment #6)
> Looks like a kernel bug in RHEL, present in at least 7.4 and on.

RHEL 7.0 is unaffected.

> # uname -r
> 3.10.0-123.6.3.el7.x86_64
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.0 (Maipo)
> # ./bug7161.sh
> /proc/10667/exe before moving:
> /tmp/tmp.OIgGHDSqqr/sleep
> /proc/10667/exe after moving:
> /tmp/tmp.OIgGHDSqqr/sleep (deleted)
(In reply to comment #7)
> > # ./bug7161.sh

FWIW; bug7161.sh:

> #!/bin/bash
>
> set -e
>
> DIR=$(mktemp -d)
> cp /usr/bin/sleep "${DIR}"/sleep
> cp /usr/bin/sleep "${DIR}"/sleep2
>
> "${DIR}"/sleep 10 &
> PROC=$!
> echo "/proc/${PROC}/exe before moving:"
> readlink /proc/${PROC}/exe
> mv "${DIR}"/sleep2 "${DIR}"/sleep
> echo "/proc/${PROC}/exe after moving:"
> readlink /proc/${PROC}/exe
> rm -rf "${DIR}"
(In reply to comment #7)
> (In reply to comment #6)
> > Looks like a kernel bug in RHEL, present in at least 7.4 and on.
>
> RHEL 7.0 is unaffected.

RHEL 7.2 is affected.

> # uname -r
> 3.10.0-327.13.1.el7.x86_64
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.2 (Maipo)
> # ./bug7161.sh
> /proc/10584/exe before moving:
> /tmp/tmp.vah05hF10P/sleep
> /proc/10584/exe after moving:
> /tmp/tmp.vah05hF10P/sleep2 (deleted)
RHEL 7.5 x86_64, thinlinc-vsm-4.9.0-5764.x86_64

Sessions are no longer discarded if their /proc/pid/exe link has ";01abcdef (deleted)" suffixes, as created by upgrading ThinLinc with RPM.

HOWEVER:

It's imperative that you restart the vsmagent service as quickly as possible after installing upgraded packages.

If you do not do this in a timely fashion, previously scheduled session verification tasks will run old code in vsmagent that will effectively make any running sessions unreachable if a user disconnects.

Take into account that the administrator _must_ run tl-setup, deal with the configuration changes that happened during the 4.9.0 cycle, wait for the SELinux module, CUPS, printers, etc. This can easily take a minute for an experienced ThinLinc Developer, so it's easy to imagine scenarios where this takes 2-5 minutes or more for Joe Schmoe, system administrator.

As for me, I went for lunch during tl-setup and came back to an upgraded system that had lost sessions.

I think these issues need to be at least discussed and understood before this bug is resolved.
(In reply to comment #14)
> RHEL 7.5 x86_64, thinlinc-vsm-4.9.0-5764.x86_64
>
> Sessions are no longer discarded if their /proc/pid/exe link has ";01abcdef
> (deleted)" suffixes, as created by upgrading ThinLinc with RPM.
>
> HOWEVER:
>
> It's imperative that you restart the vsmagent service as quickly as possible
> after installing upgraded packages.
>
> If you do not do this in a timely fashion, previously scheduled session
> verification tasks will run old code in vsmagent that will effectively make any
> running sessions unreachable if a user disconnects.
>
> Take into account that the administrator _must_ run tl-setup, deal with the
> configuration changes that happened during the 4.9.0 cycle, wait for the
> SELinux module, CUPS, printers, etc. This can easily take a minute for an
> experienced ThinLinc Developer, so it's easy to imagine scenarios where this
> takes 2-5 minutes or more for Joe Schmoe, system administrator.
>
> As for me, I went for lunch during tl-setup and came back to an upgraded system
> that had lost sessions.
>
> I think these issues needs to be at least discussed and understood before this
> bug is resolved.

--> bug 7163.
(In reply to comment #14)
> RHEL 7.5 x86_64, thinlinc-vsm-4.9.0-5764.x86_64
>
> Sessions are no longer discarded if their /proc/pid/exe link has ";01abcdef
> (deleted)" suffixes, as created by upgrading ThinLinc with RPM.

Also works on Debian 9 i386 with thinlinc-vsm-4.9.0-5764.
(In reply to comment #16)
> (In reply to comment #14)
> > RHEL 7.5 x86_64, thinlinc-vsm-4.9.0-5764.x86_64
> >
> > Sessions are no longer discarded if their /proc/pid/exe link has ";01abcdef
> > (deleted)" suffixes, as created by upgrading ThinLinc with RPM.
>
> Also works on Debian 9 i386 with thinlinc-vsm-4.9.0-5764.

To clarify: Debian 9 with the linux-image-4.9.0-6-686-pae kernel does not exhibit the triggering problem. The new code works just as well in this scenario.
(In reply to comment #15)
> (In reply to comment #14)
> > RHEL 7.5 x86_64, thinlinc-vsm-4.9.0-5764.x86_64
> >
> > Sessions are no longer discarded if their /proc/pid/exe link has ";01abcdef
> > (deleted)" suffixes, as created by upgrading ThinLinc with RPM.
> >
> > HOWEVER:
> >
> > It's imperative that you restart the vsmagent service as quickly as possible
> > after installing upgraded packages.
> >
> > If you do not do this in a timely fashion, previously scheduled session
> > verification tasks will run old code in vsmagent that will effectively make any
> > running sessions unreachable if a user disconnects.
> >
> > Take into account that the administrator _must_ run tl-setup, deal with the
> > configuration changes that happened during the 4.9.0 cycle, wait for the
> > SELinux module, CUPS, printers, etc. This can easily take a minute for an
> > experienced ThinLinc Developer, so it's easy to imagine scenarios where this
> > takes 2-5 minutes or more for Joe Schmoe, system administrator.
> >
> > As for me, I went for lunch during tl-setup and came back to an upgraded system
> > that had lost sessions.
> >
> > I think these issues needs to be at least discussed and understood before this
> > bug is resolved.
>
> --> bug 7163.

I'm suggesting that 7163 is reverted and replaced with this solution:

--- vsm/thinlinc-vsm.spec.in (revision 33232)
+++ vsm/thinlinc-vsm.spec.in (arbetskopia)
@@ -59,12 +59,9 @@
 rm -rf
 
 %pre
-# Stop services before upgrading
+# Workaround for https://bugzilla.redhat.com/show_bug.cgi?id=1571253
 if [ $1 -gt 1 ] ; then
-    # remember that these are services from the OLD package
-    # even if we have changed the services in the NEW package
-    /opt/thinlinc/libexec/service vsmagent stop
-    /opt/thinlinc/libexec/service vsmserver stop
+    rm -f /opt/thinlinc/libexec/tl-session
 fi
 # Save install time
 mkdir -p /opt/thinlinc/etc/.upgrade.stamp

I've verified that this solution works by using a modified version of "bug7161.sh".
A full ThinLinc upgrade test remains.
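The modified script is not attached here, so as a rough guess at an equivalent check, here is a Python sketch of the same experiment with an optional "remove the old file first" step, which is what the proposed %pre change does. The function name `exe_link_after_replace` and its flag are hypothetical. It assumes Linux with /proc mounted, a `sleep` binary in PATH, and a temporary directory that allows executing files.

```python
import os
import shutil
import subprocess
import tempfile


def exe_link_after_replace(unlink_first):
    """Replace a running binary and return its /proc/<pid>/exe target.

    Linux only. Mirrors bug7161.sh; with unlink_first=True the old file
    is removed before the move, like "rm -f" in the %pre scriptlet.
    """
    workdir = tempfile.mkdtemp()
    old = os.path.join(workdir, "sleep")
    new = os.path.join(workdir, "sleep2")
    sleep_bin = shutil.which("sleep")
    shutil.copy(sleep_bin, old)  # copy keeps the executable bit
    shutil.copy(sleep_bin, new)
    proc = subprocess.Popen([old, "10"])
    try:
        if unlink_first:
            os.unlink(old)
        os.rename(new, old)
        return os.readlink("/proc/%d/exe" % proc.pid)
    finally:
        proc.kill()
        proc.wait()
        shutil.rmtree(workdir)
```

Calling `exe_link_after_replace(False)` corresponds to the original bug7161.sh, and `exe_link_after_replace(True)` to the workaround. On an affected kernel only the first call should report the "sleep2 (deleted)" name.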
Alternative approach committed now. Need to do final (re-)testing, but then we should be done.
Can't see any issues when testing build 5770 on Ubuntu 16.04. Works well and can reconnect to sessions started before an upgrade.
Could reproduce the issue when upgrading from 4.8.0 to 4.8.1 on RHEL7:

2018-05-03 14:34:04 WARNING vsmagent.session: Broken session for user cendio, tl-session pid 19240 is not tl-session

And when upgrading from 4.8.1 to build 5770 I can successfully reconnect to a session started prior to the upgrade. Looks good.