Bug 7692 - tl-setup requires Python 2
Summary: tl-setup requires Python 2
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Misc (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: 4.13.0
Assignee: Frida Flodin
URL:
Keywords: linma_tester, prosaic, wilsj_tester
Depends on: 7508 7710
Blocks: 4586
  Show dependency treegraph
 
Reported: 2021-04-21 10:21 CEST by Frida Flodin
Modified: 2022-06-14 10:42 CEST (History)
4 users (show)

See Also:
Acceptance Criteria:


Attachments

Description Frida Flodin cendio 2021-04-21 10:21:10 CEST
It needs to be converted to work with Python 3.
Comment 157 Pierre Ossman cendio 2021-05-25 10:28:45 CEST
Tested a text install on a mimimal RHEL 8:

 ✓ sudo
 ✓ EULA
   ✓ Refusing exits
   ✓ Yes continues
 ✓ Just "y" as answer
 ✓ Empty answer
 ✓ dnf
   ✓ Basic requirements
     (installed nss, nspr, ghostscript, postfix, xauth and libX11)
   ✓ PyGObject and GTK+
   ✓ python-ldap not found (correct)
   ✓ SELinux development
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Password set correctly
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ SELinux
   ✓ Devel installed (see above)
   ✓ Module built and installed
 ✓ Services enabled and started

Log file is consistent with above (got bug 7714 though).

Restarted tl-setup (still in text mode) after installing and enabling cups:

 ✓ Found everything installed in previous run
 ✓ Skipped install of python-ldap on "n"
 ✓ Admin email
   ✓ Showed previous value
 ✓ tlwebadm
   ✓ Showed previous value (as *:s)
 ✓ printer
   ✓ configured nearest
   ✓ configured thinlocal
Comment 158 Pierre Ossman cendio 2021-05-25 11:14:02 CEST
Tested again on minimal RHEL 8, but with GUI (so pygobject, gtk3 and xauth installed):

 ✓ sudo
 ✓ EULA
   ✓ Check box enables/disables forward
 ✓ dnf
   ✓ Basic requirements
     (installed nss, nspr, ghostscript, postfix)
   ✓ python-ldap not found (correct)
   ✓ Skipped install of python-ldap when unchecked
   ✓ SELinux development
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Password set correctly
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ SELinux
   ✓ Devel installed (see above)
   ✓ Module built and installed
 ✓ Services enabled and started

Log file is consistent with above (still getting bug 7714).

Restarted tl-setup (still in text mode) after installing and enabling cups:

 ✓ Found everything installed in previous run
 ✓ Admin email
   ✓ Showed previous value
 ✓ tlwebadm
   ✓ Showed previous value (as *:s)
 ✓ printer
   ✓ configured nearest
   ✓ configured thinlocal
Comment 159 Frida Flodin cendio 2021-05-25 13:13:47 CEST
Tested migrate configuration on Ubuntu 18.04 and RHEL 7. Tested upgrade from tl-4.9.0 server where the following parameters was changed:

> vsmserver/terminalservers=127.0.0.1 1.2.3.4
> vsmserver/allowed_shadowers=root cendio
> vsmserver/explicit_agentselection=+agentoneusers:agentone cendio:agenttwo root:agentone

Everything works fine. Tested:
 ✓ parameters
 ✓ old
 ✓ ignore
Comment 161 Frida Flodin cendio 2021-05-25 14:22:48 CEST
To clarify, comment #159 was tested in text mode. Now I have also tested GUI-mode on RHEL 7.

Everything works fine. Tested:
 ✓ parameters
 ✓ old
 ✓ ignore
Comment 162 Frida Flodin cendio 2021-05-25 16:13:29 CEST
Tested yum package handling on RHEL 7 with build 2097. Tested both text and GUI mode.

 ✓ Installing packages and dependencies
 ✓ Not installing when user answers no
 ✓ Installation progress is shown in the log* 


* and for GUI as a progress bar with text.
Comment 163 Samuel Mannehed cendio 2021-05-26 16:13:01 CEST
Tested a text install with last night's build on a Ubuntu 20.04:

 ✓ sudo
 ✓ EULA
   ✓ Refusing exits
   ✓ Yes continues
 ✓ Just "y" as answer
 ✓ Empty answer
 ✓ dnf
   ✓ Basic requirements
     (installed citadel-server, libgcc1 libsieve2, libcitadel4)
   ✓ PyGObject and GTK+ already found
   ✓ python-ldap not found (correct)
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Password set correctly
 ✓ printers
   ✓ Skipped by answering 'no'
 ✓ ufw
   ✓ Skipped since not enabled
 ✓ apparmor
   ✓ configured successfully
 ✓ Services enabled and started

Log file is consistent with above.

Restarted tl-setup (still in text mode) after enabling ufw (sudo ufw enable):

 ✓ Found everything installed in previous run
 ✓ Skipped install of python-ldap on "n"
 ✓ Admin email
   ✓ Showed previous value
 ✓ tlwebadm
   ✓ Showed previous value (as *:s)
 ✓ printer - answered 'Yes' on both
   ✓ configured nearest
   ✓ configured thinlocal
 ✓ ufw
   ✓ configured successfully

However, I got a resource warning on the second run after skipping python-ldap:

> /usr/lib/python3.8/codecs.py:268: ResourceWarning: unclosed file <_io.FileIO name=5 mode='wb' closefd=True>
>   self.errors = errors
> Object allocated at (most recent call last):
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/__init__.py", lineno 33
>     OOO0o000 = globals ( ) [ IIi1i111IiII ] . Backend ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/aptbackend.py", lineno 88
>     self . _aptcallback = i1Ii1i ( self )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/aptbackend.py", lineno 31
>     super ( ) . __init__ ( )
>   File "/usr/lib/python3/dist-packages/apt/progress/base.py", lineno 169
>     self.write_stream = os.fdopen(self.writefd, "w")  # type: io.TextIOBase
>   File "/usr/lib/python3.8/os.py", lineno 1023
>     return io.open(fd, *args, **kwargs)
> /usr/lib/python3.8/codecs.py:268: ResourceWarning: unclosed file <_io.FileIO name=4 mode='rb' closefd=True>
>   self.errors = errors
> Object allocated at (most recent call last):
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/__init__.py", lineno 33
>     OOO0o000 = globals ( ) [ IIi1i111IiII ] . Backend ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/aptbackend.py", lineno 88
>     self . _aptcallback = i1Ii1i ( self )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/aptbackend.py", lineno 31
>     super ( ) . __init__ ( )
>   File "/usr/lib/python3/dist-packages/apt/progress/base.py", lineno 170
>     self.status_stream = os.fdopen(self.statusfd, "r")  # type: io.TextIOBase # noqa
>   File "/usr/lib/python3.8/os.py", lineno 1023
>     return io.open(fd, *args, **kwargs)

It seems to stem from "/usr/lib/python3/dist-packages/apt/progress/base.py", so it's unclear what we can do about that.
Comment 164 Frida Flodin cendio 2021-05-27 14:22:25 CEST
Tested text and GUI install with build 2101 on SLES 12:

 ✓ sudo
 ✓ EULA
   ✓ Refusing exits (In GUI check box enables/disables forward)
   ✓ Yes continues
 ✓ Just "y" as answer
 ✓ Empty answer
 ✓ Server type
   ✓ Choosing master -> vsmagent + vsmserver is started
   ✓ Choosing agent -> vsmagent is started
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Tlwebadm is not started if password left empty
   ✓ Rerunning and password set correctly
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewall
   ✓ Correctly skipped when no firewall configured
 ✓ SELinux
   ✓ Correctly skipped when no SELinux on system.
 ✓ Services enabled and started

Log file is consistent with above.

Restarted tl-setup after installing and enabling cups and SuSEfirewall2:

 ✓ Found everything installed in previous run
 ✓ Admin email
   ✓ Showed previous value
 ✓ tlwebadm
   ✓ Showed previous value (as *:s)
 ✓ printer
   ✓ configured nearest
   ✓ configured thinlocal
 ✓ firewall (SuSEfirewall2)
   ✓ configured correctly
Comment 168 Frida Flodin cendio 2021-05-28 13:52:18 CEST
Tested some special cases on SLES 12 with build 2103. Tested in both text and GUI.

  ✓ Old tlwebadm password saved as SHA1 -> enter new password
  ✓ Choosing agent as server type
    ✓ No email needed
    ✓ Extra info in the summary
    ✓ vsmserver not started
  ✓ sshd not running
    ✓ ask to install openssh
Comment 172 Frida Flodin cendio 2021-05-28 16:17:10 CEST
Tested some more special cases on SLES 12. Only tested in text-mode since it should be the exact same code that handles this:

  ✓ Configure PAM
    ✓ Create symlink to /etc/pam.d/sshd
    ✓ Already configured, skip
    ✓ Could not find suitable pam config file
    ✓ No /etc/pam.d found
  ✓ restart tlwebaccess
    ✓ upgrade from tl-4.11.0 (Manually terminating old SysV service)
Comment 173 Pierre Ossman cendio 2021-05-28 16:40:34 CEST
Tested a text install on a mimimal Fedora 34:

 ✓ sudo
 ✓ EULA
 ✓ dnf
   ✓ Basic requirements
     (installed nss, nspr, ghostscript, esmtp and xauth)
   ✓ PyGObject and GTK+
   ✓ python3-ldap for python-ldap (unintended)
   ✓ SELinux development
 ✓ Admin email
 ✓ tlwebadm, set password
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ SELinux
   ✓ Devel installed (see above)
   ✓ Module built and installed
 ✓ Services enabled and started

Log file is consistent with above.

Also tested with a GUI install (installed python3-gobject, gtk3 and xauth:

 ✓ sudo
 ✓ EULA
 ✓ dnf
   ✓ Basic requirements
     (installed nss, nspr, ghostscript, esmtp)
   ✓ python3-ldap for python-ldap (unintended)
   ✓ SELinux development
 ✓ Admin email
 ✓ tlwebadm, set password
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ SELinux
   ✓ Devel installed (see above)
   ✓ Module built and installed
 ✓ Services enabled and started
Comment 174 Pierre Ossman cendio 2021-05-28 16:42:56 CEST
The setup did lock up with 100% CPU during GUI installation of required packages at one point. Unfortunately it was not something I was able to reproduce so details are currently unknown. The log showed that the installation proceeded and completed despite the locked UI.
Comment 178 Pierre Ossman cendio 2021-06-01 09:14:29 CEST
Tested an automated install on RHEL 8. Minimal install with CUPS added.

 ✓ sudo
 ✓ EULA
 ✓ dnf
   ✓ Basic requirements
     (installed postfix and xauth)
   ✓ PyGObject and GTK+
   ✓ python-ldap not found (correct)
   ✓ SELinux development
 ✓ Admin email
 ✓ tlwebadm password
 ✓ printers
   ✓ configured nearest
   ✓ configured thinlocal
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ SELinux
   ✓ Devel installed (see above)
   ✓ Module built and installed
 ✓ Services enabled and started

Log file is consistent with above

Re-ran it again with the same answer file and didn't see any issues there either.
Comment 180 Pierre Ossman cendio 2021-06-01 09:19:25 CEST
Tested handling of missing answers in the answers file:

 ✓ ask
 ✓ abort
 ✓ none (defaults to ask)
Comment 181 Samuel Mannehed cendio 2021-06-01 14:52:01 CEST
Tested again on Ubuntu 20.04, but with GUI (had to install python-gobject):

 ✓ sudo
 ✓ EULA
   ✓ Check box enables/disables forward
 ✓ apt
   ✓ Basic requirements
     (installed libgcc1, postfix*)
   ✓ python-ldap not found (correct)
   ✓ Skipped install of python-ldap when unchecked
 ✓ Selected "Agent"
 ✓ Did not have to specify admin email
 ✓ tlwebadm password skipped
 ✓ printers
   ✓ configured nearest
   ✓ configured thinlocal
 ✓ firewall page didn't show since ufw wasn't enabled
 ✓ apparmor configured
 ✓ Services enabled and started

Log file is consistent with above, and only the vsmagent and webaccess services were started. Bug 7715 is quite noticeable in this scenario.

* I encountered and fixed bug 7716.

Restarted tl-setup (still in GUI mode) and made a few different choices:

 ✓ Found everything installed in previous run
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Password set correctly
 ✓ printers skipped
 ✓ apparmor skipped

Restarted tl-setup 3rd time in GUI mode after enabling ufw:
 ✓ Admin email
   ✓ Showed previous value
 ✓ tlwebadm
   ✓ Showed previous value (as *:s)
 ✓ ufw
   ✓ All ports correctly setup

All looks fine.
Comment 182 Niko Lehto cendio 2021-06-01 15:54:37 CEST
Tested on minimal Ubuntu 18.04 with CUPS installed. I tested this in both text and GUI mode using server build 2108:

✓ sudo
✓ EULA
✓ dnf
 ✓ Basic requirements
 ✓ PyGObject and GTK+
 ✓ python-ldap
 ✓ SELinux
✓ Admin email
✓ tlwebadm password
✓ printers
 ✓ configured nearest
 ✓ configured thinlocal
✓ firewalld
 ✓ All ports correctly setup
✓ AppArmor
✓ Services enabled and started

Also tested unsupperviced tl-setup with an answer file.

All looks good.
Comment 183 Pierre Ossman cendio 2021-06-01 16:33:11 CEST
NFS detection was broken (bug 7718). Retested RHEL 8 now that it works and NFS installation works in both GUI and text mode.
Comment 184 Samuel Mannehed cendio 2021-06-01 17:12:56 CEST
Also tested on Ubuntu 21.04 with GUI:

 ✓ sudo
 ✓ EULA
   ✓ Check box enables/disables forward
 ✓ Selected "Master"
 ✓ apt
   ✓ Basic requirements
     (installed ghostscript, postfix, ssl-cert)
   ✓ python-ldap not found (correct)
   ✓ Skipped install of python-ldap when unchecked
 ✓ Admin email
   ✓ No address fails
   ✓ No @ fails
   ✓ Correct address continues
 ✓ tlwebadm
   ✓ Password set correctly
 ✓ printers couldn't be configured since cups wasn't installed
 ✓ firewall page didn't show since ufw wasn't enabled
 ✓ apparmor configured
 ✓ Services enabled and started

Restarted tl-setup after installing cups and enabling ufw:

 ✓ printers
   ✓ configured nearest
   ✓ configured thinlocal
 ✓ ufw
   ✓ All ports correctly setup

Everything worked well.
Comment 185 Pierre Ossman cendio 2021-06-02 09:27:57 CEST
Tested tl-setup without an installer backend on RHEL 8 (hacked dnfhelper to report no support):

 ✓ Basic requirements
 ✓ NFS support
 ✓ GTK/PyGObject
 ✓ Python LDAP
 ✓ SELinux

Tested both GUI and text mode. Found bug 7720 though.
Comment 187 Frida Flodin cendio 2021-06-03 08:51:54 CEST
Found an issue with locking and logging in zypperbackend when testing on SLES 12. See output from text-mode below. This is printed to stdout, not the log file.

The issue arised after we started using files and processes as context managers in zypperbackend. The issue is probably not new but Python 3 is nice to print it out for us? I could reproduce this every time I ran tl-setup if I turned off sshd. This will trigger tl-setup to try installing openssh even if its installed already.


> Automatically install the necessary packages [Yes/no]?
> 
> Resolving packages and dependencies... --- Logging error ---
> Traceback (most recent call last):
>   File "/usr/lib64/python3.4/logging/__init__.py", line 982, in emit
>     self.flush()
>   File "/usr/lib64/python3.4/logging/__init__.py", line 962, in flush
>     self.stream.flush()
> RuntimeError: reentrant call inside <_io.BufferedWriter name='/var/log/tlsetup.log'>
> Call stack:
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 309, in <module>
>     i1i1iiII ( )
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 269, in i1i1iiII
>     O0ooOO0O0O0O = ooOo00oOo0Ooo . run ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 109, in run
>     return self . _run_text ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 127, in _run_text
>     Oo0 ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/python.py", line 59, in <lambda>
>     oO0 )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 74, in generic_text_installer
>     oo0 = ii1Ii1I . run ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 435, in run
>     ooO0000Ooo0O = oOoO00 ( self . __backend )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 121, in oOoO00
>     ooO0000Ooo0O = backend . resolve ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 462, in resolve
>     self . lock ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 381, in lock
>     if self . _try_lock ( ) :
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 363, in _try_lock
>     logging . debug ( "Creating zypper lock..." )
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1849, in debug
>     root.debug(msg, *args, **kwargs)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1267, in debug
>     self._log(DEBUG, msg, args, **kwargs)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1414, in _log
>     self.handle(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1424, in handle
>     self.callHandlers(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1486, in callHandlers
>     hdlr.handle(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 853, in handle
>     self.emit(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1046, in emit
>     StreamHandler.emit(self, record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 982, in emit
>     self.flush()
>   File "/usr/lib64/python3.4/logging/__init__.py", line 962, in flush
>     self.stream.flush()
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 297, in __del__
>     self . unlock ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 393, in unlock
>     logging . debug ( "Unlocking zypper..." )
> Message: 'Unlocking zypper...'
> Arguments: ()
> --- Logging error ---
> Traceback (most recent call last):
>   File "/usr/lib64/python3.4/logging/__init__.py", line 982, in emit
>     self.flush()
>   File "/usr/lib64/python3.4/logging/__init__.py", line 962, in flush
>     self.stream.flush()
> RuntimeError: reentrant call inside <_io.BufferedWriter name='/var/log/tlsetup.log'>
> Call stack:
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 309, in <module>
>     i1i1iiII ( )
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 269, in i1i1iiII
>     O0ooOO0O0O0O = ooOo00oOo0Ooo . run ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 109, in run
>     return self . _run_text ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 127, in _run_text
>     Oo0 ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/python.py", line 59, in <lambda>
>     oO0 )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 74, in generic_text_installer
>     oo0 = ii1Ii1I . run ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 435, in run
>     ooO0000Ooo0O = oOoO00 ( self . __backend )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 121, in oOoO00
>     ooO0000Ooo0O = backend . resolve ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 462, in resolve
>     self . lock ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 381, in lock
>     if self . _try_lock ( ) :
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 363, in _try_lock
>     logging . debug ( "Creating zypper lock..." )
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1849, in debug
>     root.debug(msg, *args, **kwargs)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1267, in debug
>     self._log(DEBUG, msg, args, **kwargs)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1414, in _log
>     self.handle(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1424, in handle
>     self.callHandlers(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1486, in callHandlers
>     hdlr.handle(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 853, in handle
>     self.emit(record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 1046, in emit
>     StreamHandler.emit(self, record)
>   File "/usr/lib64/python3.4/logging/__init__.py", line 982, in emit
>     self.flush()
>   File "/usr/lib64/python3.4/logging/__init__.py", line 962, in flush
>     self.stream.flush()
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 297, in __del__
>     self . unlock ( )
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/zypperbackend.py", line 407, in unlock
>     logging . debug ( "Removed zypper lock" )
> Message: 'Removed zypper lock'
> Arguments: ()
> done.
> 
> All packages needed seem to already be installed.
> 
> Administrator Contact
> =====================
Comment 188 Pierre Ossman cendio 2021-06-03 08:56:48 CEST
The locking for the package backend is a bit unclear. It is there to prevent external changes between showing the user what we intend to do, and actually doing it. However the current design is partly implicit and partly explicit, which is just confusing.

Let's see if we can clean this up in a way that we can avoid using __del__() and hence the above problem.
Comment 189 Pierre Ossman cendio 2021-06-04 13:10:38 CEST
It seems we can also simplify the locking a bit in the apt backend. Upstream cleaned out their locking handling in 2018 so that you don't have to release the lock when doing the installation (creating a race). They also backported the fixes to older distributions such as Ubuntu 16.04.
Comment 195 Frida Flodin cendio 2021-06-04 16:11:23 CEST
Tested that the correct openssh package is installed if sshd is not running on:

 ✓ SLES 15 
 ✓ Fedora 34
 ✓ Ubuntu 20.10
 ✓ RHEL 8
 ✓ Debian 10
Comment 203 Pierre Ossman cendio 2021-06-07 10:53:51 CEST
Locking in the package handling has now been cleaned up and should work better. Tested on SLES 12 and Ubuntu 20.04 and didn't see any issues.
Comment 204 Samuel Mannehed cendio 2021-06-07 13:28:01 CEST
I have tested the NFS module on SLES 12 and SLES 15:

 ✓ The nfs-client package was installed
 ✓ The NFS module showed properly with missing packages in GUI mode
 ✓ The NFS module showed properly with missing packages in text mode
 ✓ The NFS module did not show once the packages were available
 ✓ Local drives work

In combination with the testing done on bug 7718, the NFS module is now fully tested.
Comment 206 Frida Flodin cendio 2021-06-07 14:36:04 CEST
Made sure that we look for the right package when installing the basic requirements on Ubuntu and Debian. (with rpm-based package handling only the library name is needed, so no need to check there.)

Checked every library and binary that we need on Ubuntu 20.10 and Debian 10.
Comment 207 Niko Lehto cendio 2021-06-08 14:19:50 CEST
Tested the following on jenkins build 2119 on SLES15:

✓ sudo
✓ EULA
 ✓ zypper
   ✓ Basic requirements
   ✓ PyGObject and GTK+
   - python-ldap (Will be fixed at later point of conversion)
   ✓ SELinux development
 ✓ Admin email
 ✓ tlwebadm, set password
 ✓ printers
   ✓ Skipped when cups is not installed
 ✓ firewalld
   ✓ All ports correctly setup
 ✓ AppArmor
 ✓ Services enabled and started

Also tested that we locked zypper during our various installation steps.
Comment 209 Pierre Ossman cendio 2021-06-08 14:39:48 CEST
Checked that /etc/pam.d/thinlinc is set up correctly on:

 ✓ RHEL 8 (-> sshd)
 ✓ SLES 12 (-> sshd)
 ✓ Ubuntu 20.04 (-> sshd)
Comment 211 Pierre Ossman cendio 2021-06-08 14:55:09 CEST
Checked disabled firewall on RHEL 8:

 ✓ No firewall step if firewalld is disabled
Comment 212 Pierre Ossman cendio 2021-06-08 15:54:36 CEST
Rechecked firewall setup after changing which() calls:

 ✓ RHEL 8 (firewalld)
 ✓ SLES 12 (Susefirewall)
 ✓ Ubuntu 20.04 (ufw)
Comment 213 Niko Lehto cendio 2021-06-08 16:11:53 CEST
Tested that we configured AppArmor on SLES 12 with jenkins build 2119.
Comment 214 William Sjöblom cendio 2021-06-08 16:16:15 CEST
Checked that /etc/pam.d/thinlinc is set up correctly on:

 ✓ RHEL 7 (-> sshd)
Comment 216 Pierre Ossman cendio 2021-06-08 16:28:56 CEST
 ✓ Installation of PyGObject and GTK+ on Ubuntu 20.04

(wasn't tested earlier)
Comment 217 William Sjöblom cendio 2021-06-08 17:05:05 CEST
Checked that /etc/pam.d/thinlinc is set up correctly on:

 ✓ SLES15 (-> sshd)
 ✓ U18 (-> sshd)
Comment 218 Niko Lehto cendio 2021-06-09 13:51:09 CEST
Tested lokkit on RHEL 7 with build 2122. Looks good.
Comment 219 Pierre Ossman cendio 2021-06-09 14:55:02 CEST
Everything should be converted and tested now.
Comment 220 Pierre Ossman cendio 2021-06-09 17:17:25 CEST
Unfortunately they changed how exceptions are handled on threads from Python 3.8 and forward.

Previously exceptions on threads would call the normal sys.excepthook(). However in 3.8 they introduced sys.unraisablehook() instead:

https://bugs.python.org/issue36829

And in the same release they changed to calling that instead for threads:

https://bugs.python.org/issue37076



For us this means that crashes on the threads will not be caught by our except hook, which means the won't be logged, we won't show a dialog that there is a problem, and the UI is likely frozen with no way for the user to proceed or even exit cleanly.
Comment 224 William Sjöblom cendio 2021-06-10 16:39:27 CEST
I found another issue related to this bug.

On Ubuntu distributions missing the `/etc/os-release` file we crash with a stack trace when running tl-setup. I don't know if there are Ubuntus without this file. Nonetheless, our current workaround for these potentially hypothetical systems does not work after the conversion to Python 3.
Comment 227 Samuel Mannehed cendio 2021-06-10 17:41:42 CEST
The issue with thread exceptions in comment 220 has now been fixed. Handling of sys.unraisablehook() has been added.
Comment 237 Samuel Mannehed cendio 2021-06-11 14:11:22 CEST
(In reply to Samuel Mannehed from comment #227)
> The issue with thread exceptions in comment 220 has now been fixed. Handling
> of sys.unraisablehook() has been added.

The issue was only half-way fixed, the same code exists in tlinstaller and that needs fixing too.

Furthermore, vermin complains about unraisablehook, and removing that complaint isn't possible with the current version of vermin in our build system. We'll have to upgrade it.
Comment 241 Samuel Mannehed cendio 2021-06-11 14:38:09 CEST
The issues mentioned in comment 227 has now been addressed.
Comment 242 William Sjöblom cendio 2021-06-14 14:16:58 CEST
Another issue has been found:

When running tl-setup on Ubuntu Server 16.04 (running Python 3.5) with `LANG` and `LC_ALL` set to `POSIX` we are given the following error message

> Internal error
> The ThinLinc Server Setup Wizard will now terminate.

along with the following stack trace in /var/log/tlsetup.log

>   File "/opt/thinlinc/libexec/tl-setup.py", line 318, in <module>
>     i1i1iiII ( )
>   File "/opt/thinlinc/libexec/tl-setup.py", line 274, in i1i1iiII
>     Iii = I1i1I . run ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 99, in run
>     return self . _run_text ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 117, in _run_text
>     Oo0O0o0oO000 ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/eula.py", line 36, in oOoo0
>     wizard . text_print_wrapped ( i1Ii1i ( ) )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/eula.py", line 25, in i1Ii1i
>     IIiI = Oo000ooO0Oooo . read ( )
>   File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17218: ordinal not in range(128)

This seems to be related to loading the EULA. Since we are the authors of this file we already know its encoding and should thus not open it using the default character encoding.
Comment 243 Niko Lehto cendio 2021-06-14 14:58:36 CEST
(In reply to Samuel Mannehed from comment #241)
> The issues mentioned in comment 227 has now been addressed.

Tested this solution on a Fedora 33 and the changes fixes the issue. Looks good!
Comment 244 William Sjöblom cendio 2021-06-14 16:58:59 CEST
(In reply to William Sjöblom from comment #242)
> When running tl-setup on Ubuntu Server 16.04 (running Python 3.5) with
> `LANG` and `LC_ALL` set to `POSIX` we are given the following error message

The issue can also be reproduced on Fedora 33 (with python 3.5 installed) using:
> LANG=POSIX python3.5 /opt/thinlinc/libexec/tl-setup.py
Comment 246 William Sjöblom cendio 2021-06-15 10:41:22 CEST
The POSIX locale issue mentioned in comment 242 has now been fixed.
Comment 247 Frida Flodin cendio 2021-06-15 12:16:12 CEST
(In reply to William Sjöblom from comment #246)
> The POSIX locale issue mentioned in comment 242 has now been fixed.

I can verify this on SLES 12. I could reproduce the issue and with the fix it works just fine.
Comment 251 Niko Lehto cendio 2021-06-15 14:13:31 CEST
Testing of systemtype

[✓] OS_RELEASE (Tested multiple times during most of our testing)

## Fallback 1 ##
setup:
- Remove os-release

[✓] RHEL 8 (/etc/system-release-cpe)
  [✓] Distribution
  [✓] Family
  [✓] Version
[ ] Ubuntu (/etc/lsb-release)
    [ ] Distribution
    [ ]	Family
    [ ]	Version

## Fallback 2 ##
setup:
- Remove os-release
- Remove system-release-cpe

[ ] Fedora (/etc/fedora-release)
    [ ]	Distribution
    [ ]	Family
    [ ]	Version
[✓] RHEL 8 (/etc/redhat-release)
  [✓] Distribution
  [✓] Family
  [✓] Version
[ ] SUSE (/etc/SuSE-release) (Must be tested on < SELS15, maybe SLES12?)
  [ ] Distribution
  [ ] Family
  [ ] Version

## Fallback 3 ## (/etc/debian_version)
setup:
- Remove os-release
- Remove lsb-release

[✓] Debian (Ubuntu 18.04)

### Additional testing ###
[ ] 32-bit system
  [ ] correct packages installed
[✓] 64-bit system Comment #157
  [✓] correct packages installed

[✓] Detektion between dpkg/rpm
  [✓] dpkg Comment #181
  [✓] rpm Comment #157

[ ] If both dpkg and rpm exists on a system, use the one most likely in use (installed Alien)
  [ ] rpm unused
  [ ] dpkg unused
Comment 254 William Sjöblom cendio 2021-06-15 14:36:42 CEST
(In reply to Niko Lehto from comment #251)
> Testing of systemtype
> ...
> ### Additional testing ###
> [✓] 32-bit system
>   [✓] correct packages installed

These are already tested (see bug 7702, comment 56 and bug 7702, comment 61) and the detection of architecture works as expected.
Comment 255 Frida Flodin cendio 2021-06-15 14:37:04 CEST
Tested systemcheck on SLES 12. I removed /etc/os-release first. 
And it works fine:

(In reply to Niko Lehto from comment #251)
> Testing of systemtype
> ## Fallback 2 ##
> setup:
> - Remove os-release
> - Remove system-release-cpe
> 
> [✓] SUSE (/etc/SuSE-release) (Must be tested on < SELS15, maybe SLES12?)
>   [✓] Distribution
>   [✓] Family
>   [✓] Version

Got the expected in the log: 

> 2021-06-15 14:25:57,769: *** tl-setup started ***
> 2021-06-15 14:25:57,770: Architecture: x86_64
> 2021-06-15 14:25:57,770: Distribution: SLES
> 2021-06-15 14:25:57,770: Distribution version: 12
> 2021-06-15 14:25:57,770: Distribution family: SLE
> 2021-06-15 14:25:57,770: Package format: rpm
Comment 256 Samuel Mannehed cendio 2021-06-15 16:19:38 CEST
Distribution detection using lsb_release has now been fixed. Investigations show that it outputs data using the system's locale.

Looking at the man-pages for "/etc/os-release" shows that the fields we look for are strictly defined to only contain 0-9, a-z, ".", "_" and "-".

As far as "/etc/system-release-cpe" goes I gather that it only accepts ASCII with a few exceptions:

https://csrc.nist.gov/schema/cpe/2.3/cpe-dictionary-extension_2.3.xsd

The rest of the fallbacks sadly don't seem to have any documentation or specifications that can be found. SuSE refers to "/etc/SuSE-release" as the legacy way of detecting this type of information.

Let's assume that using the system's locale is fine for the rest of these.
Comment 257 William Sjöblom cendio 2021-06-15 16:28:35 CEST
Tested systemcheck on Ubuntu Server 20.04 with `/etc/os-release` removed and the logs look as expected:
> 2021-06-15 14:19:49,486: *** tl-setup started ***
> 2021-06-15 14:19:49,487: Architecture: x86_64
> 2021-06-15 14:19:49,487: Distribution: Ubuntu
> 2021-06-15 14:19:49,487: Distribution version: 20.04
> 2021-06-15 14:19:49,487: Distribution family: Ubuntu
> 2021-06-15 14:19:49,488: Package format: deb

Still, we seem to be leaving stdout of the subprocess open as we get the following ResourceWarning in the terminal:
 
> cendio@ubuntu2004:~/tl-4.12.1post-server$ python3 -Xdev -Xtracemalloc=5 /opt/thinlinc/libexec/tl-setup.py
> /opt/thinlinc/modules/thinlinc/systemtype.py:240: ResourceWarning: unclosed file <_io.TextIOWrapper name=3 encoding='UTF-8'>
>   i1II1 = OOO ( )
> Object allocated at (most recent call last):
>   File "/opt/thinlinc/modules/thinlinc/systemtype.py", lineno 320
>     OOO00OOo0oO0O ( )
>   File "/opt/thinlinc/modules/thinlinc/systemtype.py", lineno 318
>     IiII11i11II ( )
>   File "/opt/thinlinc/modules/thinlinc/systemtype.py", lineno 240
>     i1II1 = OOO ( )
>   File "/opt/thinlinc/modules/thinlinc/systemtype.py", lineno 106
>     OooOoo0OO0OO0 = subprocess . Popen ( [ "lsb_release" , "--id" , "--release" ] ,
>   File "/usr/lib/python3.8/subprocess.py", lineno 846
>     self.stdout = io.TextIOWrapper(self.stdout,
Comment 260 Linn cendio 2021-06-16 08:49:26 CEST
Tested build 2144 on Fedora 33 /etc/os-release and /etc/system-release-cpe removed. The logs looks like expected.

[✓] Fedora (/etc/fedora-release)
    [✓]	Distribution
    [✓]	Family
    [✓]	Version

* tlsetup.log:
2021-06-16 08:38:37,488: *** tl-setup started ***
2021-06-16 08:38:37,489: Architecture: x86_64
2021-06-16 08:38:37,489: Distribution: Fedora
2021-06-16 08:38:37,489: Distribution version: 33
2021-06-16 08:38:37,489: Distribution family: Fedora
2021-06-16 08:38:37,489: Package format: rpm
Comment 261 Samuel Mannehed cendio 2021-06-16 08:55:44 CEST
(In reply to Niko Lehto from comment #251)
> [ ] If both dpkg and rpm exists on a system, use the one most likely in use
> (installed Alien)
>   [ ] rpm unused
>   [ ] dpkg unused

This can't be tested on RPM systems with a supposedly unused DPKG. See bug 7726
Comment 262 Samuel Mannehed cendio 2021-06-16 08:56:47 CEST
(In reply to William Sjöblom from comment #257)
> Still, we seem to be leaving stdout of the subprocess open as we get the
> following ResourceWarning in the terminal:
> ...

This has been fixed now.
Comment 264 Niko Lehto cendio 2021-06-16 10:04:17 CEST
(In reply to Niko Lehto from comment #251)
Tested the following:

[✓] If both dpkg and rpm exists on a system, use the one most likely in use
   [✓] rpm unused (Ubuntu 20.10 + Alien)
   [✓] dpkg unused (RHEL 8 + Alien)

Systemtype should be fully tested now.
Comment 265 Niko Lehto cendio 2021-06-16 10:19:07 CEST
(In reply to Samuel Mannehed from comment #262)
> (In reply to William Sjöblom from comment #257)
> > Still, we seem to be leaving stdout of the subprocess open as we get the
> > following ResourceWarning in the terminal:
> > ...
> 
> This has been fixed now.

Verified the fix on Ubuntu 20.10. I could reproduce resource warning on a build before the fix, and this was gone in a build containing the changes.
Comment 267 Linn cendio 2021-06-17 09:41:29 CEST
Tested on Fedora 33 with build 2153. tl-setup ran without issues and the logs looks good, except for this deprecation warning: 
> Output (stderr):
>     thinlinc.te:154: Warning: mcs_killall() has been deprecated, please remove mcs_constrained() instead.
This deprecation warning is described in bug 5238. Closing.
Comment 277 Samuel Mannehed cendio 2021-08-06 17:23:43 CEST
Unfortunately commit 36780 introduced a crash in yum:

> 2021-08-06 11:14:20,141: Traceback (most recent call last):
> 2021-08-06 11:14:20,141:   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 303, in <module>
> 2021-08-06 11:14:20,141:     oOoO00 ( )
> 2021-08-06 11:14:20,141:   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 259, in oOoO00
> 2021-08-06 11:14:20,141:     I11 = O00oOoo0 . run ( )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 99, in run
> 2021-08-06 11:14:20,142:     return self . _run_text ( )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 117, in _run_text
> 2021-08-06 11:14:20,142:     Oo0O0o0oO000 ( )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/tlsetup/requirements.py", line 191, in <lambda>
> 2021-08-06 11:14:20,142:     i1ii1 )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 61, in generic_text_installer
> 2021-08-06 11:14:20,142:     o0O0ooOoo00o = IIii . run ( )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 444, in run
> 2021-08-06 11:14:20,142:     return self . _run ( IiIi1I11I )
> 2021-08-06 11:14:20,142:   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 532, in _run
> 2021-08-06 11:14:20,142:     O0ooooOoo0o0o = backend . install ( )
> 2021-08-06 11:14:20,143:   File "/opt/thinlinc/modules/thinlinc/packageinstaller/yumbackend.py", line 64, in install
> 2021-08-06 11:14:20,143:     callback = self . _yum_callback )
> 2021-08-06 11:14:20,143:   File "/opt/thinlinc/modules/thinlinc/packageinstaller/yumbackend.py", line 133, in I1i
> 2021-08-06 11:14:20,143:     OOO0o0O0o ( OO . decode ( locale . getpreferredencoding ( False ) ) )
> 2021-08-06 11:14:20,143:   File "/opt/thinlinc/modules/thinlinc/packageinstaller/yumbackend.py", line 89, in _yum_callback
> 2021-08-06 11:14:20,143:     ( Oooo0o0oO0 , I1i1i , O0ooooOoo0O , oo0oOo , oOOOo , iiIIIiII11I1 , oOo ) = line . split ( " " , 6 )
> 2021-08-06 11:14:20,143: ValueError: not enough values to unpack (expected 7, got 6)
Comment 278 Samuel Mannehed cendio 2021-08-06 17:25:09 CEST
The crash mentioned in comment #277 seems a bit random, have not found a reliable way to reproduce it yet, but it has happened roughly 5 times in my testing of the beta release on CentOS 7.

It has happened both when installing general requirements and when installing NFS dependencies.
Comment 283 Samuel Mannehed cendio 2021-08-11 15:53:05 CEST
When a user uses yum to install a package, but that package has required dependencies - yum will report that the requested package was "installed" or "updated" after all the required dependencies finished installing.

The event generated when yum reports "installed" or "updated" was what caused tl-setup to crash.
Comment 284 Samuel Mannehed cendio 2021-08-11 16:38:47 CEST
This has now been fixed by handling additional events from the yum API and also by making the code more robust in case there are cases we didn't consider.

A side effect of this fix is that we get more detailed info in tlsetup.log and in the GUI when installing packages on yum systems.

I have verified that the fixed version doesn't crash when installing or updating packages with dependencies - cases that did crash previously. I have checked both the text mode and the GUI mode. Tested on CentOS 7, RHEL 7 and Ubuntu 21.04.
Comment 285 William Sjöblom cendio 2021-08-12 12:43:36 CEST
I was able to reproduce the issue with Jenkins build #2244 on a non-updated CentOS 7 machine (created from our CentOS 7 VM template).
Comment 286 William Sjöblom cendio 2021-08-12 14:34:02 CEST
Tested running server Jenkins build #2245 running on a non-updated CentOS7 VM
created from our template and I no longer get the crash above.

On top of this shallow testing, I also tested the fallback (mentioned in
comment 284) for events we do not currently consider. This was done by
applying the following diff to
/usr/lib/python2.7/site-packages/yum/rpmtrans.py

> @@ -536,7 +536,7 @@
>                  self.display.event(name, 'repackaging',  bytes, total,
>                                  self.complete_actions, self.total_actions)
>              else:
> -                action = txmbr.output_state
> +                action = 666
>                  self.display.event(txmbr.po, action, bytes, total,
>                              self.complete_actions, self.total_actions)

With the patched rpmtrans.py, tl-setup still succeeds and logs a bunch of
"UnknownTransaction":s as expected. Closing.
Comment 287 Samuel Mannehed cendio 2021-08-16 17:05:16 CEST
Unfortunately tl-setup crashes on Debian 9 when trying to unlock apt after installing packages:

> Failure resolving packages.
> Traceback (most recent call last):
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 303, in <module>
>     oOoO00 ( )
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 259, in oOoO00
>     I11 = O00oOoo0 . run ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 99, in run
>     return self . _run_text ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 117, in _run_text
>     Oo0O0o0oO000 ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/python.py", line 59, in <lambda>
>     oO0 )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 61, in generic_text_installer
>     o0O0ooOoo00o = IIii . run ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/pkginsthelp.py", line 446, in run
>     IiIi1I11I . unlock ( ) 
>   File "/opt/thinlinc/modules/thinlinc/packageinstaller/aptbackend.py", line 157, in unlock
>     apt . apt_pkg . pkgsystem_unlock ( )
> apt_pkg.Error: E:Not locked

I can't reproduce the problem on Ubuntu 18.04 or Debian 10. The issue might be related to the python3-apt version, my Debian 10 machine has version 1.8.4.3 while my Debian 9 version has version 1.4.3.
Comment 289 Samuel Mannehed cendio 2021-08-16 17:28:15 CEST
It happens both in GUI and text mode.
Comment 292 Frida Flodin cendio 2021-08-17 12:52:23 CEST
I've made some minor changes to yumhelper. I have tested tl-setup and on a minimal RHEL 7 and everything is installed correctly. Tested build 2250.
Comment 295 William Sjöblom cendio 2021-08-18 15:53:16 CEST
I have tested the changes mentioned in comment 292 both with and without the diff in comment 286 applied on CentOS 7 and everything works as intended.

Thus, the changes introduced by Frida in comment 292 can be considered tested. Before marking this bug as closed, the changes for the problem described in comment 287 need to be finalized and tested.
Comment 296 Samuel Mannehed cendio 2021-08-19 09:24:59 CEST
A workaround for the bug in the old apt API (mentioned in comment 287) has been committed.
Comment 297 Linn cendio 2021-08-19 13:25:47 CEST
Tested server build 2252 on Debian 9, and tl-setup no longer crashes after installing packages (comment 287). Also tested the same server build on Debian 10 to make sure it still worked correctly. 

Debian 9:
  ✓ GUI mode
  ✓ Text mode

Debian 10:
  ✓ GUI mode
  ✓ Text mode
Comment 298 Samuel Mannehed cendio 2021-08-19 14:02:29 CEST
I did some more digging to figure out in the exact version that had fixed the python3-apt bug.

A simple python test program can be used to reproduce the issue:

 import apt
 c = apt.Cache()
 apt.apt_pkg.pkgsystem_lock()
 c.open()
 apt.apt_pkg.pkgsystem_unlock()

The unlock attempt will crash on python3-apt v1.4.3.

Using strace I found that the system lock in question is actually /var/lib/dpkg/lock. And then using `inotifywait -e close /var/lib/dpkg/lock` and gdb I could trace the unlocking to happen within apt.Cache.open() after a new "DepCache" is created. The DepCache object is created from C++ code.

The commit below is probably the one that fixes it, and if that's the case the first version including it was v1.7 of python-apt and libapt-pkg. I can't find any system where v1.7 is available to test however.

https://salsa.debian.org/apt-team/apt/-/commit/e02297b8e22dae04872fe6fab6dba966de65dbba

I'm not 100% sure however since I couldn't get my own build of libapt-pkg to include point to the source code files. But a lot of pkgCacheFile objects are thrown around in the python3-apt code and it's very likely the destructor is called somewhere here.

Since I'm not completely sure the issue is fixed in v1.7 I'll leave the comment in the code as is - saying that it's fixed inbetween v1.4.3 and v1.8.4.3.
Comment 299 Pierre Ossman cendio 2021-08-23 15:39:32 CEST
To clarify, the issue occurred when APT 1.7 cleaned up its locking to avoid some races, and python-apt were forced to adapt with these commits:

https://salsa.debian.org/apt-team/python-apt/-/commit/bb995b19276331ecf21e1695f8eca5ee10d15780
https://salsa.debian.org/apt-team/python-apt/-/commit/7064f1daabf3d9b30ae89561632b1307f9e147bd

which were included in python-apt 1.7.0.

However these fixes were also backported to the old 1.6 branch, used by Ubuntu 18.04:

https://salsa.debian.org/apt-team/python-apt/-/commit/e1dbad215ed448afcbf06435c73d55397ce839a6
https://salsa.debian.org/apt-team/python-apt/-/commit/867fa38e742413236ca70640b48483070024a5f1

And also to the old 1.1 branch, used by Ubuntu 16.04:

https://salsa.debian.org/apt-team/python-apt/-/commit/0107dc593d39312e2cc1817ad3a259139e270b48
https://salsa.debian.org/apt-team/python-apt/-/commit/c2ffb6c515cf7aa93d476b7f7b3bb41e9e55e9cd

However, for some reason the 1.4 branch was not fixed and unfortunately this is the branch used by Debian 9.
Comment 300 Pierre Ossman cendio 2021-08-25 14:21:34 CEST
The fix in r37346 is not entirely correct, but it works in practice so we'll leave it be until we can drop support for Debian 9.

The problem is that our current code is not compatible with how 1.4 is supposed to work, but because of a bug it happens to work anyway.

The incompatibility is that in the old way (that 1.4 still uses), you have to release the lock before calling Cache.commit(), as that function will also want to grab the lock. Failing to do so will dead lock. And releasing the lock is indeed how our code worked in 4.12.1 and earlier.

It happens to work anyway because of the bug mentioned in an earlier comment, where Cache.open() happens to release the lock even if it doesn't own it. And in our use case we should always end up calling Cache.open() before calling Cache.commit().


The only case this can cause an issue is if Debian 9 gets a fix for Cache.open(), which is unlikely since it is almost EOL, or if someone uses a so old apt that it doesn't have this bug, which is impossible since that is too ancient to run ThinLinc anyway.


For reference, here are relevant commits on how another python-apt caller has progressed over the years:

Drop the lock before calling Cache.commit():
https://github.com/mvo5/unattended-upgrades/commit/d4d3f6afdacb72ac290a5f414fd64691926d34ad

Stop unlocking if this is a newer APT:
https://github.com/mvo5/unattended-upgrades/commit/3dbdb853618064f22191a98c5bb5532c6d697ab4

Drop support for the older locking system:
https://github.com/mvo5/unattended-upgrades/commit/f4aa44dc77f6d3a7fee4176ca8d3311366ab6076

Note You need to log in before you can comment on or make changes to this bug.