Bug 7571 - Many of our Python programs run without using the system's locale
Summary: Many of our Python programs run without using the system's locale
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Other (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: 4.13.0
Assignee: Linn
URL:
Keywords: nikle_tester, prosaic
Depends on:
Blocks:
 
Reported: 2020-10-20 10:34 CEST by Linn
Modified: 2021-08-24 17:13 CEST (History)
4 users (show)

See Also:
Acceptance Criteria:


Attachments

Description Linn cendio 2020-10-20 10:34:56 CEST
When reading or writing to files, it is a widely used assumption that the files are encoded in the system's locale. When locale.setlocale() is not being used, Python can't know which locale the system uses.
Comment 1 Linn cendio 2020-10-20 10:41:55 CEST
When running this command, Python's locale will be set to match the system's locale:

> locale.setlocale(locale.LC_ALL, "")

Since it is a widely used practice to assume file encodings match system locale, we have decided that all our Python programs should run the above command.
Comment 2 Pierre Ossman cendio 2020-10-20 10:54:20 CEST
It's not just files, it's basically every string API that's been around for a long while. E.g. user lookups, authentication.

So also note other issues around locale (e.g. bug 5102 and bug 4512).
Comment 9 Samuel Mannehed cendio 2020-11-04 20:04:17 CET
Tested most of the programs mentioned above on a RHEL 8 system. The ones I didn't test was:

* tl-serial-redir - due to bug 6048
* tl-crossover-drives - it was removed in bug 7573
* tl-mount-cifs - due to bug 4614
* tl-umount-all-cifs - due to bug 4614
* tl-ldap-certalias - due to bug 7330

I discovered bug 7581 when testing tl-limit-printers.
Comment 10 Frida Flodin cendio 2020-11-09 10:17:05 CET
The printing backends thinlocal/nearest gives traceback on Ubuntu20.04

I get this in /var/log/cups/error_log for both thinlocal and nearest:

> D [09/Nov/2020:09:06:33 +0000] [Job 33] Traceback (most recent call last):
> D [09/Nov/2020:09:06:33 +0000] [Job 33] File \"/usr/lib/cups/backend/thinlocal\", line 190, in <module>
> D [09/Nov/2020:09:06:33 +0000] [Job 33] sys . exit ( Iiii111Ii11I1 ( ) )
> D [09/Nov/2020:09:06:33 +0000] [Job 33] File \"/usr/lib/cups/backend/thinlocal\", line 26, in Iiii111Ii11I1
> D [09/Nov/2020:09:06:33 +0000] [Job 33] locale . setlocale ( locale . LC_ALL , \"\" )
> D [09/Nov/2020:09:06:33 +0000] [Job 33] File \"/usr/lib/python2.7/locale.py\", line 581, in setlocale
> D [09/Nov/2020:09:06:33 +0000] [Job 33] return _setlocale(category, locale)
> D [09/Nov/2020:09:06:33 +0000] [Job 33] locale.Error: unsupported locale setting
> D [09/Nov/2020:09:06:33 +0000] [Job 33] PID 4993 (/usr/lib/cups/backend/thinlocal) stopped with status 1.
> I [09/Nov/2020:09:06:33 +0000] [Job 33] Backend returned status 1 (failed)


After some investigations I find that the locale they try to set is ANSI_X3.4-1968, which seems to not be available by default on Ubuntu20.04.

When running 'locale -a' I only get:
> C
> C.UTF-8
> POSIX
Comment 11 Frida Flodin cendio 2020-11-10 13:58:37 CET
It seems like CUPS changes the LANG environment variable depending on the environment of the job-starter. Not only that, CUPS also modifies the LANG variable to always be something like <language code>.UTF-8. What I did to confirm this:

* Modified /opt/thinlinc/share/cups/backend/thinlocal to print os.environ['LANG'] to stderr before setlocale.

* Activated my changes and restarted CUPS:
> sudo /opt/thinlinc/libexec/add_thinlocal_printer.sh

* Started a printer job:
> LANG='sv_SE.ISO-8859-1' lpr -P thinlocal -l test_pdf.pdf

* Checked /var/log/cups/error_log and LANG is printed as 'sv_SE.UTF-8'

* This locale is not available on my system and the printing job is not proceeded.



We probably do not need to set locale in thinlocal/nearest. As a solution we should try to setlocale and if we fail we log it as a warning and continue printing anyway.

In general we need to handle unsupported locales elsewhere in the product, see bug 4512.
Comment 14 Niko Lehto cendio 2020-11-13 10:53:53 CET
Tested mostly everything described in comment #6 with nightly build on Ubuntu 20.04. This system was using language sv_SE.UTF-8. Also skipped some testing as described in comment #9:
> * tl-serial-redir - due to bug 6048
> * tl-crossover-drives - it was removed in bug 7573
> * tl-mount-cifs - due to bug 4614
> * tl-umount-all-cifs - due to bug 4614
> * tl-ldap-certalias - due to bug 7330
Had to test tl-ssh-askpass.py on Rhel 8 due Bug 7508.

Also tested the bug described in comment #11 before and after the fix (r35654)
with language set to sv_SE (non UTF-8). Works good.
Comment 15 Samuel Mannehed cendio 2021-08-10 16:38:58 CEST
The fixes made on this bug cause our installer and tl-setup to crash on systems where the locale you're trying to use isn't installed. The crash is the one described in bug 4512, however prior to the changes here, it never happened in the installer or tl-setup.

When you connect using SSH, the default behavior is to bring along the environment variables for the locale. This means it can be quite common to encounter the above scenario when using ssh to connect to a newly installed machine to install the ThinLinc server.

Both a new minimal install of CentOS 8 Stream, and a new minimal install of Alma Linux 8 come without most langpacks installed. Only 'en_US.utf8' is installed by default. Minimal installs of modern Fedoras and Ubuntus are similar.

This is in contrast to a minimal install of CentOS 7 where you get ~700 locales installed by default.

Crashing the first thing we do in the installer will give a bad first impression for people aiming to test ThinLinc. This must be fixed.
Comment 18 Samuel Mannehed cendio 2021-08-12 17:41:07 CEST
The installer, tl-setup, selinux/install and hivetool now gracefully handle the setlocale()-error. This way we should probably not ruin the first impression for a new customer at least.

Out of the 4 programs above, all but selinux/install will print a warning:

> Warning: Can't set the locale (unsupported locale setting); make sure $LC_* and $LANG are correct

The warning includes the error reported by Python to account for future changes on the subject.

Fixing this on all other Python programs will be deferred to bug 4512.

The fixes was tested on a minimal CentOS 7 install (which did not have the locale-problem), and a minimal CentOS 8 Stream (which did have the problem).
Comment 20 Samuel Mannehed cendio 2021-08-18 10:40:51 CEST
This can both be tested through ssh'ing to a system where a locale you're using isn't available, or you can also simply run programs like this:

 LANG=somethingthatdoesntexist tl-4.13.0-server/install-server
Comment 21 Linn cendio 2021-08-18 15:03:02 CEST
Tested server build 2250 on RHEL 8 by ssh:ing from a machine with locale sv_SE.UTF-8, where the locale was not available on the server machine. 

Can confirm that the installer, tl-setup, selinux/install and hivetool no longer crashes, and that all programs except selinux/install outputs a warning.
Comment 22 William Sjöblom cendio 2021-08-23 17:29:07 CEST
A comment on the 4.13.0beta announcement
(https://community.thinlinc.com/t/thinlinc-4-13-0-beta/213/13) highlighted a
problem with the fix described in comment 18.

When running tl-setup with Python 3.6 or lower with an invalid locale, we
will crash during the final step (which breaks the purpose of "gracefully"
handling an invalid locale with a warning). This happens because `systemctl
enable <service>` assumes our invalid locale to use UTF-8 encoding and
decides on using the UTF-8 glyph `→` for pretty printing the symbolic link
created. Since Python 3.6 and below falls back on the `C` locale, and thus
ASCII encoding, when it is given an invalid encoding we get a
`UnicodeDecodeError` with the following stack trace:

> Traceback (most recent call last):
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 308, in <module>
>     oOoO00 ( )
>   File "/opt/thinlinc/sbin/../libexec/tl-setup.py", line 264, in oOoO00
>     I11 = O00oOoo0 . run ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 99, in run
>     return self . _run_text ( )
>   File "/opt/thinlinc/modules/thinlinc/wizard.py", line 117, in _run_text
>     Oo0O0o0oO000 ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/services.py", line 269, in i111II
>     ii1ii = iI1i ( )
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/services.py", line 203, in iI1i
>     if not o0 ( "vsmserver" ) :
>   File "/opt/thinlinc/modules/thinlinc/tlsetup/services.py", line 62, in o0
>     ( i1111IIi , oOo00O ) = I1i . communicate ( )
>   File "/usr/lib64/python3.6/subprocess.py", line 863, in communicate
>     stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/usr/lib64/python3.6/subprocess.py", line 1578, in _communicate
>     self.stderr.errors)
>   File "/usr/lib64/python3.6/subprocess.py", line 760, in _translate_newlines
>     data = data.decode(encoding, errors)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 78: ordinal not in range(128)

What is required for the scenario to happen:
- Python 3.6 or lower (where we do not default to UTF-8 in case of a broken
  locale)
- `LC_*` or `LANG` set to a locale that does not exist on the system
  - Potentially through ssh where the ssh server accepts a broken locale from
    the client (happens on RHEL8 but does not happen Ubuntu 20.04).
  - Potentially through a misconfiguration of the system locale.
- Launching tl-setup through `sudo` magically seems to fix the issue, so it
  seems like we need to start tl-setup from a root shell (with a broken
  locale).

So, to get a better grasp on the severeness of the issue we need to check:
- What mechanisms and versions in the ssh server are responsible for cleaning
  up a broken locale sent from the client?
- Why does launching tl-setup through `sudo` seem to magically resolve the
  issue?

The bug can be reproduced on Fedora 34 by:

> $ sudo systemctl disable vsmserver
> $ sudo dnf install python3.6
> /install the latest rc of thinlinc without running tl-setup/
> /replacing the line `exec -a "$0" /usr/bin/python3 "$@"` with `exec -a "$0" /usr/bin/python3.6 "$@"` in /opt/thinlinc/libexec/python3/
> $ sudo -s
> # LC_ALL="broken" /opt/thinlinc/sbin/tl-setup
Comment 24 Frida Flodin cendio 2021-08-24 15:42:39 CEST
The issue in comment #22 should be fixed now. Tested on Fedora 34 as described in comment #22 and on a normal RHEL 8. When there is no issue with locale the log should look like before, else we replace the characters that can't be decoded with a replacement character.
Comment 25 Linn cendio 2021-08-24 17:10:44 CEST
Tested on RHEL 8 with server build 2253, and can confirm that tl-setup no longer crashes for the scenario described in comment 22. I used the command below for testing:
> LC_ALL=xyz ./tl-4.13.0-server/install-server

As a side note, I could not reproduce the error on SLES 15. It seems to be because of systemd writing slightly different messages when creating a symlink on the different versions ('path1 → path2' vs 'path1 to path2').

Message on RHEL 8, systemd version 239 (using '→'):
> 2021-08-24 14:30:12,525:     Created symlink /etc/systemd/system/multi-user.target.wants/vsmserver.service → /usr/lib/systemd/system/vsmserver.service.

Message on SLES 12, systemd version 228 (using 'to'):
> 2021-08-24 15:37:14,489:     Created symlink from /etc/systemd/system/multi-user.target.wants/vsmserver.service to /usr/lib/systemd/system/vsmserver.service.

Note You need to log in before you can comment on or make changes to this bug.