Bug 7604 - Symlink homedir gives wrong apparent current working directory
Status: NEW
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VSM Agent
Version: trunk
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: LowPrio
Assignee: Bugzilla mail exporter
Reported: 2020-12-08 16:12 CET by Samuel Mannehed
Modified: 2021-03-09 13:51 CET

Description Samuel Mannehed cendio 2020-12-08 16:12:58 CET
Let's say you have a user with a homedir that is a symlink like this:

 $ eval echo ~tester
 /local/home/tester

 $ ls -l /local/home/
 lrwxrwxrwx. 1 root   root      7  8 dec 13.18 tester -> /tester

This user will get the real path as working directory when logging in to ThinLinc, instead of the symlink path.

When logging in to a graphical session locally, or with a simple text login (like SSH) we always get the symlink path.

Comparing the 'cwd' symlink in /proc/<PID> shows that it's always set to the real path, regardless of whether it's a local or a ThinLinc session.

However, when comparing $PWD in /proc/<PID>/environ we can see that it's different:

 * ThinLinc: Starting with the xsession process, $PWD is set to the real path
 * Local: Starting with the gnome-session process, $PWD is set to the symlink path
 * SSH: Any child of the bash process shows $PWD set to the symlink path
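
A minimal sketch for repeating this comparison on a given process, assuming a Linux /proc filesystem and permission to read the target's environ file (same user or root). The helper names are made up for illustration. Note that /proc/<PID>/environ reflects the environment the process was started with.

```python
import os


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE records of /proc/<PID>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        key, sep, value = record.partition(b"=")
        if sep:  # skip empty or malformed records
            env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def compare_cwd_and_pwd(pid="self"):
    """Return (kernel working directory, PWD from the start-up environment)."""
    cwd = os.readlink(f"/proc/{pid}/cwd")
    with open(f"/proc/{pid}/environ", "rb") as fh:
        pwd = parse_environ(fh.read()).get("PWD")
    return cwd, pwd
```

Calling compare_cwd_and_pwd(<PID>) for the xsession, gnome-session or bash child process should show the two values side by side.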
Comment 1 Samuel Mannehed cendio 2020-12-08 16:18:13 CET
One reason you could have this sort of setup with symlinked-homedirs is if users have $HOME from different file systems but you want to obscure that by gathering them all under /home.
Comment 3 Samuel Mannehed cendio 2020-12-08 16:39:11 CET
It seems that the working directory for a process (/proc/<PID>/cwd) is always the real path; it never shows any symlinks.

This is true for almost all programs. They show the real path, not the "lie" that is the symlink:

 # cd /local/home/tester
 # /usr/bin/pwd
 /tester
 # ls -l /proc/self/cwd
 lrwxrwxrwx. 1 root root 0  8 dec 16.28 /proc/self/cwd -> /tester

 $ python
 Python 3.9.0 (default, Oct  6 2020, 00:00:00) 
 [GCC 10.2.1 20200826 (Red Hat 10.2.1-3)] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> os.getcwd()
 '/tester'



However, shells seem to behave differently, as for example the bash-builtin 'pwd' shows:

 $ type pwd
 pwd is a shell builtin
 $ cd /local/home/tester
 $ pwd
 /local/home/tester

As you can see, this is different from the output from '/usr/bin/pwd'. The bash-builtin 'pwd' gives the same result as we can see in the environment variable $PWD.
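
The difference can be sketched in a few lines: a program that wants the shell-style "logical" path has to trust $PWD, but only while it still names the current directory. This is a simplified illustration with a hypothetical function name, not the code of bash or /usr/bin/pwd (a POSIX `pwd -L` additionally rejects values containing "." or ".." components).

```python
import os


def logical_cwd():
    """Return $PWD if it still names the current directory, else os.getcwd()."""
    pwd = os.environ.get("PWD")
    if pwd and os.path.isabs(pwd):
        try:
            # samefile() compares device and inode, so a symlink path that
            # leads to the current directory is accepted as-is.
            if os.path.samefile(pwd, "."):
                return pwd
        except OSError:
            pass  # PWD names something that does not exist or is unreadable
    # Fall back to the physical path reported by the kernel.
    return os.getcwd()
```

With the setup above and PWD=/local/home/tester, logical_cwd() would return the symlink path while os.getcwd() keeps returning /tester.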

The mystery that remains is how the shells manage to keep this "lie" going during local logins but not in ThinLinc.
Comment 4 Samuel Mannehed cendio 2020-12-09 08:55:58 CET
No practical problems aside from user confusion are known at the moment.
Comment 5 Samuel Mannehed cendio 2021-01-18 16:57:33 CET
Interestingly enough it seems the definition of the PWD environment variable has changed over the years:

https://pubs.opengroup.org/onlinepubs/009604599/utilities/xcu_chap02.html#tag_02_05_03

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03

The first link seems to be from 2004, and the second link from 2018. Previously symlinks were not allowed in PWD, but they are now.
Comment 6 Samuel Mannehed cendio 2021-01-18 16:59:01 CET
Since the interesting item here is the PWD "lie", I looked further into how bash sets that environment variable. There are three different scenarios:

 * It's already set

 * If an "interactive login shell" and $HOME != /proc/self/cwd => $PWD = $HOME

   Note that cwd is always the real path. Also note that $HOME is whatever
   is configured for that user, i.e. in this scenario $HOME would "lie" and
   not contain the real path.

 * Otherwise $PWD is set to the result of POSIX "getcwd()"

   Note that "getcwd()" will always give us the "truth" and contain the
   real path.
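
The three scenarios can be sketched as a small decision function. This is a simplified model of the description above, not bash's actual code. The function and parameter names are hypothetical, and two details are assumptions: an inherited PWD is only kept if it is absolute and names the current directory, and the $HOME case applies when $HOME leads to the current directory even though the path strings differ.

```python
import os


def _same_directory(a, b):
    """True if both paths lead to the same directory (symlinks followed)."""
    try:
        return os.path.samefile(a, b)
    except OSError:
        return False


def initial_pwd(env, physical_cwd, interactive_login, same_dir=_same_directory):
    """Model which value a starting shell would pick for PWD."""
    # Scenario 1: PWD is already set and still valid, so keep it untouched.
    inherited = env.get("PWD")
    if inherited and inherited.startswith("/") and same_dir(inherited, physical_cwd):
        return inherited
    # Scenario 2: interactive login shell started in the home directory.
    home = env.get("HOME")
    if interactive_login and home and same_dir(home, physical_cwd):
        return home
    # Scenario 3: fall back to the real path, as getcwd() reports it.
    return physical_cwd
```

In this model an SSH login (interactive_login=True, no PWD inherited) yields the symlinked $HOME, while a non-login start yields the real path.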

If bash is what sets PWD, the information above suggests that bash does not identify us as an "interactive login shell".
Comment 8 Samuel Mannehed cendio 2021-01-26 11:36:02 CET
Logging in normally using SSH (ssh user@host) starts the user's shell, as defined in /etc/passwd, with a leading "-" added to its argv[0]. This means bash will identify the shell as an "interactive login shell". This in turn causes the code in bash mentioned above to set PWD to $HOME. The code path was verified using the attached gdb script.

However, logging in using bash as a command in SSH (ssh user@host bash) did not behave as I initially expected. We get the real path, not the symlink.

Debugging using gdb shows that SSH still starts the user's shell in this case, and that this bash then execs whatever command was specified. Bash therefore runs bash again. In the first bash call no "-" is passed, because a command is present. That means the first bash call isn't considered a login shell and sets PWD to the result of "getcwd()", i.e. the real path. As seen in comment #6, PWD is left untouched if already set, so the second bash keeps the real path.
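
The two SSH cases can be modelled roughly like this. It is a simplified sketch of the behavior described above, not the OpenSSH code, and the function names are hypothetical. It assumes the login-shell marker is the leading "-" in argv[0]. bash also accepts -l/--login, which this sketch ignores.

```python
import os


def sshd_shell_argv(shell, command=None):
    """Model the argument vector used to start the user's shell."""
    name = os.path.basename(shell)
    if command is None:
        # Plain login: the shell name is prefixed with "-".
        return ["-" + name]
    # Command given: the shell runs it via -c, without the "-" prefix.
    return [name, "-c", command]


def looks_like_login_shell(argv):
    """A shell treats a leading "-" in argv[0] as 'I am a login shell'."""
    return argv[0].startswith("-")
```

For "ssh user@host" this gives ["-bash"], a login shell. For "ssh user@host bash" it gives ["bash", "-c", "bash"], which is not a login shell, so the first bash falls through to getcwd().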

Bash code:

https://github.com/bminor/bash/blob/76404c85d492c001f59f2644074333ffb7608532/variables.c#L886

SSH code:

https://github.com/openssh/openssh-portable/blob/279261e1ea8150c7c64ab5fe7cb4a4ea17acbb29/session.c#L1708
Comment 9 Samuel Mannehed cendio 2021-01-29 16:45:49 CET
We had a theory that the changes made in bug 5099 for ThinLinc 4.7.0 were the cause of these issues. However, it seems it might not be that easy.

Currently in 4.12.0 the user's shell is started like this:

 /bin/bash -c exec -l "$SHELL" -c "/opt/thinlinc/etc/xsession"

Before ThinLinc 4.7.0 we started it like this:

 /bin/bash --login /opt/thinlinc/etc/xsession

Reverting that change in the SessionStart module doesn't fix the issue. We still incorrectly get the real path when looking at $PWD. One difference was identified however:

 * Using the nightly builds when looking at /proc/<PID>/environ for the
   eventual processes that are part of a ThinLinc session we can see that,
   starting with the xsession process, PWD is set. Not before that point.
   That indicates that PWD is set before xsession is executed.

 * Using the code from before 4.7.0 we can see that PWD is __NOT__ set when
   xsession starts. It does however get set somewhere else along the way,
   because its children do have PWD set to the real path.
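
To find the point in the process tree where PWD first appears, one can walk from a session process up through its ancestors. A minimal sketch, assuming Linux /proc and read access to each process's environ file; the helper names are hypothetical. Note that it cannot tell "PWD unset" apart from "environ not readable": both are reported as None.

```python
import os


def parse_ppid(status_text):
    """Extract the parent PID from the text of /proc/<PID>/status."""
    for line in status_text.splitlines():
        if line.startswith("PPid:"):
            return int(line.split()[1])
    raise ValueError("no PPid line found")


def pwd_of(pid):
    """PWD from a process's start-up environment, or None."""
    try:
        with open(f"/proc/{pid}/environ", "rb") as fh:
            records = fh.read().split(b"\0")
    except OSError:
        return None
    for record in records:
        if record.startswith(b"PWD="):
            return os.fsdecode(record[4:])
    return None


def ancestry(pid):
    """Yield (pid, command name, PWD) for pid and each of its ancestors."""
    while pid > 0:
        try:
            with open(f"/proc/{pid}/comm") as fh:
                name = fh.read().strip()
            with open(f"/proc/{pid}/status") as fh:
                parent = parse_ppid(fh.read())
        except OSError:
            return  # process vanished or is not accessible
        yield pid, name, pwd_of(pid)
        pid = parent
```

Printing list(ancestry(<PID>)) for a process deep inside the session lists the chain up towards PID 1, so the first ancestor without PWD is easy to spot.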

I need to look further into why this behaves differently. Other changes made in bug 5099 are also not investigated so far.
Comment 10 Samuel Mannehed cendio 2021-02-01 09:09:29 CET
Debugging with gdb shows that with both the new approach and the pre-4.7.0 approach we hit the 3rd case described in comment #6. The shell is not considered an "interactive shell" by the bash code.

The process with the current code is like this:

1. vsmagent forks to tl-session
2. tl-session forks to tl-xinit
3. tl-xinit forks to multiple processes, among them is bash
4. bash is called like this '/bin/bash -c exec -l "$SHELL" -c "/opt/thinlinc/etc/xsession"'
5. that bash process will end up in the code described in comment #6 and set PWD = real path
6. the subsequent exec to $SHELL will inherit the environment and will, if set to bash, then see that PWD is already set.

This means it doesn't really matter why the change propagates to the child processes a step later with the pre-4.7.0 approach. Due to how bash identifies our call we won't get the same behavior as seen with SSH.
Comment 11 Samuel Mannehed cendio 2021-02-01 09:10:46 CET
Adding -l and -i to the call to bash in SessionStart does seem to fix things. We end up with PWD = $HOME and get the symlink path instead of the real path, just like with SSH and local logins. However, I currently don't know what other effects that has. One downside is that it probably differs from how GDM does things. Needs more investigation.
Comment 14 Samuel Mannehed cendio 2021-02-04 09:08:08 CET
Debugging GDM with gdb shows that PWD is already set before the very first time bash is called; GDM sets PWD itself.

I found this commit:

https://gitlab.gnome.org/GNOME/gdm/-/commit/10fbdd480bd24b207697934081ef307b641f65f6

This code sets PWD to the same value as the environment variable HOME (which does include symlinks instead of the canonical path). I can verify that this code is used early in the session startup process of GDM on RHEL8. Once the process reaches bash, bash will find that PWD is already set, and not change it.
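
The same idea can be sketched outside GDM: a launcher exports PWD with the same value as HOME before starting the session command, so that the first shell finds PWD already set. This is an illustration only, not the GDM code and not how ThinLinc starts sessions. The function names are hypothetical.

```python
import os
import subprocess


def session_environment(base_env, home):
    """Copy base_env and set PWD to the (possibly symlinked) home path."""
    env = dict(base_env)
    env["HOME"] = home
    env["PWD"] = home
    return env


def run_in_home(argv, home, base_env=None):
    """Run argv with the home directory as cwd and PWD preset to match."""
    source = os.environ if base_env is None else base_env
    env = session_environment(source, home)
    return subprocess.run(argv, cwd=home, env=env,
                          capture_output=True, text=True, check=True)
```

A shell started through run_in_home() should then report the symlink path from its builtin pwd, since it finds a valid PWD in its environment and keeps it.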

It is noteworthy however, that this commit is only ~2 years old. It would be interesting to see how things behaved prior to that.
Comment 15 Samuel Mannehed cendio 2021-02-04 13:11:37 CET
I have now tested the behavior of symlink-homedirs in a few additional scenarios:

 * RHEL 7 with GDM 3.28
 * SLES12 with GDM 3.10
 * Ubuntu 20.04 with GDM 3.36
 * Ubuntu 20.04 with lightdm 1.30

With GDM 3.28 and 3.36 I can see that the workaround mentioned in comment #14 is at work. That means we see the symlink path instead of the real path.

With lightdm and older GDM (3.10) a local graphical login behaves the same way as ThinLinc does. That means we see the real path instead of the symlink.

The distro doesn't seem to matter.
Comment 16 Samuel Mannehed cendio 2021-02-04 14:25:26 CET
This link is relevant as it describes the relationship between current working directory and PWD:

https://www.gnu.org/software/libc/manual/html_node/Working-Directory.html
Comment 17 Samuel Mannehed cendio 2021-02-04 15:21:07 CET
It seems KDE's SDDM has essentially had code for this since day one:

https://github.com/sddm/sddm/blob/c78da43cf44a1090d8aa1d5ed4a779a3ec42c504/src/helper/Backend.cpp#L67

The consensus here is that it isn't obvious what the correct behavior is. I have asked the GDM people if they can provide some additional info on the subject:

https://gitlab.gnome.org/GNOME/gdm/-/merge_requests/4
Comment 19 Samuel Mannehed cendio 2021-02-11 11:26:31 CET
We're not getting any replies from GDM anymore, which suggests that there is no documentation stating that display managers should be setting PWD.

The initial problem that brought this behavior into GDM was the same one we have heard about: preventing confusion for the end user:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=179814

My guess is that SDDM looked at how GDM does things and mimicked it; the SDDM commit gives no motivation for the behavior:

https://github.com/sddm/sddm/commit/dba8027899e38e58a5672e25290a4d6dfc137dbf

Looking at the current state of things, I'd say the best path forward would be to, similar to how GDM does it, set PWD ourselves in ThinLinc.
