Summary: | Command line tool for terminating and abandoning sessions | ||
---|---|---|---|
Product: | ThinLinc | Reporter: | Linn <linma> |
Component: | Other | Assignee: | William Sjöblom <wilsj> |
Status: | CLOSED FIXED | ||
Severity: | Normal | CC: | tobfa, wilsj |
Priority: | P2 | Keywords: | relnotes, tobfa_tester |
Version: | trunk | ||
Target Milestone: | 4.15.0 | ||
Hardware: | PC | ||
OS: | Unknown | ||
See Also: |
https://www.cendio.com/bugzilla/show_bug.cgi?id=425 https://www.cendio.com/bugzilla/show_bug.cgi?id=7834 |
||
Acceptance Criteria: |
1. Based on the output from `tlctl session list', it should be easy to interactively terminate or abandon specific sesssion(s).
2. The user should be prompted for confirmation before terminating or abandoning session(s).
3. Killing and abandoning sessions should be possible when running non-interactively.
4. The user should be given a helpful error message if the termination or abandoning operation was not successful.
5. Accidentally abandoning session(s) instead of terminating them should be made difficult.
6. Common termination tasks should be made easy.
|
||
Bug Depends on: | 7855 | ||
Bug Blocks: | 3707 |
Description
Linn
Here are a bunch of user stories that I threw together for the subject matter: • As a system administrator, when contacted by a ThinLinc end-user having problems with their session, I want to be able to kill that session remotely so that I won't have to guide an inexperienced user to perform this themselves. • As a system administrator that has permanently lost an agent node in the cluster, I want to be able to abandon the sessions that used to live on that agent node so that the users with session(s) on the lost agent are not prompted the next time they connect. • As a system administrator, I want to be able to kill sessions that has been inactive for a time X, so that we do not run into licensing limits. (issue25522) • As a system administrator, I want to be prompted for confirmation when performing destructive operations so that I feel less anxious to accidentally press <RETURN>. One thing that we have to decide is how we specify a session to terminate on the command line. Such an identifier need to have (at least) the following properties: - No (or negligible) risk of collisions - Easy to type - Relatively compact such that it can be printed when listing sessions For webadm we currently use the string `<agent_fqdn>:<display>` in combination with the username for uniquely identifying sessions in the session database. Ideally, I'd like something more compact but we could use something similar as a starting point. Another matter worth considering is how we handle the case where the user specifies multiple sessions to be terminated:
> tlctl session terminate <id-1> <id-2>
Where the session <id-1> perfectly resolves to a session in the database, but <id-2> is invalid.
Do we behave like `rm`, `kill`, et al. and terminate <id-1> while printing an error for <id-2> to stderr, or do we exit early without touching <id-1>?
An initial version has now been committed. This initial version identifies individual sessions as follows: `tlctl session terminate <termserv_hostname>:<username>:<display>' This is more or less the same approach as the one used in webadm where sessions are looked up by the username in combination with the string `<termserv_hostname>:<display>'. As usernames (and IPv6 addresses for that matter) can contain `:', this is maybe not the best approach. A conscious choice was to avoid parsing each identifier (which negatively impacts run time, but rids us of potential `:' parsing difficulties). Nonetheless, we are still in O(n) territory with respect to the number of sessions, so this approach sounds fine in this regard. Regardless, we should make some quantitative measurements to support this claim if we decide to keep the current implementation. What I have identified that remains to be addressed: • Interactive prompt when terminating/abandoning sessions • Tidy up error handling/messages • Investigating other potentially better approaches to session identifiers • Investigate the `get_sessions' and `kill_session' race. • This also includes investigating how we can determine if a session is can be terminated (or can only be abandoned) since `sessinfo['status']' may not be up to date. • Decide on how we should differentiate killing and abandoning sessions • Do we use different subcommands? • Do we use an `abandon' flag to the `terminate' subcommand? Another user story that popped up: As a system administrator, I want to be able to terminate all sessions in the entire cluster so that I can perform system maintenance that would interfere with active sessions. I encountered bug 7855 which breaks username-constrained terminating of sessions. We will need to fix this bug first unless we want to introduce nasty workarounds. I now consider the `tlctl session terminate' subcommand ready for testing. I have tested the following functionality myself using server build #2503 running on RHEL8: • [5/5] Terminate/Abandon • ☑ Constrained by display • ☑ Constrained by username • ☑ Constrained by agent (partially, see [1]) • ☑ Individual session • ☑ All sessions Some things worth considering for the tester: • Is the overall user experience good enough? • [1] Do things work as expected in clusters consisting of more than one agent (potentially with varying up/down status)? • And, as always, are the acceptance criteria fulfilled, and are there any other flaws that I haven't considered? At one point there will also be some form of argument validation added. So let's wait until then before marking this bug as resolved. (In reply to William Sjöblom from comment #6) > An initial version has now been committed. This initial version > identifies individual sessions as follows: > > `tlctl session terminate <termserv_hostname>:<username>:<display>' Worth noting is that this has changed substantially since the initial version was committed. Instead of using session identifiers we now use the same constraint/filtering flags as `tlctl session list' for selecting which sessions are to be terminated. This was done to avoid the rather daunting task of properly implementing session identifiers. We now validate input usernames to see if they exist when 0 sessions are found. Tested this and the warning about input not being a valid user is correctly printed. This was the last thing to be checked, setting to resolved. Bug 425 handling the listing of sessions now has a subcluster flag used for listing all sessions in a subcluster. We should add the same flag to this subcommand to keep the symmetry between list and terminate. The texts in --help and the TAG should also be updated. Tested using the subcluster flag to terminate sessions in a subcluster, works well. Also checked the texts in --help and man page/TAG, looks ok. While testing bug 425 I found a couple of issues in code that is shared between `tlctl session list' and `terminate'. So for the future tester, have a look at bug 425, comment 78. Test specs ========== Server build thinlinc-server-4.14.0post-2538.x86_64 installed on RHEL8. Acceptance criterias ==================== 1. Based on the output from `tlctl session list', it should be easy to interactively terminate or abandon specific session(s). The list module prints users, displays, agents, status, and age. In the termination module, one has the option to terminate by users, displays, agents, and subclusters. * Very wide column elements are truncated, much like other tlctl submodules that truncate the output, rendering copying of such identifiers to the command line impossible (in this case user and agent names are in trouble). This issue is dealt with in Bug 7848 which proposes a script-friendly version of the output. * It feels like option to terminate sessions >AGE is missing, based on age information from session list. * Subcluster info missing from session list, forcing the admin to look at the web ui to acquire subcluster population. Also no subcluster info in termination summary. * Unable to parse multiple arguments, for instance --user USER1 USER2 or "USER1 USER2". Does not return any misuse information. Touched upon in comment #4. On the other hand, user stories does not convey a high priority. 2. The user should be prompted for confirmation before terminating or abandoning session(s). The user is prompted confirmation with default being No. 3. Killing and abandoning sessions should be possible when running non-interactively. The -y (--assume-yes) flag parses a yes-answer to all prompts, enabling non-interactive scripted termination. 4. The user should be given a helpful error message if the termination or abandoning operation was not successful. When sessions are unreachable, termination is not conducted, but references --abandon instead, although not calling it without the admin explicitly doing so. 5. Accidentally abandoning session(s) instead of terminating them should be made difficult. Abandoning sessions translates into termination if they are reachable, although no information as to why the redirection took place is given. However, termination call does not conversely translate into abandon if sessions are unreachable, but must be called explicitly. 6. Common termination tasks should be made easy. Simple termination tasks based on identifiers agent, user, display, and subcluster is perfectly possible. Although filtered multiple termination is not implemented, for instance --user="USER1 USER2" Summary of suggestions ====================== 1) User input case insensitivity. 2) Wording in Optionparser help page: --all terminate all sessions on all agents without constraints -> --all terminate all sessions --agent=AGENT constrain to sessions residing on AGENT -> --agent=AGENT limit to sessions residing on AGENT The one mentioned for agent applies to user, display, and subcluster. 3) Multiple arguments as in --user="USER1 USER2". 4) Option to terminate by session age. 5) Subcluster information -- although not primarily an issue for session terminate but rather for session list, in that no subcluster information is found so that termination limited to subcluster becomes impractical. The session termination summary could include a column SUBCLUSTER. 6) Abandon converted to standard termination -- although clearly printed at the end of the termination summary what is about to happen combined with the yes/no prompt, perhaps a short message as to why not abandoning of sessions was redirected to termination. Potentially too much information and the admin is expected to be aware of this. (In reply to Tobias from comment #55) 1. User input case insensitivity. The place where this would be reasonable would be for the `--agent' option as all other filter options are either case sensitive or numeric. Since we do this filtering in the session store on the server-side I feel a bit cautious about such a change as it has the potential to require quite a lot of testing. I'll take a look and see if it looks viable. 2. Wording in Optionparser help page Regarding the `--all' description, I wanted to make it absolutely clear that this is a potentially very destructive flag that should be used with caution. Did you see any other issues with this description other than its wordiness? I think "constraint" is the most adequate wording here. A "constraint" generally refers to something that is externally enforced (as is the case here), while a "limitation" tends to refer to something that is less externally enforced and more of a natural feature of the system. 3. Multiple arguments as in –user="USER1 USER2". This could be a possible extension, potentially using `action="append"' in the `add_option' call. However, such a syntax could potentially cause some confusion: > ┌──── > │ $ tlctl session terminate --user=foo --subcluster=baz --user=bar > └──── What I'm hinting at here is that it is potentially not trivial to understand that the above would expand to the following boolean query: `(user=foo OR user=bar) AND (subcluster=baz)' instead of `(user=foo AND subcluster=baz AND user=bar)' This may be a drawback of this approach. Another less confusing approach may be to use a delimited string containing multiple usernames. But in that case, we run into the issue of finding a suitable delimiter since Unix usernames can, at least in theory, contain basically anything. Still, as mentioned, we currently have no clear user stories backing this feature. 4. Option to terminate by session age. We are yet to find a user story backing this use case. On top of this it also introduces some parsing effort, so unless there is a clear and common use-case I would suggest leaving this out for now. 5. Subcluster information – although not primarily an issue for session terminate but rather for session list, in that no subcluster information is found so that termination limited to subcluster becomes impractical. The session termination summary could include a column SUBCLUSTER. This is more of a conversation to be brought to `tlctl session list' (bug 425). The primary use for the termination summary is to uniquely identify each session to be terminated, for which AGENT, USER, and DISPLAY is sufficient. 6. Abandon converted to standard termination – although clearly printed at the end of the termination summary what is about to happen combined with the yes/no prompt, perhaps a short message as to why not abandoning of sessions was redirected to termination. Potentially too much information and the admin is expected to be aware of this. Can this instead be viewed as a problem with the wording for the `--abandon' flag? Wouldn't renaming it to `--allow-abandon' remove this confusion? (In reply to William Sjöblom from comment #56) > (In reply to Tobias from comment #55) > > 1. User input case insensitivity. > > The place where this would be reasonable would be for the `--agent' > option as all other filter options are either case sensitive or > numeric. Since we do this filtering in the session store on the > server-side I feel a bit cautious about such a change as it has the > potential to require quite a lot of testing. I'll take a look and see > if it looks viable. I have investigated this and from what I can tell from `tlctl load' the only place where we are case-insensitive is for `tlctl load list --sort=<case-insensitive column name>'. Having case-insensitive column names is much less error-prone than case-insensitive usernames (on systems where applicable) and hostnames since case-insensitive comparisons are locale-dependent. I suggest that we break this one out into a separate bug covering case-insensitivity for tlctl as a whole. > 6. Abandon converted to standard termination – although clearly printed > at the end of the termination summary what is about to happen > combined with the yes/no prompt, perhaps a short message as to why > not abandoning of sessions was redirected to termination. Potentially > too much information and the admin is expected to be aware of this. > > Can this instead be viewed as a problem with the wording for the > `--abandon' flag? Wouldn't renaming it to `--allow-abandon' remove > this confusion? I have made some improvements to the `tlctl-session' man page, `tlctl session terminate' prints, and renamed `--abandon' to `--allow-abandon' to rectify these concerns. Note that the testing of bug 425 uncovered a couple of locale-related issues that are also present in `terminate'. So we should keep this in mind before marking this bug as resolved and during testing. (In reply to William Sjöblom from comment #59) I suggest that we break this one out into a separate bug > covering case-insensitivity for tlctl as a whole. Done! See bug 7876. The locale-related issues in bug 425 mentioned in comment 60 have now been solved. The issues could be seen in terminate when listing the sessions to terminate with C locale. Description of issues, copied from bug 425 comment 77: > • Running `LC_ALL=C tlctl session list' with a non-ASCII username in the > session store will result in the following output/stacktrace: > > ┌──── > > │ USER DISPLAY AGENT STATUS AGE > > │ ============================================= > > │ Traceback (most recent call last): > > │ File "/opt/thinlinc/sbin/tlctl", line 53, in <module> > > │ IIiIiiIiI ( ) > > │ File "/opt/thinlinc/sbin/tlctl", line 49, in IIiIiiIiI > > │ I1i [ i1111IIi ] . execute ( ii1IiIiiII , I1I111i11I [ 1 : ] , Iio0 . get_subparser ( i1111IIi ) ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 278, in execute > > │ iIIIII1i111i ( options , args [ 1 : ] , IIII1iII11ii ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 48, in iIIIII1i111i > > │ tlctl . print_as_table ( oO0 , ii1IiIiiII ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/__init__.py", line 220, in print_as_table > > │ print ( IIiIIIII1iiI . format ( * OO0OOOoOOooO ) ) > > │ UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 0: ordinal not in range(128) > > └──── > > • Running `LC_ALL=C tlctl session list --user=öl' (i.e. filtering by > non-ASCII value on an ASCII system) will result in the following > stacktrace: > > ┌──── > > │ Traceback (most recent call last): > > │ File "/opt/thinlinc/sbin/tlctl", line 53, in <module> > > │ IIiIiiIiI ( ) > > │ File "/opt/thinlinc/sbin/tlctl", line 49, in IIiIiiIiI > > │ I1i [ i1111IIi ] . execute ( ii1IiIiiII , I1I111i11I [ 1 : ] , Iio0 . get_subparser ( i1111IIi ) ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 278, in execute > > │ iIIIII1i111i ( options , args [ 1 : ] , IIII1iII11ii ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 35, in iIIIII1i111i > > │ ooo0O0oO00 = i1ii1 ( verify = False , filters = iiIII ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 18, in i1ii1 > > │ ooo0O0oO00 = tlctl . master_rpc ( "get_sessions" , verify , filters ) > > │ File "/opt/thinlinc/modules/thinlinc/tlctl/__init__.py", line 39, in master_rpc > > │ IIII11 = getattr ( i11IiIIiii1II , method ) ( * args ) > > │ File "/usr/lib64/python3.4/xmlrpc/client.py", line 1105, in __call__ > > │ return self.__send(self.__name, args) > > │ File "/usr/lib64/python3.4/xmlrpc/client.py", line 1439, in __request > > │ allow_none=self.__allow_none).encode(self.__encoding) > > │ UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 20 These were tested on RHEL 8. We will now replace unprintable characters with '?' when having non-ascii names in the session store, and abort and print an error message if we get input in an unknown encoding. Release notes were added earlier for all tlctl subcommands in commit r37928. With that being the last things to be checked, this bug is now ready for testing. Tested on Ubuntu 20.04 and RHEL8 using developers build 4.14.0post-2586. The following points have been verified: [x] Terminate using information deduced from session list. [x] tlctl session list does not explicitly print subclusters, although filtering w.r.t. subcluster is possible. Thus our subcluster knowledge can be applied when terminating sessions constrained to a subcluster. [x] Well-formulated optparse help. [x] Non-interactive termination using --assume-yes. [x] Non-interactive termination with --allow-abandon using --assume-yes. [x] Contraints --all, --agent, --user, --display, and --subcluster seem to be working as intended (note that sessions on agents in multiple subclusters may be terminated using either subcluster label). [x] The above point also applies for abandon. [x] The flag --abandon has been renamed --allow-abandon which feels more appropriate since --abandon gave the impression of forced abandon. [x] Improved explanation in docs what abandoning sessions imply. [x] Improved terminal message how to deal with unreachable sessions. [x] Summary is helpful both terminate and abandon. [x] Using tlctl in ascii env generates ?? when printing non-ascii characters. [x] Using tlctl in ascii env prints helpful error msg when called with non-ascii input. |