Bug 7833 - Command line tool for terminating and abandoning sessions
Summary: Command line tool for terminating and abandoning sessions
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Other (show other bugs)
Version: trunk
Hardware: PC Unknown
: P2 Normal
Target Milestone: 4.15.0
Assignee: William Sjöblom
URL:
Keywords: relnotes, tobfa_tester
Depends on: 7855
Blocks: 3707
  Show dependency treegraph
 
Reported: 2022-02-09 10:19 CET by Linn
Modified: 2022-04-20 16:18 CEST (History)
2 users (show)

See Also:
Acceptance Criteria:
1. Based on the output from `tlctl session list', it should be easy to interactively terminate or abandon specific sesssion(s). 2. The user should be prompted for confirmation before terminating or abandoning session(s). 3. Killing and abandoning sessions should be possible when running non-interactively. 4. The user should be given a helpful error message if the termination or abandoning operation was not successful. 5. Accidentally abandoning session(s) instead of terminating them should be made difficult. 6. Common termination tasks should be made easy.


Attachments

Description Linn cendio 2022-02-09 10:19:47 CET
There is no command-line tool for terminating sessions. Currently, you have to do it through the tlwebadm GUI, which can be a bit inconvenient.
Comment 1 William Sjöblom cendio 2022-02-17 13:42:14 CET
Here are a bunch of user stories that I threw together for the subject matter:

• As a system administrator, when contacted by a ThinLinc end-user
  having problems with their session, I want to be able to kill that
  session remotely so that I won't have to guide an inexperienced user
  to perform this themselves.

• As a system administrator that has permanently lost an agent node in
  the cluster, I want to be able to abandon the sessions that used to
  live on that agent node so that the users with session(s) on the lost
  agent are not prompted the next time they connect.

• As a system administrator, I want to be able to kill sessions that has
  been inactive for a time X, so that we do not run into licensing
  limits. (issue25522)

• As a system administrator, I want to be prompted for confirmation when
  performing destructive operations so that I feel less anxious to
  accidentally press <RETURN>.
Comment 3 William Sjöblom cendio 2022-02-21 17:10:11 CET
One thing that we have to decide is how we specify a session to terminate on the command line.

Such an identifier need to have (at least) the following properties:
- No (or negligible) risk of collisions
- Easy to type
- Relatively compact such that it can be printed when listing sessions

For webadm we currently use the string `<agent_fqdn>:<display>` in combination with the username for uniquely identifying sessions in the session database. Ideally, I'd like something more compact but we could use something similar as a starting point.
Comment 4 William Sjöblom cendio 2022-02-21 17:17:18 CET
Another matter worth considering is how we handle the case where the user specifies multiple sessions to be terminated:
> tlctl session terminate <id-1> <id-2>
Where the session <id-1> perfectly resolves to a session in the database, but <id-2> is invalid. 

Do we behave like `rm`, `kill`, et al. and terminate <id-1> while printing an error for <id-2> to stderr, or do we exit early without touching <id-1>?
Comment 6 William Sjöblom cendio 2022-02-23 10:55:57 CET
An initial version has now been committed. This initial version
identifies individual sessions as follows:

`tlctl session terminate <termserv_hostname>:<username>:<display>'

This is more or less the same approach as the one used in webadm where
sessions are looked up by the username in combination with the string
`<termserv_hostname>:<display>'. As usernames (and IPv6 addresses for
that matter) can contain `:', this is maybe not the best approach.

A conscious choice was to avoid parsing each identifier (which
negatively impacts run time, but rids us of potential `:' parsing
difficulties). Nonetheless, we are still in O(n) territory with respect 
to the number of sessions, so this approach sounds fine in this
regard. Regardless, we should make some quantitative measurements to
support this claim if we decide to keep the current implementation.

What I have identified that remains to be addressed:
• Interactive prompt when terminating/abandoning sessions
• Tidy up error handling/messages
• Investigating other potentially better approaches to session
  identifiers
• Investigate the `get_sessions' and `kill_session' race.
  • This also includes investigating how we can determine if a session
    is can be terminated (or can only be abandoned) since
    `sessinfo['status']' may not be up to date.
• Decide on how we should differentiate killing and abandoning sessions
  • Do we use different subcommands?
  • Do we use an `abandon' flag to the `terminate' subcommand?
Comment 26 William Sjöblom cendio 2022-03-08 09:07:12 CET
Another user story that popped up:

As a system administrator, I want to be able to terminate all sessions in the entire cluster so that I can perform system maintenance that would interfere with active sessions.
Comment 35 William Sjöblom cendio 2022-03-09 14:25:09 CET
I encountered bug 7855 which breaks username-constrained terminating of sessions.  We will need to fix this bug first unless we want to introduce nasty workarounds.
Comment 45 William Sjöblom cendio 2022-03-11 12:59:41 CET
I now consider the `tlctl session terminate' subcommand ready for
testing. I have tested the following functionality myself using
server build #2503 running on RHEL8:

• [5/5] Terminate/Abandon
  • ☑ Constrained by display
  • ☑ Constrained by username
  • ☑ Constrained by agent (partially, see [1])
  • ☑ Individual session
  • ☑ All sessions

Some things worth considering for the tester:
• Is the overall user experience good enough?
• [1] Do things work as expected in clusters consisting of more than
  one agent (potentially with varying up/down status)?
• And, as always, are the acceptance criteria fulfilled, and are there
  any other flaws that I haven't considered?


At one point there will also be some form of argument validation
added. So let's wait until then before marking this bug as resolved.
Comment 46 William Sjöblom cendio 2022-03-11 13:16:01 CET
(In reply to William Sjöblom from comment #6)
> An initial version has now been committed. This initial version
> identifies individual sessions as follows:
> 
> `tlctl session terminate <termserv_hostname>:<username>:<display>'

Worth noting is that this has changed substantially since the initial version was committed. Instead of using session identifiers we now use the same constraint/filtering flags as `tlctl session list' for selecting which sessions are to be terminated. This was done to avoid the rather daunting task of properly implementing session identifiers.
Comment 48 Linn cendio 2022-03-14 09:00:33 CET
We now validate input usernames to see if they exist when 0 sessions are found. Tested this and the warning about input not being a valid user is correctly printed.

This was the last thing to be checked, setting to resolved.
Comment 49 Linn cendio 2022-03-22 09:26:49 CET
Bug 425 handling the listing of sessions now has a subcluster flag used for listing all sessions in a subcluster. We should add the same flag to this subcommand to keep the symmetry between list and terminate.

The texts in --help and the TAG should also be updated.
Comment 53 Linn cendio 2022-03-23 15:36:17 CET
Tested using the subcluster flag to terminate sessions in a subcluster, works well. Also checked the texts in --help and man page/TAG, looks ok.
Comment 54 William Sjöblom cendio 2022-03-25 09:44:26 CET
While testing bug 425 I found a couple of issues in code that is shared between `tlctl session list' and `terminate'. So for the future tester, have a look at bug 425, comment 78.
Comment 55 Tobias cendio 2022-03-29 10:52:37 CEST
Test specs
==========
Server build thinlinc-server-4.14.0post-2538.x86_64 installed on RHEL8.

Acceptance criterias
====================

1. Based on the output from `tlctl session list', it should be easy to
interactively terminate or abandon specific session(s).

The list module prints users, displays, agents, status, and age. In the
termination module, one has the option to terminate by users, displays, agents,
and subclusters.

* Very wide column elements are truncated, much like other tlctl submodules that
  truncate the output, rendering copying of such identifiers to the command line
  impossible (in this case user and agent names are in trouble). This issue is
  dealt with in Bug 7848 which proposes a script-friendly version of the output.
* It feels like option to terminate sessions >AGE is missing, based on age
  information from session list.
* Subcluster info missing from session list, forcing the admin to look at the
  web ui to acquire subcluster population. Also no subcluster info in
  termination summary.
* Unable to parse multiple arguments, for instance --user USER1 USER2 or "USER1
  USER2". Does not return any misuse information. Touched upon in comment #4. On
  the other hand, user stories does not convey a high priority.

2. The user should be prompted for confirmation before terminating or abandoning
session(s).

The user is prompted confirmation with default being No.

3. Killing and abandoning sessions should be possible when running
non-interactively.

The -y (--assume-yes) flag parses a yes-answer to all prompts, enabling
non-interactive scripted termination.

4. The user should be given a helpful error message if the termination or
abandoning operation was not successful.

When sessions are unreachable, termination is not conducted, but references
--abandon instead, although not calling it without the admin explicitly doing
so.

5. Accidentally abandoning session(s) instead of terminating them should be made
difficult.

Abandoning sessions translates into termination if they are reachable, although
no information as to why the redirection took place is given. However,
termination call does not conversely translate into abandon if sessions are
unreachable, but must be called explicitly.

6. Common termination tasks should be made easy.

Simple termination tasks based on identifiers agent, user, display, and
subcluster is perfectly possible. Although filtered multiple termination is not
implemented, for instance --user="USER1 USER2"

Summary of suggestions
======================

1) User input case insensitivity.

2) Wording in Optionparser help page:
   --all           terminate all sessions on all agents without constraints
-> --all           terminate all sessions

   --agent=AGENT   constrain to sessions residing on AGENT
-> --agent=AGENT   limit to sessions residing on AGENT

The one mentioned for agent applies to user, display, and subcluster.

3) Multiple arguments as in --user="USER1 USER2".

4) Option to terminate by session age.

5) Subcluster information -- although not primarily an issue for session
terminate but rather for session list, in that no subcluster information is
found so that termination limited to subcluster becomes impractical. The session
termination summary could include a column SUBCLUSTER.

6) Abandon converted to standard termination -- although clearly printed at the
end of the termination summary what is about to happen combined with the yes/no
prompt, perhaps a short message as to why not abandoning of sessions was
redirected to termination. Potentially too much information and the admin is
expected to be aware of this.
Comment 56 William Sjöblom cendio 2022-03-29 16:02:27 CEST
(In reply to Tobias from comment #55)

1. User input case insensitivity.

   The place where this would be reasonable would be for the `--agent'
   option as all other filter options are either case sensitive or
   numeric. Since we do this filtering in the session store on the
   server-side I feel a bit cautious about such a change as it has the
   potential to require quite a lot of testing. I'll take a look and see
   if it looks viable.

2. Wording in Optionparser help page

   Regarding the `--all' description, I wanted to make it absolutely
   clear that this is a potentially very destructive flag that should be
   used with caution. Did you see any other issues with this description
   other than its wordiness?

   I think "constraint" is the most adequate wording here. A
   "constraint" generally refers to something that is externally enforced
   (as is the case here), while a "limitation" tends to refer to
   something that is less externally enforced and more of a natural
   feature of the system.

3. Multiple arguments as in –user="USER1 USER2".

   This could be a possible extension, potentially using
   `action="append"' in the `add_option' call. However, such a syntax
   could potentially cause some confusion:

>  ┌────
>  │ $ tlctl session terminate --user=foo --subcluster=baz --user=bar
>  └────

   What I'm hinting at here is that it is potentially not trivial to
   understand that the above would expand to the following boolean
   query:

   `(user=foo OR user=bar) AND (subcluster=baz)' instead of `(user=foo
     AND subcluster=baz AND user=bar)'

   This may be a drawback of this approach. Another less confusing
   approach may be to use a delimited string containing multiple
   usernames. But in that case, we run into the issue of finding a
   suitable delimiter since Unix usernames can, at least in theory,
   contain basically anything. Still, as mentioned, we currently have no
   clear user stories backing this feature.

4. Option to terminate by session age.

   We are yet to find a user story backing this use case. On top of this
   it also introduces some parsing effort, so unless there is a clear
   and common use-case I would suggest leaving this out for now.

5. Subcluster information – although not primarily an issue for session
   terminate but rather for session list, in that no subcluster
   information is found so that termination limited to subcluster
   becomes impractical. The session termination summary could include a
   column SUBCLUSTER.

   This is more of a conversation to be brought to `tlctl session list'
   (bug 425). The primary use for the termination summary is to uniquely
   identify each session to be terminated, for which AGENT, USER, and
   DISPLAY is sufficient.

6. Abandon converted to standard termination – although clearly printed
   at the end of the termination summary what is about to happen
   combined with the yes/no prompt, perhaps a short message as to why
   not abandoning of sessions was redirected to termination. Potentially
   too much information and the admin is expected to be aware of this.

   Can this instead be viewed as a problem with the wording for the
   `--abandon' flag? Wouldn't renaming it to `--allow-abandon' remove
   this confusion?
Comment 59 William Sjöblom cendio 2022-03-31 09:41:55 CEST
(In reply to William Sjöblom from comment #56)
> (In reply to Tobias from comment #55)
> 
> 1. User input case insensitivity.
> 
>    The place where this would be reasonable would be for the `--agent'
>    option as all other filter options are either case sensitive or
>    numeric. Since we do this filtering in the session store on the
>    server-side I feel a bit cautious about such a change as it has the
>    potential to require quite a lot of testing. I'll take a look and see
>    if it looks viable.
I have investigated this and from what I can tell from `tlctl load' the only place where we are case-insensitive is for `tlctl load list --sort=<case-insensitive column name>'. Having case-insensitive column names is much less error-prone than case-insensitive usernames (on systems where applicable) and hostnames since case-insensitive comparisons are locale-dependent. I suggest that we break this one out into a separate bug covering case-insensitivity for tlctl as a whole.
 
> 6. Abandon converted to standard termination – although clearly printed
>    at the end of the termination summary what is about to happen
>    combined with the yes/no prompt, perhaps a short message as to why
>    not abandoning of sessions was redirected to termination. Potentially
>    too much information and the admin is expected to be aware of this.
> 
>    Can this instead be viewed as a problem with the wording for the
>    `--abandon' flag? Wouldn't renaming it to `--allow-abandon' remove
>    this confusion?
I have made some improvements to the `tlctl-session' man page, `tlctl session terminate' prints, and renamed `--abandon' to `--allow-abandon' to rectify these concerns.
Comment 60 William Sjöblom cendio 2022-03-31 09:44:01 CEST
Note that the testing of bug 425 uncovered a couple of locale-related issues that are also present in `terminate'. So we should keep this in mind before marking this bug as resolved and during testing.
Comment 61 William Sjöblom cendio 2022-03-31 10:27:46 CEST
(In reply to William Sjöblom from comment #59)
I suggest that we break this one out into a separate bug
> covering case-insensitivity for tlctl as a whole.

Done! See bug 7876.
Comment 63 Linn cendio 2022-04-05 10:53:39 CEST
The locale-related issues in bug 425 mentioned in comment 60 have now been solved. The issues could be seen in terminate when listing the sessions to terminate with C locale. 

Description of issues, copied from bug 425 comment 77:

> • Running `LC_ALL=C tlctl session list' with a non-ASCII username in the
>   session store will result in the following output/stacktrace:
> > ┌────
> > │ USER    DISPLAY  AGENT    STATUS     AGE
> > │ =============================================
> > │ Traceback (most recent call last):
> > │   File "/opt/thinlinc/sbin/tlctl", line 53, in <module>
> > │     IIiIiiIiI ( )
> > │   File "/opt/thinlinc/sbin/tlctl", line 49, in IIiIiiIiI
> > │     I1i [ i1111IIi ] . execute ( ii1IiIiiII , I1I111i11I [ 1 : ] , Iio0 . get_subparser ( i1111IIi ) )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 278, in execute
> > │     iIIIII1i111i ( options , args [ 1 : ] , IIII1iII11ii )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 48, in iIIIII1i111i
> > │     tlctl . print_as_table ( oO0 , ii1IiIiiII )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/__init__.py", line 220, in print_as_table
> > │     print ( IIiIIIII1iiI . format ( * OO0OOOoOOooO ) )
> > │ UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 0: ordinal not in range(128)
> > └────
>
> • Running `LC_ALL=C tlctl session list --user=öl' (i.e. filtering by
>   non-ASCII value on an ASCII system) will result in the following
>   stacktrace:
> > ┌────
> > │ Traceback (most recent call last):
> > │   File "/opt/thinlinc/sbin/tlctl", line 53, in <module>
> > │     IIiIiiIiI ( )
> > │   File "/opt/thinlinc/sbin/tlctl", line 49, in IIiIiiIiI
> > │     I1i [ i1111IIi ] . execute ( ii1IiIiiII , I1I111i11I [ 1 : ] , Iio0 . get_subparser ( i1111IIi ) )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 278, in execute
> > │     iIIIII1i111i ( options , args [ 1 : ] , IIII1iII11ii )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 35, in iIIIII1i111i
> > │     ooo0O0oO00 = i1ii1 ( verify = False , filters = iiIII )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/session.py", line 18, in i1ii1
> > │     ooo0O0oO00 = tlctl . master_rpc ( "get_sessions" , verify , filters )
> > │   File "/opt/thinlinc/modules/thinlinc/tlctl/__init__.py", line 39, in master_rpc
> > │     IIII11 = getattr ( i11IiIIiii1II , method ) ( * args )
> > │   File "/usr/lib64/python3.4/xmlrpc/client.py", line 1105, in __call__
> > │     return self.__send(self.__name, args)
> > │   File "/usr/lib64/python3.4/xmlrpc/client.py", line 1439, in __request
> > │     allow_none=self.__allow_none).encode(self.__encoding)
> > │ UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 20

These were tested on RHEL 8. We will now replace unprintable characters with '?' when having non-ascii names in the session store, and abort and print an error message if we get input in an unknown encoding.

Release notes were added earlier for all tlctl subcommands in commit r37928.


With that being the last things to be checked, this bug is now ready for testing.
Comment 66 Tobias cendio 2022-04-20 16:18:55 CEST
Tested on Ubuntu 20.04 and RHEL8 using developers build 4.14.0post-2586. The following points have been verified:

[x] Terminate using information deduced from session list.
[x] tlctl session list does not explicitly print subclusters, although filtering w.r.t. subcluster is possible. Thus our subcluster knowledge can be applied when terminating sessions constrained to a subcluster.
[x] Well-formulated optparse help.
[x] Non-interactive termination using --assume-yes.
[x] Non-interactive termination with --allow-abandon using --assume-yes.
[x] Contraints --all, --agent, --user, --display, and --subcluster seem to be working as intended (note that sessions on agents in multiple subclusters may be terminated using either subcluster label).
[x] The above point also applies for abandon.

[x] The flag --abandon has been renamed --allow-abandon which feels more appropriate since --abandon gave the impression of forced abandon.
[x] Improved explanation in docs what abandoning sessions imply.
[x] Improved terminal message how to deal with unreachable sessions.
[x] Summary is helpful both terminate and abandon.

[x] Using tlctl in ascii env generates ?? when printing non-ascii characters.
[x] Using tlctl in ascii env prints helpful error msg when called with non-ascii input.

Note You need to log in before you can comment on or make changes to this bug.