Bug 8566 - Load is unfair if agents are unevenly matched
Summary: Load is unfair if agents are unevenly matched
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VSM Server
Version: trunk
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: 4.19.0
Assignee: Samuel Mannehed
URL:
Keywords: aleze_tester, relnotes
Depends on:
Blocks: 280
Reported: 2025-04-03 14:38 CEST by Samuel Mannehed
Modified: 2025-04-24 10:02 CEST

See Also:
Acceptance Criteria:
MUST

* It must be possible to configure different weights for different agents.
* Any new configuration must be documented.

SHOULD

* Web Admin and tlctl should show information about differently weighted agents.
* The weight configuration should be optional and shouldn't bother users who don't need it.
* The weight configuration should be easy to understand at a glance.

COULD

* Information about differently weighted agents could only be presented if there is any weight difference in the cluster.



Description Samuel Mannehed cendio 2025-04-03 14:38:37 CEST
With the new load balancer (bug 4429), users will be distributed evenly across the agents. This is fine as long as the agents are somewhat similar resource-wise. But if one agent has half the resources of another, it is not ideal that both could end up with, let's say, 40 user sessions each.

In some scenarios the sysadmin might want to distribute the users in a different way.
Comment 1 Samuel Mannehed cendio 2025-04-03 14:41:46 CEST
Note that the following is currently written in the "Load balancing" section of the documentation:
> The resulting user distribution works best if the agents have similar hardware resources.
This is a reminder to update the TAG depending on the solution chosen here.
Comment 2 Samuel Mannehed cendio 2025-04-03 14:51:32 CEST
One idea that was discussed internally was to add support for per-agent weights. This would fit well into cluster.hconf, with the agent's hostname as the parameter key:
> [/agents/weights]
> agent1.thinlinc.com = 50
> agent2.thinlinc.com = 200
If not specified, the weight would default to 100, a number that should be easy to work with.
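
To make the idea concrete, here is a minimal sketch in Python of how such weights could steer placement, assuming the balancer picks the agent with the lowest ratio of current sessions to configured weight (a hypothetical illustration, not the actual VSM server code):

    # Hypothetical sketch -- not the actual VSM server implementation.
    # Pick the agent whose session count is lowest relative to its weight.
    DEFAULT_WEIGHT = 100

    def pick_agent(agents, weights, sessions):
        # agents: list of agent hostnames
        # weights: dict mapping hostname -> configured weight
        # sessions: dict mapping hostname -> current session count
        def relative_load(agent):
            return sessions.get(agent, 0) / weights.get(agent, DEFAULT_WEIGHT)
        return min(agents, key=relative_load)

Under such a scheme, agent2.thinlinc.com (weight 200) would over time receive about four times as many sessions as agent1.thinlinc.com (weight 50).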
Comment 3 Samuel Mannehed cendio 2025-04-15 14:38:16 CEST
There was some internal discussion about the name "weights" and how people expect it to work.

- Some of us expected it to refer to the weight of users, and thus a higher
  weight would result in fewer users on that agent.
- But the majority expected it to refer to the weight of the agent, and that a
  higher weight would result in more users on that agent.

I looked at some other products that included a load balancer and found several examples that used the word "weight". In all of the examples I found, a higher weight resulted in the server having to handle a higher load.

The examples I found were:

- VMware
  https://www.vmware.com/topics/round-robin-load-balancing
- Windows Server 2025 NLB
  https://learn.microsoft.com/en-us/powershell/module/networkloadbalancingclusters/set-nlbclusterportrulenodeweight?view=windowsserver2025-ps
- Oracle GlassFish
  https://docs.oracle.com/html/E24938_01/configure-lb-weight.htm
- Google Cloud
  https://cloud.google.com/load-balancing/docs/network/configure-weighted-netlb

Given the above reasoning, we will move ahead with the term "weights" and a higher weight will result in a higher load.
Comment 18 Samuel Mannehed cendio 2025-04-16 10:55:24 CEST
This should be done now! Tested build 3998 on a cluster with 3 CentOS 8 machines.

> MUST
> 
> * It must be possible to configure different weights for different agents.
Yes, this is done under /agents/weights/<hostname>, found in cluster.hconf.

> * Any new configuration must be documented.
It is.

> SHOULD
> 
> * Web Admin and tlctl should show information about differently weighted agents.
Yes, they do, provided that an agent's weight differs from the default.

> * The weight configuration should be optional and shouldn't bother users who don't need it.
If an agent isn't specified under /agents/weights/, it will get the default weight of 100.

> * The weight configuration should be easy to understand at a glance.
It should be; we use the same naming and "direction" of values as other products, as described in comment #3.

> COULD
> 
> * Information about differently weighted agents could only be presented if there is any weight difference in the cluster.
If the weight of any agent differs from the default (100), a new column will show the weight of all agents.
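
For illustration, the logic described above boils down to something like the following (a hypothetical sketch, not the actual Web Admin or tlctl code):

    # Hypothetical sketch of when to show the weight column.
    DEFAULT_WEIGHT = 100

    def should_show_weight_column(weights):
        # weights: dict mapping hostname -> configured weight
        return any(weight != DEFAULT_WEIGHT for weight in weights.values())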
Comment 29 Alexander Zeijlon cendio 2025-04-23 16:25:57 CEST
Testing:
========

I used three RHEL 9 VMs, where one acts as both master and agent, and the other two act as agents only.

On all agents, I created the same 28 user accounts, since 28 is a tenth of the number of users in the examples in the TAG. The license limit was set well above the number of created users.

In each scenario below, I started with an empty cluster, logged in with all created users, and observed the load balancing behavior.

Scenarios:
==========

* All agents have weight unset:

    The users are distributed evenly on the agents.


* All agents have invalid weights set (-100, 0, asd):

    The users are distributed evenly on the agents.


* All agents have the same (non-default) weight set:

    The users are distributed evenly on the agents.


* Agents have the weights 50, 100 and 200:

    Users are distributed according to the weights -- 4, 8 and 16 respectively.


* Agents have the weights 50 and 200, and the last agent is draining:

    Users are distributed according to the weights of the active agents -- 5
    and 23 respectively.


* One agent with weight 50 in one subcluster, and two agents with weight 100
  in another subcluster. 14 users were assigned to each subcluster:

    Users are distributed according to the weights of the agents per subcluster.
    The single agent gets all 14 of its users, and the subcluster with two
    agents distributes its users evenly.


* Same setup as the previous point, but with an overlap where one agent with
  weight 100 is in both subclusters: [[50, [ 100 ]], 100].

    Here, the distribution of users depends on the order in which users
    belonging to either of the two subclusters log in. I tested three scenarios.

    Users log in to the first subcluster first --  [[4, [ 12 ]], 12]
    Users log in to the second subcluster first -- [[7, [ 14 ]],  7]
    Users alternate between the two subclusters -- [[6, [ 11 ]], 11]

    This may be a bit of an edge case, but the first two scenarios show that it
    is possible to get a user distribution that doesn't match the configured
    weights. But at the same time, two out of the three scenarios above trend
    towards the configured weights.
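
As a sanity check on the numbers above, here is my own arithmetic, assuming users are assigned proportionally to the weights of the active (non-draining) agents:

    # Expected user counts, assuming proportional assignment.
    def expected(total_users, weights):
        return [total_users * w / sum(weights) for w in weights]

    print(expected(28, [50, 100, 200]))  # [4.0, 8.0, 16.0] -- matches 4, 8, 16
    print(expected(28, [50, 200]))       # [5.6, 22.4] -- observed: 5 and 23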
Comment 33 Alexander Zeijlon cendio 2025-04-24 10:02:46 CEST
I looked through the code, and comments made on commits have been addressed.

Closing!
