Bug 7800 - congestion control misbehaves with very large frame updates
Summary: congestion control misbehaves with very large frame updates
Status: NEW
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VNC
Version: trunk
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: LowPrio
Assignee: Bugzilla mail exporter
URL:
Keywords:
Depends on: 4735
Blocks: performance
Reported: 2021-11-29 17:23 CET by Pierre Ossman
Modified: 2023-02-14 14:14 CET
CC List: 0 users

See Also:
Acceptance Criteria:


Attachments

Description Pierre Ossman cendio 2021-11-29 17:23:04 CET
This is a continuation of bug 4735. If the wire is massively overcongested, we still see scenarios where the congestion control fails miserably and the RTT to the server increases massively.

The scenario happens when the updates are much larger than the BDP (bandwidth-delay product), which is easy to simulate with low latency and low bandwidth. As an example, we've tested 50 ms RTT and 128 kbps bandwidth. We have not explored any realistic scenarios at this point.

The problem seems to be an overly aggressive idle timeout in the congestion handling. TCP closes the congestion window once more than RTO time has passed since the last transmission, which is generally 2×RTT. We try to do the same, but our approximation is probably too simplistic.
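
A rough sketch of the kind of idle check described above, assuming the test is driven purely by our own write timestamps; the names here (CongestionState, shouldResetForIdle, lastWrite, minRTT) are illustrative, not the actual Congestion.cc code:

#include <algorithm>
#include <chrono>

using Clock = std::chrono::steady_clock;

struct CongestionState {
  Clock::time_point lastWrite;           // when *we* last handed data to the socket
  std::chrono::milliseconds minRTT{50};  // lowest RTT seen so far
};

bool shouldResetForIdle(const CongestionState& s, Clock::time_point now) {
  // Crude RTO approximation: roughly 2x RTT, floored at 100 ms here
  // (TCP itself never goes below 1 second, see RFC 6298).
  auto rto = std::max(2 * s.minRTT, std::chrono::milliseconds(100));

  // Flaw: this measures time since *our* last write, which is not the same
  // as the time since TCP last put a segment on the wire when data is still
  // sitting in local or kernel buffers.
  return (now - s.lastWrite) > rto;
}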

The major flaw is that we only measure the time since our own last transmission, which is not the same as the last transmission by TCP. We will likely have some buffering in between, and in the failing scenario this buffering is massive. So we need to make a guess as to when things left an uncongested state, and measure one RTO from there (see the sketch below).
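
A minimal sketch of that adjustment, assuming we can estimate how much data is still queued and roughly how fast it drains; the drain-time heuristic and the names (bufferedBytes, bandwidthBytesPerSec) are assumptions for illustration, not the shipped fix:

#include <chrono>
#include <cstddef>

using Clock = std::chrono::steady_clock;

bool shouldResetForIdle(Clock::time_point lastWrite,
                        std::size_t bufferedBytes,    // data still queued locally / in the kernel
                        double bandwidthBytesPerSec,  // current bandwidth estimate
                        std::chrono::milliseconds rto,
                        Clock::time_point now) {
  // Guess when the last of the queued data will actually have been sent.
  auto drainTime = std::chrono::duration_cast<Clock::duration>(
      std::chrono::duration<double>(bufferedBytes / bandwidthBytesPerSec));
  auto lastWireActivity = lastWrite + drainTime;

  // Only treat the connection as idle once a full RTO has passed since the
  // estimated end of transmission, not since our last write() call.
  return now > lastWireActivity && (now - lastWireActivity) > rto;
}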

TCP also never sets the RTO below 1 second, but we've set the lower limit to 100 ms. It's not clear whether that was meant as a cautious value.

Lastly, TCP sets the RTO based on the current RTT (and its variation), whilst we use the minimum RTT instead (see the sketch below).
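
For reference, TCP derives its RTO from the smoothed RTT and its variation per RFC 6298, with a 1 second floor; a condensed sketch (the clock granularity term is omitted):

#include <algorithm>
#include <chrono>
#include <cmath>

struct RttEstimator {
  double srtt = 0.0;    // smoothed RTT, seconds
  double rttvar = 0.0;  // RTT variation, seconds
  bool first = true;

  void addSample(double rttSeconds) {
    if (first) {
      srtt = rttSeconds;
      rttvar = rttSeconds / 2.0;
      first = false;
    } else {
      // RFC 6298: beta = 1/4, alpha = 1/8
      rttvar = 0.75 * rttvar + 0.25 * std::fabs(srtt - rttSeconds);
      srtt = 0.875 * srtt + 0.125 * rttSeconds;
    }
  }

  std::chrono::milliseconds rto() const {
    // RTO = SRTT + 4 * RTTVAR, never below 1 second.
    double seconds = std::max(1.0, srtt + 4.0 * rttvar);
    return std::chrono::milliseconds(static_cast<long long>(seconds * 1000.0));
  }
};

The contrast with our current approach is the input: the minimum RTT only captures the link's best case, while SRTT/RTTVAR track what the connection is actually experiencing.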
Comment 1 Pierre Ossman cendio 2021-11-29 17:28:15 CET
If you enable congestion control logging, you can see this from Xvnc:

> Mon Nov 29 16:25:44 2021
>  Congestion:  Connection idle for 424 ms, resetting congestion control
> 
> Mon Nov 29 16:25:50 2021
>  Congestion:  Connection idle for 6334 ms, resetting congestion control
> 
> Mon Nov 29 16:25:52 2021
>  Congestion:  Connection idle for 492 ms, resetting congestion control
> 
> Mon Nov 29 16:26:05 2021
>  Congestion:  Connection idle for 565 ms, resetting congestion control

Since there is a steady flow of data, this should not happen. There is always data ready to be sent, i.e. we are never truly idle.

Also note that the congestion control rarely manages to get a proper measurement in place.
Comment 2 Pierre Ossman cendio 2023-02-14 14:14:04 CET
There is some discussion about this upstream:

https://github.com/TigerVNC/tigervnc/commit/a99d14d1939cb2338b6268d9aebe3850df66daed#r57748408
