7800 – congestion control misbehaves with very large frame updates

Status:	NEW

Alias:	None

Product:	ThinLinc
Classification:	Unclassified
Component:	VNC
Version:	trunk
Hardware:	PC Unknown

Importance:	P2 Normal
Target Milestone:	LowPrio
Assignee:	Bugzilla mail exporter

URL:
Keywords:

Depends on:	4735
Blocks:	performance

Reported:	2021-11-29 17:23 CET by Pierre Ossman
Modified:	2023-02-14 14:14 CET
CC List:	0 users

See Also:
Acceptance Criteria:

Description Pierre Ossman cendio

2021-11-29 17:23:04 CET

This is a continuation of bug 4735. If you massively congest the wire, we still see scenarios where congestion control fails badly and the RTT to the server increases dramatically.

The scenario happens when the updates are much larger than the BDP (bandwidth-delay product), which is easy to simulate by combining low latency with low bandwidth. As an example, we've tested a 50 ms RTT with 128 kbps of bandwidth. We have not explored any realistic scenarios at this point.
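To put numbers on "much larger than the BDP", here is a rough calculation for the tested link. The 100 kB update size is a made-up figure for illustration and is not from our testing; kbps is assumed to mean 1000 bits per second.

```python
def bdp_bytes(rtt_s: float, bandwidth_bps: float) -> float:
    """Bandwidth-delay product in bytes."""
    return rtt_s * bandwidth_bps / 8


RTT_S = 0.050             # 50 ms RTT, as in the test above
BANDWIDTH_BPS = 128_000   # 128 kbps, as in the test above
UPDATE_BYTES = 100_000    # hypothetical frame update size (assumption)

bdp = bdp_bytes(RTT_S, BANDWIDTH_BPS)
# Time the link needs to carry one such update at full rate.
drain_s = UPDATE_BYTES * 8 / BANDWIDTH_BPS

print(f"BDP: {bdp:.0f} bytes")
print(f"One update is {UPDATE_BYTES / bdp:.0f}x the BDP")
print(f"Time to drain one update: {drain_s:.2f} s")
```

Under these assumptions a single update takes several seconds to leave the host, far longer than the RTT, so most of it has to sit in buffers somewhere.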

The problem seems to be an overly aggressive idle timeout in the congestion handling. TCP closes the congestion window once more than one RTO has passed since the last transmission. This is generally 2×RTT. We try to do the same, but our approximation is probably too simplistic.

The major flaw is that we only measure the time since our last transmission. This is not the same as the last transmission by TCP. We will likely have some buffering, and in the failing scenario this buffering is massive. So we need to make a guess as to when things left an uncongested state, and measure one RTO from there.
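One possible way to make such a guess, sketched below, is to estimate when the buffered data will have drained and count idle time from that point rather than from our last write. This is only an illustration and not the actual Xvnc code; the `IdleEstimator` class, its methods and the fixed bandwidth estimate are all hypothetical.

```python
class IdleEstimator:
    """Hypothetical helper that guesses when buffered data has left the host."""

    def __init__(self, bandwidth_bps: float):
        self.bandwidth_bps = bandwidth_bps  # assumed known bandwidth estimate
        self.drain_time = 0.0               # estimated time the buffers run empty

    def on_send(self, now: float, nbytes: int) -> None:
        # New data queues up behind whatever is still buffered.
        start = max(now, self.drain_time)
        self.drain_time = start + nbytes * 8 / self.bandwidth_bps

    def idle_for(self, now: float) -> float:
        # Idle time counts from the estimated drain point, not the last write.
        return max(0.0, now - self.drain_time)

    def is_idle(self, now: float, rto: float) -> bool:
        return self.idle_for(now) > rto


est = IdleEstimator(bandwidth_bps=128_000)
est.on_send(now=0.0, nbytes=100_000)   # hypothetical 100 kB update
print(est.is_idle(now=1.0, rto=0.1))   # data still draining
print(est.is_idle(now=6.5, rto=0.1))   # past the estimated drain point plus RTO
```

A check that only looks at the time since the last write would already report idle at the one second mark here, even though the update is still being transmitted.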

TCP also never sets RTO below 1 second, but we've set the limit to 100 ms. Not sure if that was an attempt at a cautionary value.

Lastly, TCP sets RTO based on current RTT (and its variation), whilst we use the minimum RTT instead.
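For comparison, this is roughly how RFC 6298 derives RTO from the smoothed RTT and its variation. It is a sketch of the standard algorithm, not of our code; the class name and the `min_rto` and `granularity` parameters are made up for the example.

```python
class RtoEstimator:
    """Sketch of the RFC 6298 retransmission timeout computation."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4

    def __init__(self, min_rto: float = 1.0, granularity: float = 0.001):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto          # RFC 6298 recommends 1 second
        self.granularity = granularity  # assumed clock granularity

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        return max(self.min_rto, rto)


tcp_like = RtoEstimator()             # 1 second floor
ours_like = RtoEstimator(min_rto=0.1) # 100 ms floor, as in our limit
print(tcp_like.update(0.050))
print(ours_like.update(0.050))
```

Note that this tracks the current RTT, so the timeout grows as the link gets congested. A timeout based on the minimum RTT does not.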

Comment 1 Pierre Ossman cendio

2021-11-29 17:28:15 CET

If you enable congestion control logging you can see this from Xvnc:

> Mon Nov 29 16:25:44 2021
>  Congestion:  Connection idle for 424 ms, resetting congestion control
> 
> Mon Nov 29 16:25:50 2021
>  Congestion:  Connection idle for 6334 ms, resetting congestion control
> 
> Mon Nov 29 16:25:52 2021
>  Congestion:  Connection idle for 492 ms, resetting congestion control
> 
> Mon Nov 29 16:26:05 2021
>  Congestion:  Connection idle for 565 ms, resetting congestion control

Since there is a steady flow of data this should not happen. There is always data ready to be sent, i.e. we are never truly idle.

Also note that it rarely manages to get a proper measurement in place.

Comment 2 Pierre Ossman cendio

2023-02-14 14:14:04 CET

There is some discussion about this upstream:

https://github.com/TigerVNC/tigervnc/commit/a99d14d1939cb2338b6268d9aebe3850df66daed#r57748408