Bug 4735 - congestion control misbehaves with large frame updates
Summary: congestion control misbehaves with large frame updates
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: VNC
Version: trunk
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: 4.10.0
Assignee: Pierre Ossman
URL:
Keywords: relnotes
Depends on: 7158
Blocks: performance 7800
Reported: 2013-07-01 11:42 CEST by Pierre Ossman
Modified: 2021-11-29 17:23 CET

See Also:
Acceptance Criteria:
* Performance should be equal, or better, compared to an old server for all network conditions
* Frame rate (and bandwidth) should reach the maximum faster compared to the old server (mostly noticeable with high latency)


Attachments
Measurement for current code (347.00 KB, text/csv)
2017-03-31 11:08 CEST, Pierre Ossman
Measurement for new code (679.94 KB, text/csv)
2017-03-31 11:09 CEST, Pierre Ossman
Gnuplot script (149 bytes, text/plain)
2017-03-31 11:09 CEST, Pierre Ossman

Description Pierre Ossman cendio 2013-07-01 11:42:54 CEST
Currently a VNC screen update is sent all in one go. This is problematic when we have large updates as it interferes with the congestion control. The congestion control relies on "ping" packets between the client and server, and if the delay between these packets gets too large, it reverts back to 1 update per RTT.

We should introduce a mechanism to split up framebuffer updates over several messages. That way we could interleave other commands between the pieces.

One easy way to solve this is to introduce a new "end of frame" pseudo-encoding. The client would then look for such a rect instead of triggering on the FBU message.
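A rough sketch of what the client side could look like with such a pseudo-encoding. The encoding value and all helper names below are made up for illustration; nothing here is part of the RFB spec or the existing code:

const int pseudoEncodingEndOfFrame = -1000;      // assumed value

struct Rect { int x, y, w, h; };

// Assumed helpers that read from/act on the RFB stream.
void readRectHeader(Rect* r, int* encoding);
void readRectData(const Rect& r, int encoding);
void endOfFrame();

void readFramebufferUpdate()
{
  // With a frame-terminating pseudo-encoding, the rectangle count in the
  // FramebufferUpdate header no longer marks the end of a frame; the
  // server can keep appending rectangles (and interleave other messages,
  // e.g. fences) until it sends the marker rectangle.
  while (true) {
    Rect r;
    int encoding;

    readRectHeader(&r, &encoding);

    if (encoding == pseudoEncodingEndOfFrame) {
      endOfFrame();                              // frame complete, flush to screen
      break;
    }

    readRectData(r, encoding);
  }
}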


One slight implementation problem that needs to be solved is the timestamps of the interleaved fence messages. They will most likely all be the same, even though the messages are only gradually pushed onto the pipe, giving an artificially inflated RTT.
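One possible way around that, purely as a sketch with invented helper names: have the server keep its own record of when each fence actually hits the wire, flushing whatever is already queued before stamping and writing the fence, so the recorded time is close to when its bytes are pushed onto the pipe.

#include <list>
#include <stdint.h>
#include <sys/time.h>

// Assumed helpers for the output stream and fence encoding.
void flushOutput();
void writeFenceMessage(uint32_t id);

struct PendingFence {
  uint32_t id;
  struct timeval sentTime;   // stamped just before the fence hits the wire
};

static std::list<PendingFence> pendingFences;

void sendFence(uint32_t id)
{
  // Push out everything queued so far, so the fence isn't stamped while
  // still sitting behind a large framebuffer update.
  flushOutput();

  PendingFence f;
  f.id = id;
  gettimeofday(&f.sentTime, NULL);
  pendingFences.push_back(f);

  writeFenceMessage(id);
  flushOutput();
}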
Comment 1 Pierre Ossman cendio 2013-07-01 11:43:37 CEST
We probably also need to solve bug 4734 for this to have the desired effect.
Comment 2 Pierre Ossman cendio 2014-12-03 19:33:58 CET
Thinking about this a bit more, splitting up the updates is probably not the answer, or at least not the whole answer.

In a "normal" congestion control system, you get lots and lots of feedback. TCP gets more or less an ACK every 1500 bytes. I don't think we can split things up that much without getting clobbered by the overhead.

A better alternative is to improve the congestion control handling:

Instead of the fire-and-forget approach we use now, we keep track of every congestion marker we send out. When we get one back, we can look at the next expected marker (that's still on the wire), compare the timestamps and data counters, and make a qualified guess as to how much data has already been consumed. That way we only need two markers per RTT.

Another upside of this is that we don't need to fill the markers with a lot of data, reducing their overhead to just a few bytes.
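A minimal sketch of that bookkeeping, assuming markers carry no payload and come back in order. All names and the exact estimation formula are illustrative, not the code from the branch/PR referenced below:

#include <list>
#include <sys/time.h>

struct Marker {
  struct timeval sentTime;  // when the marker was pushed onto the pipe
  unsigned pos;             // total bytes written at that point
};

class Congestion {
public:
  Congestion() : lastAcked(), haveAck(false) {}

  // A marker has just been written to the socket.
  void sentMarker(unsigned pos) {
    Marker m;
    gettimeofday(&m.sentTime, NULL);
    m.pos = pos;
    outstanding.push_back(m);
  }

  // The client echoed back the oldest outstanding marker.
  void gotMarkerResponse() {
    if (outstanding.empty())
      return;
    lastAcked = outstanding.front();
    gettimeofday(&ackTime, NULL);
    haveAck = true;
    outstanding.pop_front();
  }

  // Qualified guess of how many bytes are still unconsumed on the wire,
  // given the current write position. Between responses we interpolate
  // towards the next outstanding marker, assuming its response arrives
  // roughly the same interval after the previous response as it was
  // sent after the previous marker.
  unsigned getInFlight(unsigned curPos) {
    if (!haveAck)
      return curPos;

    unsigned consumed = lastAcked.pos;

    if (!outstanding.empty()) {
      const Marker& next = outstanding.front();
      struct timeval now;
      gettimeofday(&now, NULL);

      double gap = diff(lastAcked.sentTime, next.sentTime);
      double elapsed = diff(ackTime, now);

      if (gap > 0.0) {
        double frac = elapsed / gap;
        if (frac > 1.0)
          frac = 1.0;
        consumed += (unsigned)(frac * (next.pos - lastAcked.pos));
      }
    }

    return curPos > consumed ? curPos - consumed : 0;
  }

private:
  static double diff(const struct timeval& a, const struct timeval& b) {
    return (b.tv_sec - a.tv_sec) + (b.tv_usec - a.tv_usec) / 1e6;
  }

  std::list<Marker> outstanding;
  Marker lastAcked;
  struct timeval ackTime;
  bool haveAck;
};

The key point is that the in-flight estimate is interpolated between the last acknowledged marker and the next one still on the wire, which is why two markers per RTT are enough.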
Comment 3 Pierre Ossman cendio 2016-10-10 15:53:45 CEST
An improved version of the congestion control system is available here:

https://github.com/CendioOssman/tigervnc/tree/congestion

Works well for the obvious cases, but needs more testing to make sure there aren't any corner cases that misbehave.
Comment 4 Pierre Ossman cendio 2017-03-31 11:03:59 CEST
There's a PR now which will hopefully get some testing:

https://github.com/TigerVNC/tigervnc/pull/442
Comment 5 Pierre Ossman cendio 2017-03-31 11:08:30 CEST
Created attachment 782 [details]
Measurement for current code

Test run with the existing code and dump of relevant stats:

 - Throttled to 1 Mbps and 1500 ms RTT (BDP 183 KiB)
 - Started glxgears in fullscreen on a 1024x768 screen
 - Paused it for a long period (to trigger a close of the congestion window)
 - Paused it for a short period

The columns show our computed congestion window and computed in-flight data, alongside the same values as reported by the underlying socket.
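For reference, the quoted bandwidth-delay product follows directly from the throttling parameters (a back-of-the-envelope check, not a figure taken from the attached data):

 1 Mbit/s x 1.5 s = 1,500,000 bits = 187,500 bytes ≈ 183 KiB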
Comment 6 Pierre Ossman cendio 2017-03-31 11:09:04 CEST
Created attachment 783 [details]
Measurement for new code

Same measurement with the suggested changes.
Comment 7 Pierre Ossman cendio 2017-03-31 11:09:49 CEST
Created attachment 784 [details]
Gnuplot script

Visualise the data using:

 $ gnuplot -c plot.gp before.csv
Comment 9 Pierre Ossman cendio 2018-09-24 16:28:08 CEST
The above files show the technical details, and how the internal calculations are better.

But from a user's point of view the differences are not that noticeable. When the latency is low, things are the same as before (as the link isn't congested). If the bandwidth is insufficient we also get the same behaviour, although the old code relied on the fallback hack to achieve it. It is a small win that we now get the same result without hacks.

However, when there is sufficient bandwidth but high latency, the new code shows an improvement. It is able to reach the maximum bandwidth much faster now. Mostly you see this when some kind of animation starts, but you can also see it in the middle of videos if there is a prolonged "fade to black". At that point the frame rate drops, but the new code picks up speed again much faster.
Comment 10 Pierre Ossman cendio 2018-09-24 16:29:40 CEST
> * Performance should be equal, or better, compared to an old server for all network conditions

I could not find a scenario where it was worse.

> * Frame rate (and bandwidth) should reach the maximum faster compared to the old server (mostly noticeable with high latency)

I need 100+ ms of latency to see this, but it does indeed happen. It becomes more noticeable the higher the latency is.
