Wednesday, May 23, 2012

SPDY/3 Flow Control Comparisons


Last night, I compared the Chrome and Firefox spdy/3 implementations. The result was inconclusive because I was using a sample page too far removed from the norm. Tonight, I revisit that comparison.

I again start with an artificial round trip time (RTT) of 100ms:
➜  spdybook-site git:(master) ✗ sudo tc qdisc add dev lo root netem delay 50ms        
[sudo] password for chris: 
➜  spdybook-site git:(master) ✗ ping localhost
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=1 ttl=64 time=100 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=2 ttl=64 time=100 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=3 ttl=64 time=100 ms
...
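The 50ms delay applies in each direction over the loopback interface, which is why the round trip comes out at 100ms. It is not quite realistic, but it is close enough and has the virtue of making the math easier.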

As I did last night, I use a web page that includes roughly 50 smallish images and other resources (85 is typical):


For each trial, I stop the node-spdy (spdy-v3 branch) server and close the browser before making a connection. Ideally this will make for consistently cold TCP/IP pipes over which the SPDY connections will be made.

I run a regular Chrome connection over spdy/3:


Then a Chrome connection over spdy/3 with spdy server push:


And, lastly, Firefox over vanilla spdy/3 (Firefox does not support push):


I had expected to find that the Firefox connection is faster than the vanilla Chrome connection. New in spdy/3 is flow control, which I thought would make the difference. But upon reflection, flow control should not enter into this: none of the images are large enough to threaten SPDY's default receive window (64KB). In other words, flow control should never kick in.

So what is the difference? I suspect that Firefox is requesting more resources initially, pushing the TCP/IP CWND up faster. This would seem to be borne out by the first large packets appearing at the 0.7 second mark for Firefox and at nearly 1.0 seconds for Chrome. Last night's Speed Tracer diagram would also suggest that Chrome is processing the <head> items before requesting the images from the <body>:


Regardless, I had wanted to compare more realistic spdy/3 implementations. For that, it seems that I need to add a couple of images large enough to exceed the default receive window, so that flow control actually comes into play. I add them to the bottom of the <body> to get them loading last:


Now I repeat the trials for Chrome with vanilla spdy/3:


Chrome with spdy/3 and spdy server push:


And, lastly, Firefox over spdy/3:


This is what I had expected: Firefox trounces Chrome's flow control implementation. With a SPDY receive window of 256MB, Firefox effectively eschews spdy/3 flow control. The SPDY server is free to jam as much data as it wants on the wire. In this case, it completely fills the TCP/IP receive window (the top lines in the graphs).

With Chrome, on the other hand, the SPDY server is forced to wait for WINDOW_UPDATE frames before it can send back the next batch of data. Since Chrome never updates the SPDY receive window by more than 32KB, the server is unable to send enough data to push the bounds of the TCP/IP receive window.
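To make those stalls concrete, here is a rough sketch of the waiting involved. It assumes a deliberately simplified model in which the server sends until the window is exhausted and then idles one full round trip for each WINDOW_UPDATE. The function names, the model, and the 1MB example image are my own illustration, not measurements from these trials:

```python
import math

DEFAULT_WINDOW = 64 * 1024   # SPDY/3 default receive window, in bytes
CHROME_UPDATE = 32 * 1024    # largest WINDOW_UPDATE seen from Chrome, in bytes


def stalled_round_trips(resource_bytes, initial_window=DEFAULT_WINDOW,
                        update_bytes=CHROME_UPDATE):
    """Round trips the sender idles waiting on WINDOW_UPDATE (simplified model)."""
    remaining = resource_bytes - initial_window
    if remaining <= 0:
        # The whole resource fits in the initial window: no stall at all.
        return 0
    # Assumption: each stall is answered by one update of at most update_bytes.
    return math.ceil(remaining / update_bytes)


def stall_seconds(resource_bytes, rtt_ms=100, **kwargs):
    """Idle time in seconds for a given round trip time in milliseconds."""
    return stalled_round_trips(resource_bytes, **kwargs) * rtt_ms / 1000


if __name__ == "__main__":
    # A small image versus a hypothetical 1MB image.
    for size in (20 * 1024, 1024 * 1024):
        print(size, stalled_round_trips(size), stall_seconds(size))
```

Under this model, anything at or below 64KB never waits, which matches the earlier trials in which flow control never engaged. Passing `initial_window=256 * 1024 * 1024` approximates the Firefox case, where the same 1MB image would not stall either.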

The result is stark. Firefox completes the transfer in less than 2 seconds, whereas Chrome takes twice as long. The SPDY push trial is even worse because of a pause to transfer non-pushed resources (I am not pushing JavaScript and CSS).

Chrome's current implementation seems clearly sub-optimal. In spdy/2 (and in vanilla HTTP), Chrome is more than capable of handling large TCP/IP receive windows. There is no reason for Chrome to keep sending WINDOW_UPDATE frames of only 32KB. Instead, Chrome should be asking for close to the TCP/IP receive window's worth of data.
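A quick back-of-the-envelope calculation shows why the update size matters. A sender limited by a receive window can have at most one window of unacknowledged data in flight per round trip, so per-stream throughput is capped at roughly window / RTT. This is only a sketch under that assumption. The function name is mine, and TCP effects are ignored:

```python
def window_limited_throughput(window_bytes, rtt_ms):
    """Upper bound, in bytes per second, for one flow-controlled stream."""
    # At most one full window can be delivered per round trip.
    return window_bytes * 1000 / rtt_ms


if __name__ == "__main__":
    rtt_ms = 100  # the artificial round trip time used in these trials
    windows = (
        ("32KB", 32 * 1024),
        ("64KB", 64 * 1024),
        ("256MB", 256 * 1024 * 1024),
    )
    for label, window in windows:
        rate = window_limited_throughput(window, rtt_ms)
        print(f"{label:>6} window: at most {rate / 1024:,.0f} KB/s")
```

At a 100ms RTT, a 32KB window caps a stream at 320 KB/s, while a 256MB window puts the cap far beyond anything the connection could deliver.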

That said, from the server's point of view, this implementation might be regarded as beneficial. The server is able to jam the smallish resources back to the browser in short order. The larger items, which the user would likely not expect to render immediately anyway, can be sent back at a more leisurely pace—freeing up the server to respond to other SPDY sessions.

Regardless, it is not the responsibility of the client to throttle the server's response like this. The server is more than capable of doing it on its own. So I expect that Chrome's WINDOW_UPDATE algorithm will get a little more sophisticated in the near future.


Day #395

1 comment:

  1. To be fair to chrome, you have an absurdly high bdp path there because of the ~infinite bandwidth of localhost combined with the injected delay. That is going to exaggerate the effect.

    otoh these kinds of idle round trips you're seeing to fix the window are exactly the kind of phenomenon spdy/2 eliminated through parallelism that make it work so well. So there is irony in spdy/3 introducing new ones:)

    you will eventually see firefox make use of flow control in conjunction with server push and probably eventually with media streams. it certainly has niche uses on the client side and I'm glad to see it be part of the protocol - I think you're just seeing how easy it is to configure it in ways that slow things down in unexpected places. It needs some tweaking as a protocol element to make it more robust.
