Sunday, November 24, 2013

To Nagle or Not

Some TCP properties Java programmers have control over, others they don't. One optimisation is in Socket.setTcpNoDelay, which is to do with Nagle's algorithm (note: calling setTcpNoDelay with true turns the algorithm off). Basically, this tells the OS to batch your packets.

When you should turn it on or off depends very much on what you are trying to do [1]. Jetty sets no delay to true (that is, turns the algorithm off):

Phillips-MacBook-Air:jetty-all-8.1.9.v20130131 phenry$ grep -r setTcpNoDelay .
./org/eclipse/jetty/client/SelectConnector.java:            channel.socket().setTcpNoDelay(true);
./org/eclipse/jetty/client/SocketConnector.java:        socket.setTcpNoDelay(true);
./org/eclipse/jetty/server/AbstractConnector.java:            socket.setTcpNoDelay(true);
./org/eclipse/jetty/server/handler/ConnectHandler.java:            channel.socket().setTcpNoDelay(true);
./org/eclipse/jetty/websocket/WebSocketClient.java:        channel.socket().setTcpNoDelay(true);

Playing around with my own simple server, I experimented with this value. I set up a single thread on a 16-core Linux box using Java NIO that services requests sent from 2 MacBooks that each had 100 threads using normal, blocking IO and sending 10 010 bytes of data (the server replies with a mere 2 bytes).

Setting the algorithm on or off on the server made no discernible difference. Not surprising as 2 bytes are (probably) going to travel in the same packet. But calling socket.setTcpNoDelay(false) on the clients showed a marked improvement. Using a MacBook Pro (2.66GHz, Intel Core 2 Duo) as the client, the results looked like:

socket.setTcpNoDelay(true)

Mean calls/second:      3884
Standard Deviation:     283
Average call time (ms): 29

socket.setTcpNoDelay(false)

Mean calls/second:      5060
Standard Deviation:     75
Average call time (ms): 20

Your mileage my vary.

The big difference was the time it took to call SocketChannel.connect(...). This dropped from 20 to 13 ms.

As an aside, you can see Linux's network buffers filling up with something like:

[henryp@corsair ~]$ cat /proc/net/tcp | grep -i 22b8 # where 22b8 is port 8888 on which I am listening
   2: 5E01A8C0:22B8 00000000:0000 0A 00000000:00000012 02:0000000D 00000000  1000        0 1786995 2 ffff880f999f4600 99 0 0 10 -1                   
   3: 5E01A8C0:22B8 4101A8C0:F475 03 00000000:00000000 01:00000062 00000000  1000        0 0 2 ffff880f8b4fce80                                      
   4: 5E01A8C0:22B8 4101A8C0:F476 03 00000000:00000000 01:00000062 00000000  1000        0 0 2 ffff880f8b4fcf00 
.
.
  76: 5E01A8C0:22B8 5B01A8C0:D3A5 01 00000000:0000171A 00:00000000 00000000  1000        0 4035262 1 ffff880fc3ff8700 20 3 12 10 -1                  
  77: 5E01A8C0:22B8 5B01A8C0:D3AD 01 00000000:0000271A 00:00000000 00000000     0        0 0 1 ffff880fc3ffb100 20 3 12 10 -1                        
  78: 5E01A8C0:22B8 5B01A8C0:D3B4 01 00000000:00000800 00:00000000 00000000     0        0 0 1 ffff880e1216bf00 20 3 12 10 -1                        
  79: 5E01A8C0:22B8 5B01A8C0:D3A8 01 00000000:0000271A 00:00000000 00000000     0        0 0 1 ffff880fc3ff9500 20 3 12 10 -1                        
  80: 5E01A8C0:22B8 5B01A8C0:D3AC 01 00000000:0000271A 00:00000000 00000000     0        0 0 1 ffff880fc3ffe200 20 3 12 10 -1                        
  81: 5E01A8C0:22B8 5B01A8C0:D3B3 01 00000000:0000271A 00:00000000 00000000     0        0 0 1 ffff880e12169c00 20 3 12 10 -1                        
  82: 5E01A8C0:22B8 4101A8C0:F118 01 00000000:00000000 00:00000000 00000000  1000        0 4033066 1 ffff880e1216d400 20 3 0 10 -1 

Note 271A is 10010 - the size of our payload.

[1] ExtraHop blog.

No comments:

Post a Comment