Saturday, October 5, 2013

When is a connection connected?

More production issues with Jetty: this time we were hit by this bug, a deadlock in the web server.

While diagnosing similar bugs, I wondered when a connection is associated with the Java server process. This is not as simple a question as it sounds. The client can make a connection to a server socket but Linux won't recognise the connection as belonging to the process until it is accepted.

Take this code:

SocketChannel channel = ((ServerSocketChannel)key.channel()).accept(); 

Now put a breakpoint on this line, run the server and connect to it like this:

mds@gbl04214[Skye]:~> !telnet 
telnet 128.162.27.126 10110 
Trying 128.162.27.126... 
Connected to 128.162.27.126. 
Escape character is '^]'. 
This is a load of bunkum 


You'll see the breakpoint gets hit.

> netstat -nap 2>/dev/null | grep 10110 
tcp        0      0 128.162.27.126:10110    :::*                    LISTEN      5711/java 
tcp       26      0 128.162.27.126:10110    128.162.27.125:55658    ESTABLISHED - 


(Note how netstat's Recv-Q column reports 26 bytes in the buffer ready to be read: my 24-character message "This is a load of bunkum" plus the CRLF that telnet appends.)

Step over the breakpoint (i.e., we've now called accept) and we see:

> netstat -nap 2>/dev/null | grep 10110 
tcp        0      0 128.162.27.126:10110    :::*                    LISTEN      5711/java 
tcp       26      0 128.162.27.126:10110    128.162.27.125:55658    ESTABLISHED 5711/java 


Now the OS associates the connection with our process.

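So, if your server goes mad and stops accepting connections, netstat may make it seem as if the connections backing up are not making it to your process. This is wrong: the kernel has already completed the TCP handshake on the listening socket's behalf and is buffering whatever the clients send. The connections are simply queued, waiting for the process to call accept.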
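The same behaviour can be reproduced without a debugger. Below is a minimal sketch, assuming a loopback address and an OS-assigned port rather than the 128.162.27.126:10110 used above; the class and method names (AcceptLater, sendBeforeAccept) are made up for illustration. The client connects and writes its message before the server ever calls accept, and the server still reads the message once it gets around to accepting.

```java
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original server.
class AcceptLater {

    /** Connects and sends a message before accept() is called, then accepts and reads it back. */
    static String sendBeforeAccept(String message) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            byte[] payload = message.getBytes(StandardCharsets.US_ASCII);
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), port)) {
                // The connect has already succeeded here, although nobody has called accept().
                OutputStream out = client.getOutputStream();
                out.write(payload);
                out.flush();

                // Only now does the server process pick the connection off the queue.
                try (SocketChannel channel = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(payload.length);
                    while (buf.hasRemaining() && channel.read(buf) != -1) {
                        // keep reading until the whole message has arrived
                    }
                    return new String(buf.array(), 0, buf.position(), StandardCharsets.US_ASCII);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendBeforeAccept("This is a load of bunkum"));
    }
}
```

Running netstat between the write and the accept should show the same ownerless ESTABLISHED line as before, though the sketch itself does not check that.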

As an addendum, if you do the same thing with netcat rather than telnet, you'll notice that the socket state is CLOSE_WAIT rather than ESTABLISHED. Netcat has apparently terminated and sent a FIN packet, but the socket won't transition out of this state until the application closes the connection.

> echo this is rubbish | netcat  128.162.27.126 10110 

> netstat -nap 2>/dev/null | grep 10110 
tcp        0      0 128.162.27.126:10110    :::*                    LISTEN      5711/java 
tcp       17      0 128.162.27.126:10110    128.162.27.125:49854    CLOSE_WAIT  - 

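Still, there is data to read even if the client no longer exists. (Recv-Q shows 17 rather than the 16 bytes of the message, presumably because Linux also counts the FIN, which consumes a sequence number.)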
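To convince yourself of this, here is a rough sketch in the spirit of the netcat experiment, again assuming loopback and an OS-assigned port, with made-up names (ReadAfterClientClose, readFromDepartedClient). The client connects, writes and closes before the server accepts; the server then accepts, drains the buffered bytes and only afterwards sees end-of-stream.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original server.
class ReadAfterClientClose {

    /** Lets the client write and close first, then accepts and reads what it left behind. */
    static String readFromDepartedClient(String message) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            // Like `echo ... | netcat`: connect, write, and close before the server accepts.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), port)) {
                client.getOutputStream().write(message.getBytes(StandardCharsets.US_ASCII));
            }

            StringBuilder received = new StringBuilder();
            try (SocketChannel channel = server.accept()) {
                ByteBuffer buf = ByteBuffer.allocate(64);
                // read() returns -1 once the buffered data is drained and the FIN is reached.
                while (channel.read(buf) != -1) {
                    buf.flip();
                    received.append(StandardCharsets.US_ASCII.decode(buf));
                    buf.clear();
                }
            } // closing our end is what lets the socket leave CLOSE_WAIT
            return received.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(readFromDepartedClient("this is rubbish\n"));
    }
}
```

The -1 from read is the application's first hint that the peer has gone; until the channel is closed, the socket stays in CLOSE_WAIT.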
