Sunday, April 29, 2012

Freaky Leaks

In my last post, I described problems with sending ICMP packets in Java. As the lesser of the evils, we instead spawned a new process that used the command line ping. We did this using java.lang.ProcessBuilder. But when we had to fire off hundreds of requests per second, we ran into problems.

Firstly, on a Sun box it was slow. Firing up a new process was expensive because (so I am told) SunOS must set up the memory for ping from scratch every time.

[Apparently, you can hope to improve this by setting the sticky bit on the file (eg, with chmod 1755 /tmp/test.sh). The sticky bit is "one of the status flags on a file that tells UNIX to load a copy of the file into the page file the first time it is executed. This is done for programs that are commonly used so the bytes are available quickly" (Unix Unleashed p1270). This is quite an old book and the Linux chmod man pages say "on some older systems, the bit saves the program’s text image on the swap device so it will load more quickly when run".]

Anyway, it turned out that our software was to run on Linux, which was much faster at repeatedly starting new processes. However, under heavy load (but non-deterministically) we started seeing java.io.IOExceptions saying Bad file descriptor and Too many open files.

Sure enough, running lsof -p PID showed lots of open pipes for our Java process. Hitting the Garbage Collection button on JConsole seemed to help but didn't solve the mystery.

We were draining the stream from the Process (as outlined in a previous blog post) using a Stream Gobbler, and we did close it after use. By trial and error, we looked at the other streams associated with this Process: the error and output streams. Although we never asked the Process for references to them, we discovered that these need closing too. Why, we have no idea.
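Here is a minimal sketch of that fix, assuming a Unix-like box. The names ProcessStreams, runAndClose and closeQuietly are made up for illustration, and redirectErrorStream(true) stands in for our Stream Gobbler so that a single reader can drain the child without blocking. The part that matters is the finally block, which closes all three streams whether or not we ever read from them.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;

public class ProcessStreams {

    /** Runs a command, returns what it printed, and closes all three pipes. */
    public static String runAndClose(String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Assumption for this sketch: merge stderr into stdout so one
        // reader can drain everything the child writes.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream()));
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
            process.waitFor();
        } finally {
            // Close every stream, even the ones we never used.
            closeQuietly(process.getInputStream());
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getOutputStream());
        }
        return output.toString();
    }

    private static void closeQuietly(Closeable stream) {
        try {
            stream.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close() itself fails.
        }
    }
}
```

Something like ProcessStreams.runAndClose("ping", "-c", "1", host) would then be the per-request call, although the ping flags differ between platforms, so treat those arguments as an assumption too.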

[The reason forcing Garbage Collection helped was that the particular instance of the abstract Process (UNIXProcess) for our environment creates a java.io.FileInputStream that has a finalize() method that closes an associated Channel (if there is one) as well as calling the native method to close the stream.]

This is one of those conditions where increasing the memory actually makes things worse: a bigger heap means fewer collections. Without the garbage collector reaping these references, the process eventually hits its limit for open files (see /proc/PID/limits).
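If you want to watch for this from inside the JVM rather than with lsof, something like the following sketch might do on Linux. It counts the entries in /proc/self/fd and reads the soft limit from the "Max open files" row of /proc/self/limits. The class and method names (FdWatch, openCount, softLimit) are hypothetical, and both methods return -1 where /proc is not available.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class FdWatch {

    private static final String LIMIT_ROW = "Max open files";

    /** Number of descriptors this JVM has open, or -1 if /proc is unavailable. */
    public static int openCount() {
        // Listing the directory briefly opens a descriptor of its own,
        // so the figure may be one higher than the steady state.
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    /** Soft limit on open files, or -1 if it is unlimited or cannot be read. */
    public static long softLimit() {
        try {
            for (String line : Files.readAllLines(
                    Paths.get("/proc/self/limits"), StandardCharsets.UTF_8)) {
                if (line.startsWith(LIMIT_ROW)) {
                    // The row reads: name, soft limit, hard limit, units.
                    String[] fields =
                            line.substring(LIMIT_ROW.length()).trim().split("\\s+");
                    return "unlimited".equals(fields[0])
                            ? -1 : Long.parseLong(fields[0]);
                }
            }
        } catch (IOException | RuntimeException e) {
            // Not Linux, or the file was not in the expected format.
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("open=" + openCount() + " limit=" + softLimit());
    }
}
```

Logging these two numbers every so often under load should show the count creeping towards the limit long before the exceptions start.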



