Sunday, January 9, 2011

JVM optimization

Want a quick and easy way to make your Java apps run faster? Upgrade your JVM.

Last year, I was using JDK 1.6.0_05 to run our app. I monitored it with YourKit and saw my application threads being blocked by the Reference Handler thread on the monitor of class java.lang.ref.Reference$Lock.

Upon upgrading to JDK 1.6.0_20, there was none of this particular contention. What's more, the app was running about 25% faster without me doing anything!

So, what was causing this contention?

In one sample, the JVM ran for less than 9 minutes and was only doing something “interesting” for about 2 of those yet the total time spent contending for this lock was about 111 seconds. This time was spread across roughly 50-60 threads and YourKit was saying that there are about 50 different methods that appear to be attempting to attain the lock. Of the handful of methods I looked at, I couldn’t see anything in the code that was obviously suspicious. I mean, it was literally blocking in a simple get method!

Let's see what happens when the JDK blocks. Let's take this code:



package com.henryp.lang;

public class ThreadWaitingMain {

public static void main(String[] args) {
final Object lock = new Object();
System.out.println("\nlock.hashCode = " + Integer.toHexString(lock.hashCode()) + "(" + lock.hashCode() + ")"); //

Thread firstThread = new Thread(createLockContender(lock));
firstThread.setDaemon(false);
firstThread.setName("Thread no. 1");
firstThread.start();


Thread otherThread = new Thread(createLockContender(lock));
otherThread.setDaemon(false);
otherThread.setName("Thread no. 2");
otherThread.start();

System.out.println("Main thread finishing");
}

private static Runnable createLockContender(final Object lock) {
return new Runnable() {

@Override
public void run() {
while (true) {
synchronized (lock) {
try {

System.out.println(Thread.currentThread().getId() + ":" + Thread.currentThread().getName() + ": Have lock. About to sleep.");
Thread.currentThread().sleep(1000L);
lock.notify();
System.out.println(Thread.currentThread().getId() + ":" + Thread.currentThread().getName() + ": Relinquishing lock...");
lock.wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}

};
}

}



Let's run it.

lock.hashCode = 1172e08(18296328)
Main thread finishing

.
.
.


Which PID is it?

[henryp@vmwareFedoraII Test]$ ps aux | grep java
henryp 12148 0.9 0.9 399372 10028 pts/6 Sl+ 22:13 0:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12168 0.0 0.0 4204 704 pts/3 S+ 22:14 0:00 grep java


OK, it's 12148.

By the way, running:

[henryp@vmwareFedoraII Test]$ ps -eLf | grep java

gives us not just the processes but all of the threads:

henryp 12148 11567 12148 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12148 11567 12154 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12148 11567 12155 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain

.
.
.

But which ones are ours?

[henryp@vmwareFedoraII Test]$ jstack 12148 | grep "Thread no."
"Thread no. 2" prio=10 tid=0xb6a7d400 nid=0x2f84 waiting on condition [0x9f388000]
"Thread no. 1" prio=10 tid=0xb6a7bc00 nid=0x2f83 in Object.wait() [0x9f3d9000]


Very useful is our old friend jstack. Right, let's convert those hex values into decimal:

[henryp@vmwareFedoraII Test]$ printf "%d\n" 0x2f84
12164
[henryp@vmwareFedoraII Test]$ printf "%d\n" 0x2f83
12163


Now, let's see which Linux kernel commands are being called by these threads:

[henryp@vmwareFedoraII openjdk7]$ strace -p 12163 > ~/strace_12163.txt 2>&1 &
[henryp@vmwareFedoraII openjdk7]$ strace -p 12164 > ~/strace_12164.txt 2>&1 &


Let's tail one of these:

[henryp@vmwareFedoraII ~]$ tail -f strace_12164.txt
clock_gettime(CLOCK_MONOTONIC, {16852, 857324137}) = 0
clock_gettime(CLOCK_MONOTONIC, {16852, 857391406}) = 0
gettimeofday({1294611927, 576256}, NULL) = 0
clock_gettime(CLOCK_REALTIME, {1294611927, 576322753}) = 0
futex(0xb6a7e9a4, FUTEX_WAIT_PRIVATE, 1, {0, 999933247}) = -1 ETIMEDOUT (Connection timed out)
futex(0xb6a7e228, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {16853, 859389474}) = 0
futex(0x9f80090c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x9f800908, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0xb6a7c828, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xb6a7e96c, FUTEX_WAIT_PRIVATE, 689, NULL) = 0
futex(0xb6a7e028, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0xb6a7e028, FUTEX_WAKE_PRIVATE, 1) = 0


And it's this futex - "fast userspace mutex" - that we're interested in. It's how Linux does it's locking.

But that's where the story ends as I am not a Linux guru and need to do more research. So, I'm afraid this is an incomplete post. I wrote it to keep a record of what I am working on in my spare time.

1 comment:

  1. Thread.currentThread().sleep(1000L);
    should be written as
    Thread.sleep(1000L);

    ReplyDelete