Sunday, March 30, 2014

The Many Faces of Traces

Most Java developers are familiar with jstack but if you're running on Linux, you might want to familiarise yourself with pstack too. This is especially useful if you are executing native code.

Let's create some native code we can call via JNI (see my last post if you want to compile it yourself):

#include <stdlib.h>
#include <stdio.h>
#include <jni.h>
#include "com_henryp_jni_MyJni.h"
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

JNIEXPORT jstring JNICALL Java_com_henryp_jni_MyJni_munge
  (JNIEnv* env, jobject obj, jstring str) {
        void *array[20];
        size_t size;
        jstring ret = 0;
        char* msg = "this is a test";

        printf("Java_com_henryp_jni_MyJni_munge called!\n");
        ret = (*env)->NewStringUTF(env, msg);

        printf("Hit return...\n");

        // get void*'s for all entries on the stack
        size = backtrace(array, 20);
        // print out all the frames to stderr
        backtrace_symbols_fd(array, size, STDERR_FILENO);

        return ret;

Here, we enter the C code, wait for a key press and print to stderr the non-Java stack trace by calling backtrace. This outputs:

[phenry@localhost MyTest]$ java -Djava.library.path=/mnt/hgfs/Shared/Code/JNI/MyJni/Debug -cp target/classes/ com.henryp.jni.MyJni
Java_com_henryp_jni_MyJni_munge called!
Hit return...

this is a test

If we allow the thread to tarry a little a little in this native method and run jstack we see our thread is runnable in the Java world even though it is blocking in the real world.

"main" prio=10 tid=0xb7705800 nid=0x3d46 runnable [0xb78a8000]
   java.lang.Thread.State: RUNNABLE
at com.henryp.jni.MyJni.munge(Native Method)
at com.henryp.jni.MyJni.main(

This is consistent with what RUNNABLE means in the Java world, name that it is "in the runnable state is executing in the Java virtual machine but it may be waiting for other resources from the operating system such as processor."

Where pstack is handy is that we don't need to write our own C code to examine the stack when it enters native code. We can take the Java thread's ID (0x3d46) and call pstack on it so:

[phenry@localhost My_AsmLocks]$ printf "%d\n" 0x3d46
[phenry@localhost My_AsmLocks]$ pstack 15686
Thread 1 (process 15686):
#0  0x00ec0424 in __kernel_vsyscall ()
#1  0x467da6db in read () from /lib/
#2  0x4682cbeb in _IO_old_file_underflow () from /lib/
#3  0x4677ed99 in _IO_default_uflow_internal () from /lib/
#4  0x4677ebaf in __uflow () from /lib/
#5  0x46779294 in getchar () from /lib/
#6  0x00a8f58f in myhandler () from /mnt/hgfs/Shared/Code/JNI/MyJni/Debug/
#7  0x00a8f627 in Java_com_henryp_jni_MyJni_munge () from /mnt/hgfs/Shared/Code/JNI/MyJni/Debug/
#8  0xb570a84d in ?? ()
#9  0xb570345a in ?? ()
#10 0xb57003d9 in ?? ()
#11 0x0038dd65 in JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) () from /usr/java/jdk1.7.0_07/jre/lib/i386/client/
#12 0x004b27d9 in os::os_exception_wrapper(void (*)(JavaValue*, methodHandle*, JavaCallArguments*, Thread*), JavaValue*, methodHandle*, JavaCallArguments*, Thread*) () from /usr/java/jdk1.7.0_07/jre/lib/i386/client/
#13 0x0038cb7f in JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*) () from /usr/java/jdk1.7.0_07/jre/lib/i386/client/
#14 0x003a062a in jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) () from /usr/java/jdk1.7.0_07/jre/lib/i386/client/
#15 0x003aae9b in jni_CallStaticVoidMethod () from /usr/java/jdk1.7.0_07/jre/lib/i386/client/
#16 0x00bc9384 in JavaMain () from /usr/java/jdk1.7.0_07/bin/../jre/lib/i386/jli/
#17 0x468a4a2e in start_thread () from /lib/
#18 0x467ea81e in clone () from /lib/

giving us much the same result.


When using jstack, you might have noticed the last line looks something like:

JNI global references: 114

This refers to how many global references still exist that were created in the JNI code [1]. By global I don't mean that were created with methods like this:

    for (i = 0 ; i < 10 ; i++) {
        ret = (*env)->NewStringUTF(env, msg);

This makes no difference to the count. But, replace the line with this [2]:

        jclass cls1 = (*env)->GetObjectClass(env, obj);
        cls = (*env)->NewGlobalRef(env, cls1);

and you quickly see the count rise:

$ watch "jstack `jps | grep MyJni | perl -pe s/\ .*//g` | grep JNI"

This check may be why JNI has a reputation for being slow (but that's perhaps only compared to the optimised code that JIT creates).

[1] Oracle JNI Spec.

JNI, C++ and Name Mangling

The Problem

JNI has been around for a long time but there isn't a huge amount of information out there when things go wrong. You might see a horrible exception like:

[phenry@localhost MyTest]$ java -Djava.library.path=/mnt/hgfs/Shared/Code/JNI/MyJni/Debug -cp target/classes/ com.henryp.jni.MyJni
Exception in thread "main" java.lang.UnsatisfiedLinkError: com.henryp.jni.MyJni.munge(Ljava/lang/String;)Ljava/lang/String;
at com.henryp.jni.MyJni.munge(Native Method)
at com.henryp.jni.MyJni.main(

which doesn't tell you an awful lot about what went wrong. To make matters worse, the problem can be with your C/C++ rather than your Java - making things much harder to debug.

The Code

The Java code is very simple.

package com.henryp.jni;

public class MyJni {
    static {
    private native String munge(String string);

    public static void main(String[] args) {
        MyJni app = new MyJni();
        System.out.println(app.munge("test message"));

Note: on Linux, the actual file name defined in loadLibrary is prepended with lib and appended with .so, so in this example, the JVM is looking for something called


javah -classpath target/classes -d target com.henryp.jni.MyJni

produces a file com_henryp_jni_MyJni.h that looks like:

/* DO NOT EDIT THIS FILE - it is machine generated */
/* Header for class com_henryp_jni_MyJni */

#ifndef _Included_com_henryp_jni_MyJni
#define _Included_com_henryp_jni_MyJni
#ifdef __cplusplus
extern "C" {
 * Class:     com_henryp_jni_MyJni
 * Method:    munge
 * Signature: (Ljava/lang/String;)Ljava/lang/String;
JNIEXPORT jstring JNICALL Java_com_henryp_jni_MyJni_munge
  (JNIEnv *, jobject, jstring);

#ifdef __cplusplus

Writing some C++ code that does nothing but print out what it was given and return a new string seemed straightforward:

#include <iostream>
#include "com_henryp_jni_MyJni.h"
using namespace std;

JNIEXPORT jstring JNICALL Java_com_henryp_jni_MyJni_munge
  (JNIEnv * env, jobject obj, jstring str) {
        printf("Java_com_henryp_jni_MyJni_munge called!\n");
        char* msg = "this is a test";
        return (*env)->NewStringUTF(env, msg);

and compiling it was a breeze:

g++ -I/usr/java/latest/include/linux -I/usr/java/latest/include/ -shared src/MyJni.cpp -o Debug/

(Note the -shared. This is telling it to not produce an executable but to "produce a shared object which can then be linked with other objects to form an executable." [1])

The Solution

So, why did I get the Error at the top of this post? Well, the JVM loads the library without error (it will barf on loadLibrary if there is something wrong with the shared object). So, the problem is with the method. I never see it print out this is a test so it never gets that far. It must be with the method definition itself.

[phenry@localhost MyJni]$ nm Debug/ | grep munge
0000060d t _GLOBAL__sub_I__Z31Java_com_henryp_jni_MyJni_mungePPK19JNINativeInterface_P8_jobjectS4_
00000598 T _Z31Java_com_henryp_jni_MyJni_mungePPK19JNINativeInterface_P8_jobjectS4_

Hmm, the method Java_com_henryp_jni_MyJni_munge doesn't actually exist in my shared object. This is due to C++ using name mangling to allow multiple usage of a method name in different namespaces.

Writing my shared object in C and compiling with:

gcc -I/usr/java/latest/include/linux -I/usr/java/latest/include/ -shared src/MyJni.c -o Debug/

solved the problem. Now, I can see my method.

[phenry@localhost MyJni]$ nm Debug/ | grep munge
00000468 T Java_com_henryp_jni_MyJni_munge

This answer on Stackoverflow shows how to write your code in C++ and link it to a C implementation of the autogenerated JNI header file. The author doesn't say why he prefers to compile JNI in C but this could be why.

Monday, March 17, 2014

Mind Reading a JVM on Linux

I'm currently looking at what appears to be a memory leak in a Java process that uses native Tibco libraries.

To help me see what's going on, I discovered I could read the process's memory in Linux. I could do this on a machine running the 3.6.10 kernel but had trouble on the 2.6.16 kernel.

It appears that the memory of a process is available at:


So, running this program:

package com.phenry.memory;


public class AnalyserMain {

    public static void main(String[] args) {
        try {
        } catch (IOException e) {

    private static void read(String[] args)
        throws FileNotFoundException,
        IOException {
        String              filename            = "/proc/" + args[0] + "/mem";
        RandomAccessFile    randomAccessFile    = new RandomAccessFile(new File(filename), "r");
        byte[]              b                   = new byte[1024 * 64];
        long                offset              = Long.parseLong(args[1], 16);;
        int                 off                 = 0;
       , off, b.length);
        System.out.println("output = " + new String(b));


with the PID of my Java process (2801 in this case) as an argument and an address taken from pmap, so:

[henryp@corsair MyMemAnalyser]$ pmap 2801 | grep anon | head
000000000095a000    132K rw---    [ anon ]
00000006f0000000 120832K rw---    [ anon ]
00000006f7600000  28160K -----    [ anon ]

I could see this:

[henryp@corsair MyMemAnalyser]$ java -classpath bin com.phenry.memory.AnalyserMain 2801 000000000095a000 | strings 
output = 

I've not found the cause of the memory leak yet but this takes me closer.

UPDATE 1: On some systems, you need to execute this as root:

echo 0 > /proc/sys/kernel/yama/ptrace_scope

as they have "hardened" kernels for security reasons (for instance, Ubuntu). Note: you are making your system more vulnerable if you have to do this.

UPDATE 2: The memory leak appeared to be coming from native Tibco libraries. Updating from 8.1 to 8.4 solved the problem.

Memory leaks in a native library?

Everybody should be proficient with examining Java heaps. However, if your Java process uses native libraries (Tibco in our case), diagnosing memory leaks becomes harder.

These are a few notes I've made when trying to ascertain what is going on when our 4g Java heap looks fine but our off-heap usage (as shown by the pmap Linux command) has now approached 12g!

Say we wanted to know what our process was linked to. We can do this with:

[phenry@localhost MemLeaker]$ ldd leaker =>  (0x00bf0000) => /usr/lib/ (0x47007000) => /lib/ (0x46903000) => /lib/ (0x468cc000) => /lib/ (0x46710000)
/lib/ (0x466ee000)

where leaker is an executable I created in C++.

Say, my program creates a thread using pthread_create. Where does this thread live? Well, going through those above files that my executable links, I found the one by doing this:

[phenry@localhost MemLeaker]$ nm -D /usr/lib/ | grep thread | grep create
         w pthread_create
         w pthread_key_create

The nm command lists the symbols from the object files.

Now, say I was to dump the memory of a process (it need not be a JVM but let's say this one is) I would do this:

[henryp@corsair DumpMemory]$ gdb -p YOUR_PID_HERE
(gdb) gcore /hdda1/java.core

(The gdb is the GNU debugger. Think of it as attaching a Java debugger to a JVM).

Note: this file is big. It was 17g for a small Java program so make sure you have disk space.

Now examine it with:

[henryp@corsair DumpMemory]$ gdb --core=/hdd1/java.core /usr/java/jdk1.7.0_51//bin/java
(gdb) bt
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:218
#1  0x00007f4656e4d2f8 in os::PlatformEvent::park(long) () from /usr/java/jdk1.7.0_51/jre/lib/amd64/server/
#2  0x00007f4656e4e044 in os::sleep(Thread*, long, bool) () from /usr/java/jdk1.7.0_51/jre/lib/amd64/server/
#3  0x00007f4656cb6612 in JVM_Sleep () from /usr/java/jdk1.7.0_51/jre/lib/amd64/server/
#4  0x00007f464d012cd8 in ?? ()
#5  0x00007f46566377d0 in ?? ()
#6  0x00007f4656637820 in ?? ()
#7  0x00007f46566377c0 in ?? ()
#8  0x0000000000000000 in ?? ()

Picking a thread at random, we see it's in JVM_Sleep in the libjvm shared object. Let's just check:

[henryp@corsair DumpMemory]$ objdump -t /usr/java/jdk1.7.0_51/jre/lib/amd64/server/ | grep JVM_Sleep
000000000067d2d0 g     F .text 000000000000045c              JVM_Sleep

"The text section of a program is where the program instructions live." [1]

If you want to look for text at these addresses, you need something like this:

(gdb) x/1000c 0x00007fb22c000000
0x7fb22c000000: 32 ' ' 0 '\000' 0 '\000' 44 ',' -78 '\262' 127 '\177' 0 '\000' 0 '\000'
0x7fb22c000008: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0x7fb22c000010: 0 '\000' 16 '\020' 2 '\002' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0x7fb22c000018: 0 '\000' 16 '\020' 2 '\002' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'

The syntax is:

x/nfu a

where n is the number of u units with format f starting at address a.

You can also search the memory for a particular string with:

(gdb) find /b 0x00601000, 0x7FFF175E900, 'P', 'h', 'i', 'l', 'l'

Where the syntax can be found here.

Alternatively, you can dump the memory to a file and then run it through the strings command to make it  easier to read with:

(gdb) dump binary memory FILE_NAME START_ADDRESS END_ADDRESS

More information on core dumps can be found here.

Saturday, March 15, 2014

Setting up an environment to look at Linux source code

Just a quick note to remind myself of what to do when I look at Linux source files. In this case, I wanted the glibc source code. First, find the version you need:

[root@localhost rpmbuild]# rpm -q glibc

Download the associated source file and install it. If you get warnings about the mockbuild user, run:

yum install mock
useradd -s /sbin/nologin mockbuild

What frustrated me for a while was where my files were after I ran rpm -ivh RPM_FILE

First, I ran this to see what files I was playing with:

[root@localhost Shared]# rpm -q -filesbypkg -p glibc-2.14.1-6.src.rpm
glibc                     glibc-2.14.1-fedora.tar.xz
glibc                     glibc-2.14.1.tar.xz
glibc                     glibc-fedora.patch
glibc                     glibc-ia64-lib64.patch
glibc                     glibc-ports-2.14.1-1-g3eb1dbf.tar.xz
glibc                     glibc-rh767696.patch
glibc                     glibc-rh794797.patch
glibc                     glibc.spec

But they didn't seem to have installed anywhere. Finally, I worked out that the files get put:

[root@localhost Shared]# ls -ltra ~/rpmbuild/SOURCES/
total 21972
-rw-rw-r--. 1 mockbuild mockbuild 10120512 Oct  7  2011 glibc-2.14.1.tar.xz
-rw-rw-r--. 1 mockbuild mockbuild   407660 Oct 28  2011 glibc-2.14.1-fedora.tar.xz
-rw-rw-r--. 1 mockbuild mockbuild   496968 Oct 28  2011 glibc-ports-2.14.1-1-g3eb1dbf.tar.xz
drwxr-xr-x. 4 root      root          4096 Dec 25  2011 ..
-rw-r--r--. 1 mockbuild mockbuild     8977 Feb 20  2012 glibc-rh794797.patch
-rw-r--r--. 1 mockbuild mockbuild     2789 Feb 20  2012 glibc-rh767696.patch
-rw-r--r--. 1 mockbuild mockbuild     4294 Feb 20  2012 glibc-ia64-lib64.patch
-rw-r--r--. 1 mockbuild mockbuild   143612 Feb 20  2012 glibc-fedora.patch
drwxr-xr-x. 2 root      root          4096 Mar 15 06:50 .

How frustrating. Follow this link to see how to apply the patches.


Forgot to mention: if you want to step into the code using Eclipse, you'll need to add the debug information. This is how I did it for glibc:

First, get the version information:

[henryp@corsair ~]$ rpm -qa glibc

Then download and install it with:

[henryp@corsair ~]$ sudo debuginfo-install glibc-2.15-59.fc17.x86_64

Next time you debug (I'm using Eclipse IDE for C/C++ Developers, Kepler SR, Build id: 20140224-0627), you can step into glibc code!

Monday, March 10, 2014

Locks and 'Implausible' Behaviour

In a recent bug, we found some dodgy code that, although it didn't hold the lock it needed, should have worked most of the time. That is, it appeared that another thread would have had to have seized the lock and relinquished it thousands of times while the first thread just executing a simple line. This at first seemed implausible to us.

So, we did an experiment. The code below creates 2 threads that contend for a lock. Once they have the lock, they increment a counter and make a kernel call. A third thread gets the data every second and spits it out.

This managing thread executes code like this:

    private final Lock lock = new ReentrantLock();

    private void doTest() {
        Locker locker1 = new Locker(lock);
        Thread thread1 = new Thread(locker1);

        Locker locker2 = new Locker(lock);
        Thread thread2 = new Thread(locker2);



        while (true) {

While the contending threads execute code like this:

class Locker implements Runnable {

    private final Lock lock;

    private long       count;

    private Locker     other;

    private long       lastOther;
    private long       maxDiff;

    private long       diff;

    private long       doNotOptimizeAway = currentTimeMillis();

    public void run() {
        while (true) {

    void increaseCount() {

        if (other.count == lastOther) {
        } else {
            diff = 0; // start again

        maxDiff            = max(diff, maxDiff);
        // swap counts
        lastOther          = other.count;
        other.lastOther    = count;
        // reset the other because I won this time
        other.diff         = 0;
        doNotOptimizeAway ^= currentTimeMillis(); // kernel call

This code was run on a 16-core (using hyperthreading) E5620 (2.4GHz) Linux box running a kernel.


The total of the 2 monotonically incrementing counters for Java 1.6 over five runs was:

74 609 025
73 729 561
73 284 303
77 254 600
78 838 417

whereas Java 1.7 yields

92 043 050
93 529 925
92 779 311
92 587 899
90 993 539

which indicates that Java 1.7 handles locks with greater efficiency as more work was done.


The difference between the two threads' counters in Java 1.6 is:


(in units of 10 000).

For Java 1.7 it was:


which suggests Java 1.7 is less fair when arbitrating lock contention.


One of the things I tried to measure is how many times a lock can be grabbed in a row before the other contending thread seizes it back. The output looks like:

maxDiff = 6097
maxDiff = 6617
maxDiff = 5366
maxDiff = 7593
maxDiff = 10630
maxDiff = 5318
maxDiff = 7593
maxDiff = 11025
maxDiff = 31317
maxDiff = 10273

which suggests a lock can be held for thousands of consecutive calls before the other thread had a look-in.


Implausible as it sounds, race conditions that may look staggeringly unlikely in your code can be quite common as the JVM 'batches' lock operations to make the overall performance much more efficient.

Sunday, March 9, 2014

Scala Crib Sheet #1


"Here is a function that identifies the .scala files and stores them in an array:

def scalaFiles = 
    for {
        file <- filesHere
        if file.getName.endsWith(".scala")
    } yield file

"Each time the body of the for expression executes it produces one value, in this case simply file. When the for expression completes, the result will include all of the yielded values contained in a single collection. The type of the resulting collection is based on the kind of collections processed in the iteration clauses. In this case the result is an Array[File], because filesHere is an array and the type of the yielded expression is File." [1]


"When you write a pattern match, you need to make sure you have covered all of the possible cases. Sometimes you can do this by adding a default case at the end of the match, but that only applies if there is a sensible default behaviour... The alternative is to make the superclass of your classes sealed. A sealed class cannot have any new subclasses added except the ones in the same file."

Repeated Parameters (eg, def echo(args: String*)) "Scala allows you to indicate that the last parameter to a function may be repeated. This allows clients to pass variable length argument lists to the function". Appears to be equivalent to ellipses in Java.

Covariance and Contravariance 

"Whenever you write the function A=>B, Scala expands this to Function1[A,B]... the Function1 trait is contravariant in the function argument type S and covariant in the result type T as show:

trait Function1[-S, +T] {
  def apply(x: S): T

See here for an explanation of how contravariance and covariance work when overriding methods in Java.


"The compiler first tries to compile it as is, bit it sees a type error. Before giving up, it looks for an implicit conversion that can repair the problem.. it tries the conversion method , sees that it works, and moves on...

"Only definitions marked implicit are available.

"An inserted implicit conversion must be in scope as a single identifier, or be associated with the source or target type of the conversion.

"Only one implicit is tried... For sanity's sake, the compiler does not insert further implicit conversions when it is already in the middle of trying another implicit... It's possible to circumvent this restriction.

"Whenever code type checks as it is written, no implicits are attempted. The compiler will not change code that already works.

self => 

"Sometimes it’s useful for a trait to be able to use the fields or methods of a class it is mixed in to, this can be done by specifying a self type for the trait. A self type can be specified for a class or a trait as follows:

trait SpellChecker { self =>

"... The syntax can be extended to specify a lower-bounds ...

trait SpellChecker { self: RandomAccessSeq[char] =>

"The compiler will check that any class in a hierarchy including SpellChecker is or extends RandomAccessSeq[char], so SpellChecker can now use the fields or methods of RandomAccessSeq[char]." [2]


Using the class version of primitives can be expensive in JVMs so Scala allows this annotation to make things more efficient. It does this by generating more than one class - one for the Object implementation and one for the primitive [3].

This was added in Scala 2.8 and isn't to be found in Odersky's Programming in Scala book.

[1] Programming in Scala - Odersky
[2] Scala: Traits and Self Types - Mark Thomas
[3] Specializing for Primitive Types - Scala Notes