Saturday, February 1, 2014

TLABs

I sometimes peruse the hotspot-gc-dev mailing list for grins and saw this post on the G1 garbage collector:

"Unlike the other collectors G1 did not use dynamic sizes for the TLABs that the Java threads use for allocation. That means that the TLABs when using G1 were all fixed size and very small. This patch adds dynamic sizing of TLABs to G1 too..."

What are these mysterious TLABs?

"A Thread Local Allocation Buffer (TLAB) is a region of Eden that is used for allocation by a single thread.  It enables a thread to do object allocation using thread local top and limit pointers, which is faster than doing an atomic operation on a top pointer that is shared across threads." [1]

Or, to put it another way:

"In HotSpot, almost all application allocation is performed in the eden space of the young generation, using contiguous (aka "pointer-bumping") allocation. That is, there's a current pointer and an end address, and as long the next allocation fits we return the current pointer value and increment it by the allocation size.

"But on a multiprocessor, this isn't safe: two threads could both read the current alloc pointer value, and both update, and think they were using the same memory for different objects. So some form of synchronization is necessary. We could just take a lock, but we can do better by using an atomic hardware instruction, a compare-and-swap, or CAS (in the SPARC architecture; the same thing is called compare-and-exchange on x86). So in our race scenario, one thread's CAS succeeds and the other fails; the latter retries.

"This still has two problems: atomic hardware operations are expensive on most architectures, and on machines with many processors this could be a source of cache contention, making them even more expensive.

"So we avoid this with TLABs: each thread allocates a medium-sized chunk and saves it, allocating within it with no synchronization required. Only when its TLAB is used does it go back to allocating from the shared space." [2]

Detlefs points out the trade-off: the bigger the TLAB, the less often a thread must contend for the shared space, but the more of the young generation you will use [2].
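To make the difference concrete, here is a toy model in Java. It is only a sketch of the idea, not HotSpot's code: "addresses" are plain long offsets, and the names SharedEden, Tlab and refills are made up for illustration. The shared allocator does a CAS on every allocation, while the TLAB only touches the shared pointer when it needs a fresh chunk.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of bump-pointer allocation; offsets stand in for real addresses. */
public class TlabSketch {

    /** Shared "eden": every allocation is a CAS on the shared top pointer. */
    static final class SharedEden {
        private final AtomicLong top = new AtomicLong(0);
        private final long end;

        SharedEden(long size) {
            this.end = size;
        }

        /** Returns the start offset of the new block, or -1 if eden is full. */
        long allocate(long size) {
            while (true) {
                long current = top.get();
                long next = current + size;
                if (next > end) {
                    return -1;
                }
                if (top.compareAndSet(current, next)) {
                    return current;
                }
                // CAS failed: another thread moved top first, so retry.
            }
        }
    }

    /** Per-thread buffer: bumps private pointers, refills from eden when full. */
    static final class Tlab {
        private final SharedEden eden;
        private final long tlabSize;
        private long top = 0;
        private long limit = 0;
        long refills = 0; // how often we had to go back to the shared space

        Tlab(SharedEden eden, long tlabSize) {
            this.eden = eden;
            this.tlabSize = tlabSize;
        }

        long allocate(long size) {
            if (size > tlabSize) {
                return eden.allocate(size); // too big for a TLAB, use shared space
            }
            if (top + size > limit) {
                // Current chunk is exhausted (any leftover is simply abandoned).
                long start = eden.allocate(tlabSize);
                if (start < 0) {
                    return -1;
                }
                top = start;
                limit = start + tlabSize;
                refills++;
            }
            long result = top;
            top += size; // no synchronization: only this thread sees top/limit
            return result;
        }
    }

    public static void main(String[] args) {
        SharedEden eden = new SharedEden(1 << 20);
        Tlab tlab = new Tlab(eden, 4096);
        for (int i = 0; i < 1000; i++) {
            tlab.allocate(16);
        }
        System.out.println("allocations=1000 refills=" + tlab.refills);
    }
}
```

With a 4096-byte TLAB and 16-byte objects, 256 objects fit per chunk, so the 1000 allocations above hit the shared pointer only 4 times instead of 1000. A bigger tlabSize would cut that further, at the cost of each thread holding on to more of eden.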

There are JVM flags to tune TLABs (-XX:TLABSize to set the size, -XX:-ResizeTLAB to turn off dynamic resizing) and even to print summaries (-XX:+PrintTLAB) [1].
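If you want to see what a running JVM is actually using for these flags, something like the following should work on a HotSpot JVM. It is a sketch that assumes the com.sun.management.HotSpotDiagnosticMXBean is available; the class name TlabFlags and the method readTlabFlags are my own. PrintTLAB is left out because it may not exist on every JVM version.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Reads the TLAB-related flags of the running JVM (HotSpot only). */
public class TlabFlags {

    static Map<String, String> readTlabFlags() {
        Map<String, String> values = new LinkedHashMap<>();
        for (String name : new String[] {"TLABSize", "ResizeTLAB"}) {
            try {
                HotSpotDiagnosticMXBean bean =
                        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
                values.put(name, bean.getVMOption(name).getValue());
            } catch (RuntimeException e) {
                // Not a HotSpot JVM, or the flag is unknown to this version.
                values.put(name, "unavailable");
            }
        }
        return values;
    }

    public static void main(String[] args) {
        readTlabFlags().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Run it once normally and once with, say, -XX:-ResizeTLAB on the command line to check that the flag really took effect.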

[1] Jon Masamitsu's Weblog.
[2] David Detlefs' Weblog.
