Saturday, November 2, 2013

Grid Testing in a single JVM!

This week, I have been playing with a great, free, open source project called LittleGrid. You can run a whole cluster in one JVM, stopping and starting members with a single method call to emulate failover. This makes running tests as part of a continuous build process very easy and very nice.

It does all this cleverness by having a different class loader for each instance. This can cause some confusion when you see messages that basically say: ClassCastException: cannot cast class YourClass to class YourClass. Huh? Well, of course, a class is defined by its class loader not just its fully qualified name.
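The effect can be reproduced without any grid at all. What follows is a minimal sketch, not LittleGrid's actual mechanism: `ClassLoaderIdentityDemo`, `Payload` and `IsolatingLoader` are hypothetical names invented for the illustration, and the loader assumes the compiled class file is readable as a resource from the parent loader.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIdentityDemo {

    /** Hypothetical stand-in for "YourClass". */
    public static class Payload {
        public Payload() { }
    }

    /** Defines exactly one named class itself; every other class is delegated to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    // Same bytes, same name, but a different defining loader: a different class.
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a Payload instance whose class was defined by a fresh IsolatingLoader. */
    public static Object newIsolatedPayload() {
        try {
            String name = Payload.class.getName();
            ClassLoader isolating = new IsolatingLoader(Payload.class.getClassLoader(), name);
            return isolating.loadClass(name).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if casting the isolated instance to our own Payload throws. */
    public static boolean castFails() {
        Object foreign = newIsolatedPayload();
        try {
            Payload p = (Payload) foreign;
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object foreign = newIsolatedPayload();
        System.out.println("same name:  " + foreign.getClass().getName().equals(Payload.class.getName()));
        System.out.println("same class: " + (foreign.getClass() == Payload.class));
        System.out.println("cast fails: " + castFails());
    }
}
```

The two classes share a fully qualified name but have different defining loaders, so the JVM treats them as unrelated types and the cast is rejected.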

You can get around this by reflectively instantiating a helper class using a cluster member's class loader. This is how we configured a mock Spring framework for all the cluster members.
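As a rough sketch of that workaround, using only the JDK: load the helper by name through the target class loader, create it reflectively, and talk to it only through a JDK interface such as `Callable`, which is the same class for every loader. `ReflectiveHelperDemo`, `ConfigureMocksHelper` and `runIn` are hypothetical names, not our real code, and the demo passes its own class loader where a test would pass a cluster member's loader, however the grid exposes it.

```java
import java.util.concurrent.Callable;

public class ReflectiveHelperDemo {

    /** Hypothetical helper; a real one might install mock Spring beans for one member. */
    public static class ConfigureMocksHelper implements Callable<ClassLoader> {
        @Override
        public ClassLoader call() {
            // Report which loader defined this helper, to show where the code really ran.
            return getClass().getClassLoader();
        }
    }

    /**
     * Loads helperClassName through memberLoader, instantiates it reflectively and runs it.
     * The cast to Callable is safe across loaders because Callable comes from the JDK.
     */
    public static Object runIn(ClassLoader memberLoader, String helperClassName) {
        try {
            Class<?> helperClass = Class.forName(helperClassName, true, memberLoader);
            Object helper = helperClass.getDeclaredConstructor().newInstance();
            return ((Callable<?>) helper).call();
        } catch (Exception e) {
            throw new IllegalStateException("could not run " + helperClassName, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a member's loader: this demo has no grid, so it uses its own loader.
        ClassLoader memberLoader = ReflectiveHelperDemo.class.getClassLoader();
        Object ranIn = runIn(memberLoader, ConfigureMocksHelper.class.getName());
        System.out.println("helper ran in target loader: " + (ranIn == memberLoader));
    }
}
```

Anything passed into or returned from such a helper should itself be a JDK type (or be serialized), otherwise the same ClassCastException comes straight back.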

Since I am relatively new to Coherence, it was gratifying to sanity check some of its features. For instance, in Coherence you can add a map entry using a normal put:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
.
.
.
    NamedCache cache = CacheFactory.getCache(CACHE_NAME);
    cache.put(key, value);

Or you could add something by invoking an entry processor (Coherence's equivalent of a database's stored procedure):

    EntryProcessor entryProcessor = new MyEntryProcessor(key, value); 
    Object         returned       = cache.invoke(key, entryProcessor); 

where my entry processor looks something like this:

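import java.io.Serializable;
import java.util.Map;

import com.tangosol.net.BackingMapManagerContext;
import com.tangosol.util.Binary;
import com.tangosol.util.BinaryEntry;
import com.tangosol.util.InvocableMap.Entry;
import com.tangosol.util.InvocableMap.EntryProcessor;

class MyEntryProcessor implements Serializable, EntryProcessor {
.
.
.
    public Object process(Entry entry) {
        BackingMapManagerContext context = getContext(entry);
        Map myCache = context.getBackingMap(CACHE_NAME);
        // The backing map stores keys and values in Coherence's internal binary form,
        // so convert both before putting them in directly.
        Binary binaryKey   = (Binary) context.getKeyToInternalConverter().convert(myKey);
        Binary binaryValue = (Binary) context.getValueToInternalConverter().convert(myValue);
        myCache.put(binaryKey, binaryValue);
        return null;
    }

    // The cast assumes the entry handed to the processor is a BinaryEntry,
    // which is what exposes the backing map context.
    protected BackingMapManagerContext getContext(Entry entry) {
        BinaryEntry binaryEntry = (BinaryEntry) entry;
        return binaryEntry.getContext();
    }
.
.
.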

By judicious use of breakpoints, I can show that while a thread is executing the entry processor, the put method call is blocked.

This is important in our project as we have code that extends com.tangosol.net.cache.LocalCache and overrides the put method to add some magic sauce. This is a bit nasty as it's not a good separation of concerns, and we're looking at refactoring it out. But there was a concern that the two threads might introduce a race condition. Thankfully, it appears they cannot.

[A cleaner design might have been to use listeners on the cache but in the early days of us using Coherence, the team didn't know which threads executed these listeners.

"A backing map listener ... is nothing more than a class that implements the MapListener interface. [T]hey are executed on the cache service thread which imposes a certain set of requirements and limitations on them.

"For one, just like entry processors, backing map listeners are not allowed to make a re-entrant call into the cache service that they are part of. That means that you cannot access from them any cache that belongs to the same cache service.

"Second, because they are executed synchronously on a cache service thread, it is of paramount importance that you do not do anything time consuming within the event handler... If you need to do anything that might take longer, you need to delegate it to Invocation Service, Work Manager or an external system.

"Finally, because backing map listeners are essentially the same mechanism that is used internally for backup of cache entries, the MapEvent instance they receive are not quite what you would expect and calls to getKey, getOldValue and getNewValue will return values in internal, serialized binary format."

- From Oracle Coherence 3.5].

Testing failover is much easier in LittleGrid:

int memberId = ...
ClusterMember clusterMember = memberGroup.getClusterMember(memberId);
clusterMember.shutdown();

which also gives us an opportunity to see data jumping from the backup store into the LocalCache. By putting a breakpoint on the overridden put method, you can see that this is how a node adds the data it was backing up to its own cache.

One last note: I'm currently working in investment banking and we have the resources to pay for Coherence Enterprise Edition. However, we're quite happy with the free version and have been getting good performance out of it. As a result, the tests we're running in our Continuous Integration environment are pretty much representative of what we can see in prod.
