Monday, November 22, 2010

Using JOTM as a TransactionManager in Neo4j

If you follow the neo4j mailing list you may have noticed that lately I have been working on providing support in Neo for plugging in JTA compliant Transaction Manager implementations. In this post I will try to explain how this was done, how you can enable JOTM support that is provided mainly as an example and what further improvements are possible after completing this project.

Flashback

During my series of posts for the internals of Neo, I had touched upon the whole XA business and how Neo comes with (complete, as it turned out) support for fitting into an XA environment. There I had described the various classes that expose the database store as an XAResource and how the TxManager class enlists it as needed in top level transactions, along with other XAResources, typically ones from the Index store. I will not explain more here, but it is a good idea to have in mind the general framework. The idea was that, since the TxManager implements the javax.transaction.TransactionManager interface, what keeps us from substituting it with a different, third party implementation? More importantly, how can it be done and how robust will the result be? If you want to see how this turned out and like to hear good news, keep reading. Note that the next paragraph is story telling, so if you want the main course, skip it.

Story time

One of the most used JTA compatible implementations of transaction managers out there is the objectweb's JOTM package. I prefered it over Atomikos' solution mainly because, being a newbie in the field, I wanted something that provided source code so it would help me with learning and debugging. A fine choice, as it turned out.
I had some teething problems, having to do with initializing the JOTM instance, enabling recovery, searching through versions for resolved bugs, but I came out victorious. After figuring all this out, I tore out the TxManager class from the TxModule in the Neo kernel and substituted it with a Jotm instance. No problems there, it worked first try. Enlisting the resources for recovery was a bit more trouble but I managed it, in a way that will be explained below. After that, I had to make sure that everything worked properly for recovery in all disaster scenarios. After getting bored of using the debugger to breakpoint/kill the JVM on commit()/rollback()/prepare(), after a suggestion by Tobias Ivarsson on IRC, I detoured a bit and hacked up a solution with JDI. After some (little) automation I had tested many scenarios with satisfactory precision and, as it turned out, all the hard work was already done by the Neo team - I did not have to touch not even one line of code to make things work out.
Now, of course, the kernel had a dependency on the JOTM package, which is unacceptable. I looked at the way Lucene is enlisted as a service automatically and using this framework I added a hook in the kernel to discover and register services that provide TransactionManager instances. Now one can create a .jar file, add it and its dependencies in the classpath and the kernel will pick it up and make it available for use. Neat! To verify the usability and stability of this, I branched the recovery-robustness suite that the dev team uses to check that crashes are not catastrophic when JOTM is used. For the time being, all works well.

Getting technical: The hooks

If you want to add support for a custom tx manager implementation, the first thing to look are the interface TransactionManagerImpl and the class TransactionManagerService. The first is there out of necessity - when the TxModule is started, the recovery process must be initiated. Each tx manager does it in its own way, so the init() method is there to abstract this concept. The XaDataSourceManager is passed to provide all XAResources that are registered with the kernel and which may have pending transactions. The stop() method is of course used to notify the tx manager of shutdown so that resources can be released and cleanup performed. The native implementation of Neo (TxManager and ReadOnlyTxManager) have been altered to implement this interface. In addition, it extends javax.transaction.TransactionManager so that it can be used "vanilla" after initialization.
The TransactionManagerService is a convenience class. It extends Service so that it can be discovered and loaded and implements the TransactionManagerImpl interface so that it can be used without a cast in the TxModule. The only thing defined is a constructor with a String argument, which is the name under which the Service will be registered - your implementation will be known by this name. To add a TransactionManager implementation as a service, you must extend this class, as it will be demonstrated by the JOTMServiceImpl.
The Config object has seen the addition of the constant TXMANAGER_IMPLEMENTATION, with a value of "tx_manager_impl". If you want your custom implementation of tx-manager-as-a-service to be used, you must start the EmbeddedGraphDatabase with a custom configuration that will contain at least this key with a value of the name of your service.

The sample implementation

Here you can see the project that holds the JOTM implementation of the above. It consists of only one class that extends TransactionManagerService, has a name of "jotm", on init() constructs a local Jotm instance and registers with it all XAResources reported by XaDataSourceManager, requests recovery and is done. All TransactionManager interface methods are passed to this local instance. stop() simply stops the Jotm instance. The annotation @Service.Implementation is from the org.neo4j.helpers package and is there to ensure that this class is registered. For the magic however to happen a META-INF/services/org.neo4j.kernel.impl.transaction.TransactionManagerService resource must exist, that will contain a single line with the fully qualified name of the service class. Bundle all that up in a jar, put it in your classpath, change your configuration parameters and bam! nothing will work. It is obvious that you must provide the bundles that consist the JOTM implementation and also, configure it right. So let's do that.

Making it play

The version I have used in all my tests is 2.1.9, I suggest you do the same. The jars needed are:

jotm-core-2.1.9.jar
which is the JOTM core impl and requires

avalon-framework-4.1.5.jar
commons-collections-3.2.jar
commons-logging-1.1.jar
commons-logging-api-1.1.jar
servlet-api-2.3.jar
log4j-1.2.16.jar
logkit-1.0.1.jar
jacorb-2.2.3-jonas-patch-20071018.jar
jacorb-idl-2.2.3-jonas-patch-20071018.jar
howl-1.0.1-1.jar
carol-3.0.6.jar
carol-interceptors-1.0.1.jar
irmi-1.1.2.jar
ow2-connector-1.5-spec-1.0-M1.jar
ow2-jta-1.1-spec-1.0-M1.jar


If you are a maven user, the dependencies to add are

<dependency>
    <groupId>org.ow2.jotm</groupId>
    <artifactId>jotm-core</artifactId>
    <version>2.1.9</version>
</dependency>
<dependency>
    <groupId>org.ow2.spec.ee</groupId>
    <artifactId>ow2-connector-1.5-spec</artifactId>
    <version>1.0-M1</version>
</dependency>


Also, you will need to provide a configuration for JOTM to use, necessary for recovery which is turned off by default. The location is passed via a JVM argument as

-Djotm.home=/path/to/directory


which must contain a jotm.properties file in which at least the property jotm.recovery.Enabled must be set to true. Here is a sample. Another property in there is howl.log.FileDirectory which is a path where the transaction WAL will be kept.
I confess to not have studied CAROL, so if you have any problems I suggest you use the whole configuration directory as provided here which works.

After that, adding the jotm-connector-service bundle as described above, jta kernel with correct parameters and all, you should get a working environment. The howl logs must appear and you must not have tm_tx_log.* files in your store directory, since the native TxManager is not used. If you want to use it instead, an empty configuration or a "tx_manager_impl" parameter with a value of "native" will do the trick. There you go, you used an alternative TransactionManager implementation, as promised. Go ahead, implement a long operation that involves indexing and kill it on commit(). When restarted, the store will be brought up to a consistent state. You can also alter the sample service project to bind the Jotm instance to a JNDI location and enlist resources from a different store altogether.

For maven users

Apart from the dependencies for jotm above, you can download and install the jta kernel and the JOTMService project. Add their coordinates to your pom.xml as in

<dependency>
   <groupId>org.neo4j.jta</groupId>
   <artifactId>jotm-service-provider</artifactId>
   <version>0.0.1</version>
</dependency>
<dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-kernel</artifactId>
   <version>1.2-jta</version>
</dependency>


substituting the main kernel for the jta one. Things should work.

From here

This whole thing lacks some safety measures, mainly the fact that the database is allowed to come up with a different tx manager than that which was used when it crashed. I think it should be added and the best location I can think is the neostore top level store file. I will have to get back to you on this.
I now know enough to go to the second phase of my work: make Neo work in an application server via JCA, providing container managed transactions in the same way JDBC resources do it. This way you can use your favorite graph database from an EJB side by side with a relational store and rest assured that the 2PC will work while you are safe from the details. How cool would that be? Until then, happy hacking.

3 comments:

  1. Very very cool. We hope to roll this into the main branch after Neoj4 1.2 is out, meanwhile keep up the great work Chris, this is highly interesting stuff!

    /peter

    ReplyDelete
  2. Have you tried Atomikos txn manager? - http://www.atomikos.com/Main/TransactionsEssentials

    It would be interesting to know the diff between JOTM and Atomikos.

    Or even JBoss Txn.

    ReplyDelete
  3. @Ashwin
    It would be a nice thing to see implemented. This was not my intention, however. I wanted to achieve two different things:
    1. See how it could be done and what it entails in terms of changes and provide support for pluggable solutions.
    2. Write a sample using it so that others can use it.

    I think that if someone wants to use external tms then she should be competent enough to use this guide to write her own extension - it shouldn't be that hard anyway.

    I have changed focus now somewhat towards getting neo to work in a container with externally managed txs - something that will be more widely used. After that - who knows, maybe it will be prudent to see, as you said, what differences exist between different tms.

    cheers.

    ReplyDelete