Thursday, October 7, 2010

Neo4j Internals: Caching

Although Neo utilizes the NIO package with memory mapping (meaning that at least part of the db file store is in memory), it also features a caching mechanism that is mainly useful for keeping primitives (nodes and relationships) in memory as useful internal objects (recall NodeImpl and RelationshipImpl). Here I will take a shot at explaining how that stuff works, diverging a little from the Hello, World line-by-line walkthrough.

But first a little background

One of the most common cache management algorithms is the venerable Least Recently Used (LRU to its friends). It combines simplicity of implementation, fast operations and pretty good caching characteristics, although there are some (me, for instance) who swear by more advanced solutions (ARC anyone?). Another advantage of LRU is that it comes almost ready-made in the java.util package, as LinkedHashMap. All it misses is a proper override of removeEldestEntry(), which with a single line of code gives the class LRU characteristics. But no more details; if you cannot follow, go do your homework and then come back.
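To make the LinkedHashMap trick concrete, here is a minimal sketch. The class and field names are mine, not Neo4j's; the point is the access-order constructor flag plus the one-line removeEldestEntry() override:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache built on LinkedHashMap: the third constructor
// argument switches iteration to access order, and overriding
// removeEldestEntry() evicts the least recently used entry once the
// cache grows past maxSize.
public class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public SimpleLruCache(int maxSize) {
        super(16, 0.75f, true); // true = access order, the key to LRU behaviour
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // the promised single line of code
    }
}
```

With maxSize set to 2, inserting a third entry silently drops whichever of the first two was touched least recently.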
Another way to approach LRU-like policies in Java is with Weak/Soft references. These two posts provide a good introduction to the subject of non-strong references and discuss Maps with Soft and Weak references. In simple words, both WeakReference and SoftReference objects (actually, the values they refer to) are treated specially by the garbage collector, which reclaims their memory more aggressively. This behaviour is used in "poor man's caches" that expand quickly to fill the heap but are reclaimed the moment the memory is required for more important work. As a consequence, we get a way to cache objects greedily, without hard, non-adaptable upper limits, while keeping memory available for other tasks. This is the path chosen in the default implementation of the cache in neo.
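A bare-bones version of such a "poor man's cache" might look like the sketch below. This is my illustration of the SoftReference idea, not Neo4j code: get() has to tolerate both a missing key and a reference the GC has already cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// A minimal soft-reference cache: values may vanish under memory
// pressure, so a cleared reference is treated the same as a miss.
public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;         // never cached
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key);     // cleared by the GC; drop the stale entry
        }
        return value;
    }
}
```

Note that the map entries themselves (key plus the empty SoftReference wrapper) linger until the next lookup; cleaning those up eagerly is what ReferenceQueues are for, as we will see below.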

Before the main event

First note that there are two caches, one for nodes and one for relationships, which, apart from storing different classes, can have different maximum sizes (via the "max_node_cache_size" and "max_relationship_cache_size" configuration parameters), albeit not different implementations. There are four kinds of cache implementation, all in package org.neo4j.kernel.impl.cache. There is the degenerate NoCache, which stores nothing. LruCache uses a good old LinkedHashMap with an overridden removeEldestEntry(), which makes it an LRU cache and provides adaptability (more on that later). Finally, SoftLruCache and WeakLruCache provide the exact same implementation with Soft and Weak interchanged, utilizing SoftValues/WeakValues to hold references and ReferenceQueues to collect garbage collected references. They both extend org.neo4j.kernel.impl.cache.ReferenceCache, which defines a pollClearedValues() method for removing "dead" entries from the map. By the way, that is the extent of the adaptability code for ReferenceCaches, meaning that their adaptation to current memory utilization is whatever Weak/SoftReference plus the gc can offer (which is a lot). All management of the contents of the caches is performed exclusively by NodeManager, while the actual memory consumption is determined by the AdaptiveCacheManager and is influenced by the user-supplied configuration.
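The ReferenceCache pattern described above can be sketched as follows. This is a simplified illustration of the idea, not the actual Neo4j classes: each value is wrapped in a SoftReference subclass that remembers its key, cleared references land on a ReferenceQueue, and pollClearedValues() drains that queue to purge dead map entries.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the ReferenceCache idea: soft values that know their key,
// plus a queue the GC fills with cleared references.
public class ReferenceCacheSketch<K, V> {
    private static class SoftValue<K, V> extends SoftReference<V> {
        final K key; // kept so the map entry can be removed after clearing

        SoftValue(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, SoftValue<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    public void put(K key, V value) {
        map.put(key, new SoftValue<>(key, value, queue));
    }

    public V get(K key) {
        SoftValue<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    // Remove entries whose values the GC has already reclaimed.
    @SuppressWarnings("unchecked")
    public void pollClearedValues() {
        SoftValue<K, V> cleared;
        while ((cleared = (SoftValue<K, V>) queue.poll()) != null) {
            map.remove(cleared.key);
        }
    }
}
```

Swapping SoftReference for WeakReference in the wrapper gives the Weak variant, which is exactly the "same implementation with Soft and Weak interchanged" observation above.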

The AdaptiveCacheManager

The main work of AdaptiveCacheManager is to maintain the set of caches, parse and hold the configuration, and adapt the caches' sizes. This is achieved by spawning a worker thread with a very simple life. Every "adaptive_cache_worker_sleep_time" milliseconds (3000 by default) it wakes up and re-adjusts the sizes of the caches. For the ReferenceCaches it simply calls pollClearedValues(), while for the LruCaches things are a little more complicated. Since the collection of over-the-quota entries must be done by hand, the adaptCache() method is run on every LruCache: a target size is recalculated from the memory statistics reported by the JVM and passed to the cache's resize() method, which drops elements in LRU fashion until the cache reaches the target size. The exact heuristic and the configuration parameters that influence it are documented precisely here.
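The resize() step can be sketched like this. The class and method names here are mine and the heuristic that picks the target size (driven by JVM memory statistics in Neo4j) is deliberately left out; the sketch only shows how an access-ordered LinkedHashMap makes trimming to a target size trivial, since iteration starts at the least recently used entry:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of LRU-fashion trimming: iterate from the eldest (least
// recently used) entry and remove until the cache fits the target.
public class ResizableLruCache<K, V> extends LinkedHashMap<K, V> {
    public ResizableLruCache() {
        super(16, 0.75f, true); // access order: eldest == least recently used
    }

    public void resize(int targetSize) {
        Iterator<Map.Entry<K, V>> it = entrySet().iterator();
        while (size() > targetSize && it.hasNext()) {
            it.next();
            it.remove(); // drop the least recently used entry
        }
    }
}
```

In the real kernel this trimming runs on the worker thread every wake-up cycle; here it is just an explicit method call.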


CacheManager Startup

One last thing remains: where in the startup chain the cache manager is created and started. The Config object creates and stores the AdaptiveCacheManager during its construction and passes it to the constructor of GraphDbModule. There it is passed on to the constructor of NodeManager, whose start() method registers the caches with it and then calls the cache manager's own start() method. There the adaptive cache worker thread is started and all the magic happens.


There you have it

This is the whole of the caching mechanism used in neo. It is mainly used to cache node and relationship implementations as they are retrieved from the persistent store, and it is completely hidden behind the NodeManager. It is wholly contained in its own package and is indeed a simple piece of code to understand.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
