Terrastore, the open-source Java document database has just released 0.8.0. This release is one of the biggest to date in terms of new features, enhancements and bug fixes. Sergio Bossa, the lead developer and creator of the project was kind enough to provide us with notes about the release.
Map/Reduce processing is the first big new feature, bridging Terrastore data and distribution model with the well known parallel processing paradigm.
Terrastore map/reduce implementation targets all documents, or just a subset of documents specified by range, belonging to a single bucket, and is based on three phases: mapper, combiner and reducer. The mapper phase is initiated by the node which received the map/reduce request, the originator node: it locates the target documents and the nodes that hold them, then sends the map function to those node so that it can be applied in parallel on each node; the map function will take each target document as input argument, and return, for each document, a map of <key,value> pairs as output. Then, each remote node runs the combiner phase, aggregating its local map results and returning a partial map of <key,value> pairs. Finally, the originator node runs the reducer phase, aggregating all partial results.
So, Terrastore map/reduce provides two parallel phases, directly sent and executed on the nodes holding the target documents, and one serial phase executed on the originating node. It is very well suited for aggregating and extracting information from distributed data, leveraging the parallelized and collocated characteristics of the mapper and combiner phases.
Active Event Listeners
Terrastore events management infrastructure has been greatly enhanced in several ways, for example by providing access to more event-related data such as both the new and updated version of the changed document, but active event listeners really deserve the spotlights.
Terrastore event listeners can now work as active components in regard to database contents: that is, users can now implement event listeners and rise asynchronous actions that will change the database contents by putting or removing documents. This is an important feature because users will now be able to process events and also store the result as a Terrastore document. Moreover, documents stored (or removed) by event listeners will rise other events and possibly invoke other listeners, creating a processing chain with unlimited possibilities.
Active listeners could be used to keep denormalized data up-to-date, gather and store statistics data over user documents, or implement complex use cases such as web pages processing and indexing as described in the Google Percolator paper  (which inspired indeed Terrastore active listeners).
Adaptive Ensemble Scheduling
Terrastore ensemble is a deployment mode which links several Terrastore clusters together, making them work as a whole to improve overall scalability.
The adaptive ensemble scheduler provides a dynamic algorithm to pull node membership from external clusters, keeping the ensemble topology as correct and up-to-date as possible. The algorithm is based on a dynamic update frequency adjusted over time as a function of recent view changes: more specifically, the algorithm will take into account the frequency of changes and the number of changes as a function of the cluster size (because, a node departure happening in a cluster of two nodes is much more important than one happening in a cluster of tens nodes).
As a result, ensemble efficiency and stability will be greatly improved.
Document and Communication Compression
Terrastore heavily relies on heap memory: most notably, to store documents on server nodes and buffer received/sent data from/to other nodes.
So, Terrastore 0.8.0 provides optional document and communication compression. Thanks to document compression, uncompressed docs sent by clients will be stored in compressed form and decompressed back only when requested, taking particular care of avoiding unnecessary memory copies and decompressing only when actually needed.
Thanks to communication compression, server-to-server communication will happen in compressed form, cutting memory consumption and network bandwidth utilization, particularly important when receiving and processing large query results.
Both are independently enabled/disabled by a simple switch on the server startup arguments.
Performance, scalability and stability enhancements
Among the several enhancements, there are two major ones I'd like to cite here.
- A fine grained tuning of the Terracotta master, in particular to avoid excessive slow downs and out of memory errors.
- A rewritten lock management implementation, greatly improving performance and scalability when storing several millions documents.