Partitions, Backing Maps, and Services…Oh My!

Recently I was asked the following question:
What is the relationship between a partition, cache, and a backing map?
To answer this, first we should go through some definitions:
Cache: An object that maps keys to values. In Coherence, caches are referred to by name (thus the interface NamedCache.) Caches are also typically clustered, i.e. they can be accessed and modified from any member of a Coherence cluster.
Partition: A unit of transfer for a partitioned/distributed cache. (In Coherence terminology, partitioned and distributed are used interchangeably, with preference towards the former.)
Backing Map: Data structure used by a storage node to hold contents of a cache. For partitioned caches, this data is in binary (serialized) form.
I’ll add one more term for completeness:
Service: Set of threads dedicated to handling requests; a cache service handles requests such as put, get, etc. The DistributedCache service is present on all members of the cluster that read/write/store cache data, even if the node is storage disabled.
Now let’s put these concepts together to see how a clustered partitioned cache works.
First we’ll start with a backing map. It is a straight forward data structure, usually an instance of LocalCache:
The contents of this map are managed by a partitioned cache service. In a typical thread dump, this would be the “DistributedCache” thread:
"DistributedCache" daemon prio=5 tid=101a91800 nid=0x118445000 in Object.wait() [118444000]
A single cache service can manage multiple backing maps. This is the default behavior if multiple services are not specified in the cache configuration file. (Multiple partitioned cache services are beyond the scope of this post; this topic will be addressed in a future posting.)
So how do partitions come into play? Each backing map is logically split into partitions:
This splitting is used for transferring data between storage enabled members. Here are log file snippets of data transfer between two nodes:
Member 1:
(thread=DistributedCache, member=1): 3> Transferring 128 out of 257 primary partitions to member 2 requesting 128
Member 2:
(thread=DistributedCache, member=2): Asking member 1 for 128 primary partitions
As members are added to the cluster, the partitions are split up amongst the members. Note that this includes backup copies of each partition.
As of Coherence 3.5, it is possible to configure backing maps to physically split partitions into their own respective backing maps. This configuration may be desirable for backing maps that are stored off heap (i.e. NIO direct memory or disk) as it will make data transfer more efficient.





Superb patrick, diagrams like this are incredibly helpful. Ideally Coherence would have a ‘concepts’ manual like Oracle DB. Diagrams like yours would be there to give readers a mental model of what is happening.
phil wheeler
13 Jul 10 at 12:43 pm
You mentioned having multiple partitioned cache services, and said it would be covered in a future post. I am very interested in this topic, any help (via a post) would be appreciated!
Thanks.
Andy
20 Jul 10 at 8:50 am
@Phil – Thanks for the feedback! I think describing some of these concepts with diagrams goes a long way. It’s the next best thing to a live whiteboard session.
@Andy – Thanks for your feedback as well; I will try to write something up soon on configuring multiple services (and why you would/wouldn’t do it.)
Patrick Peralta
20 Jul 10 at 1:54 pm
Thank you for your posts. I am very interested in topic about multiple partitioned cache services too. Please, find time to post it
mikhail
25 Nov 10 at 5:33 am
Hi Patrick, Can i get all the backing maps associated to cache service? If yes how ?
Anil
10 Dec 10 at 6:44 pm
Hi Anil,
I don’t recall a public API to get all backing maps for a service. The easiest way to get hold of one for a cache is from an entry processor. If you cast InvocableMap.Entry to BinaryEntry, you can invoke getBackingMap().
Patrick Peralta
13 Dec 10 at 2:05 pm
Hi Patrick,
Yes I can get access to backing map but it will be only to backing map on a node where the entry processor is running. If my cache is running on multiple nodes how can i get access of all the backing maps?
I am implementing a concurrency for cache operation. I dont want to use cache.lock as it seems to be expensive performancewise. I will explain the problem. My cache is running on Node1 and Node2. The entryprocessor for cache key key1 runs on node1. This entryprocessor needs to access/update key2 of same cache. I cant access the key2 using cache.get inside entryprocessor as it will throw pollingexception.
So The only way to handle this problem is accessing backing map. But there is no guarentee that key2 will be on node1 , it can be on node2. So how can i access backing map of node2 inside entryprocessor running on node1? Or can you suggest any other way to handle this situdation?
Anil
14 Dec 10 at 7:28 pm
Hi,
Loved the article, especially the diagrams, a picture 1000 words…
I’m trying to track down a copy of the PPT please for your talk “Introduction to Coherence for Developers and Architects”.
I’m working on a large Canadian OFM project and believe that Coherence would be a great fit, so I’m looking more to guide.
can you help please ?
justi myatt
26 Aug 11 at 4:33 pm