Florida JUGs Next Week
I will be in central Florida next week presenting at the following user groups:
An Introduction to Data Grids for Database developers (GatorJUG June 23rd)
This talk will introduce the concept of data grids to developers that have experience with Java EE and relational databases such as Oracle. The programming model will be explored (including caching patterns and similarities to NoSQL) as well as the performance & scalability improvements a data grid offers.
Taking a distributed system from development into a working production environment is a challenge that many developers take for granted. This talk will explore these challenges, especially scenarios that are not typically seen in a development setting.
I’m especially excited about the OJUG talk as I think it will cover many topics of interest to Developers and OPS guys. It is a set of general guidelines that came about from seeing dozens of Coherence applications in production. We will cover such things as:
- What to look for when using vmstat
- Must-have production level JVM settings/flags
- Developer Do’s and Don’ts
- Crash course on thread dumps and heap dumps
We will also be giving away a copy of Oracle Coherence 3.5 at each event! If you are coming please follow the links to the events above and RSVP (you need to be a member of CodeTown to sign up, but registration is free and painless.)
Coherence Key HOWTO
On occasion I am asked about best practices for creating classes to be used as keys in Coherence. This usually comes about due to unexpected behavior that can be explained by incorrect key implementations.
First and foremost, equals and hashCode need to be implemented correctly for any type used as a key. I won’t describe how to do this – instead I’ll defer to Josh Bloch who has written the definitive guide on this topic.
There is an additional requirement that needs to be addressed. All serializable (non transient) fields in the key class must be used in the equals implementation. To understand this requirement, let’s explore how Coherence works behind the scenes.
First, let’s try the following experiment:
    public class Key implements Serializable {
        public Key(int id, String zip) {
            m_id = id;
            m_zip = zip;
        }

        //...

        @Override
        public boolean equals(Object o) {
            // print stack trace
            new Throwable("equals debug").printStackTrace();
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            Key key = (Key) o;
            if (m_id != key.m_id) {
                return false;
            }
            if (m_zip != null ? !m_zip.equals(key.m_zip) : key.m_zip != null) {
                return false;
            }
            return true;
        }

        @Override
        public int hashCode() {
            // print stack trace
            new Throwable("hashCode debug").printStackTrace();
            int result = m_id;
            result = 31 * result + (m_zip != null ? m_zip.hashCode() : 0);
            return result;
        }

        private int m_id;
        private String m_zip;
    }
This key prints out stack traces in equals and hashCode. Now use this key with a HashMap:
    public static void testKey(Map m) {
        Key key = new Key(1, "12345");
        m.put(key, "value");
        m.get(key);
    }

    //...

    testKey(new HashMap());
Output is as follows:
    java.lang.Throwable: hashCode debug
        at oracle.coherence.idedc.Key.hashCode(Key.java:60)
        at java.util.HashMap.put(HashMap.java:372)
        at oracle.coherence.idedc.KeyTest.testKey(KeyTest.java:46)
        at oracle.coherence.idedc.KeyTest.testKey(KeyTest.java:52)
        at oracle.coherence.idedc.KeyTest.main(KeyTest.java:18)
    java.lang.Throwable: hashCode debug
        at oracle.coherence.idedc.Key.hashCode(Key.java:60)
        at java.util.HashMap.get(HashMap.java:300)
        at oracle.coherence.idedc.KeyTest.testKey(KeyTest.java:47)
        at oracle.coherence.idedc.KeyTest.testKey(KeyTest.java:52)
        at oracle.coherence.idedc.KeyTest.main(KeyTest.java:18)
Try it again with a partitioned cache this time:
    testKey(CacheFactory.getCache("dist-test"));
Note the absence of stack traces this time. Does this mean Coherence is not using the key’s equals and hashCode? The short answer (for now) is yes. Here is the flow of events that occur when executing a put with a partitioned cache:
- Invoke NamedCache.put
- Key and value are serialized
- Hash is executed on serialized key to determine which partition the key belongs to
- Key and value are transferred to the storage node (likely over the network)
- Cache entry is placed into backing map in binary form
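The serialize-then-hash steps above can be sketched roughly as follows. This is only an illustration, not Coherence's actual code: the class name PartitionSketch, the partition count of 257, the use of plain Java serialization, and Arrays.hashCode as the hash function are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class PartitionSketch {
    // Hypothetical partition count, chosen only for illustration.
    static final int PARTITION_COUNT = 257;

    // Serialize an object to bytes with standard Java serialization.
    static byte[] toBinary(Serializable o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Pick a partition from the serialized form only; the key object's
    // own hashCode() is never consulted here.
    static int partitionFor(byte[] binaryKey) {
        return Math.floorMod(Arrays.hashCode(binaryKey), PARTITION_COUNT);
    }

    public static void main(String[] args) {
        byte[] a = toBinary("key-1");
        byte[] b = toBinary("key-1");
        // Identical binaries always land in the same partition.
        System.out.println(Arrays.equals(a, b));                 // true
        System.out.println(partitionFor(a) == partitionFor(b)); // true
    }
}
```

The point of the sketch is that everything after serialization depends only on the bytes, which is why no stack traces from the key class appear.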
Note that objects are not deserialized before placement into the backing map – objects are stored in their serialized binary format. As a result, two keys that are equal to each other in object form must also be equal to each other in binary form so that the keys can later be used to retrieve entries from the backing map. The most common way to violate this principle is to exclude non transient fields from equals. For example:
    public class BrokenKey implements Serializable {
        public BrokenKey(int id, String zip) {
            m_id = id;
            m_zip = zip;
        }

        //...

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            BrokenKey brokenKey = (BrokenKey) o;
            if (m_id != brokenKey.m_id) {
                return false;
            }
            return true;
        }

        @Override
        public int hashCode() {
            int result = m_id;
            result = 31 * result;
            return result;
        }
    }
Note this key has two fields (id and zip) but it only uses id in the equals/hashCode implementation. I have the following method to test this key:
    public static void testBrokenKey(Map m) {
        BrokenKey keyPut = new BrokenKey(1, "11111");
        BrokenKey keyGet = new BrokenKey(1, "22222");
        m.get(keyPut);
        m.put(keyPut, "value");
        System.out.println(m.get(keyPut));
        System.out.println(m.get(keyGet));
    }
Output using HashMap:
    value
    value
Output using partitioned cache:
    value
    null
This makes sense, since keyPut and keyGet will serialize to different binaries. However, things get really interesting when combining partitioned cache with a near cache. Running the example using a near cache gives the following results:
    value
    value
What happened? In this case, the first get resulted in a near cache miss, resulting in a read through to the backing partitioned cache. The second get resulted in a near cache hit because the object’s equals/hashCode was used (since near caches store data in object form.)
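The difference between object-form and binary-form lookups can be reproduced without a cluster. The sketch below is a stand-in, not Coherence itself: IdOnlyKey is a hypothetical class that mirrors BrokenKey, a HashMap keyed by the object plays the role of the near cache, and a HashMap keyed by a ByteBuffer of the Java-serialized key plays the role of the binary backing map.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class BinaryKeyDemo {
    // Hypothetical key mirroring BrokenKey: zip is serialized
    // but ignored by equals and hashCode.
    static final class IdOnlyKey implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int id;
        private final String zip;

        IdOnlyKey(int id, String zip) {
            this.id = id;
            this.zip = zip;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof IdOnlyKey && ((IdOnlyKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return id;
        }
    }

    // Serialize the key and wrap the bytes in a ByteBuffer, whose
    // equals and hashCode compare the buffer content.
    static ByteBuffer toBinary(Serializable o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(bytes.toByteArray());
    }

    public static void main(String[] args) {
        IdOnlyKey keyPut = new IdOnlyKey(1, "11111");
        IdOnlyKey keyGet = new IdOnlyKey(1, "22222");

        // Object-form map: lookups go through equals/hashCode.
        Map<IdOnlyKey, String> objectMap = new HashMap<>();
        objectMap.put(keyPut, "value");

        // Binary-form map: lookups compare serialized bytes.
        Map<ByteBuffer, String> binaryMap = new HashMap<>();
        binaryMap.put(toBinary(keyPut), "value");

        System.out.println(objectMap.get(keyGet));           // value
        System.out.println(binaryMap.get(toBinary(keyGet))); // null
    }
}
```

The two maps disagree for the same pair of keys, which is the same inconsistency seen between the near cache and the partitioned cache.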
In addition to equals/hashCode, keep the following in mind:
- Keys should be immutable. Modifying a key while it is in a map generally isn’t a good idea, and it certainly won’t work in a distributed/partitioned cache.
- Keys should be as small as possible. Many operations performed by Coherence assume that keys are very lightweight (such as the key based listeners that are used for near cache invalidation.)
- Built in types (String, Integer, Long, etc) fit all of these criteria. If possible, consider using one of these existing classes.
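When a custom key class is unavoidable, one way to follow the guidelines above is sketched here. ZipKey is a hypothetical name, the non-null zip requirement is an assumption made to keep the example short, and plain Java serialization is used only to check that equal keys produce equal binaries.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical immutable key: final class, final fields, and every
// serialized field takes part in equals and hashCode.
public final class ZipKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int id;
    private final String zip;

    public ZipKey(int id, String zip) {
        this.id = id;
        // Assumption for this sketch: zip is never null.
        this.zip = Objects.requireNonNull(zip, "zip");
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ZipKey)) {
            return false;
        }
        ZipKey other = (ZipKey) o;
        return id == other.id && zip.equals(other.zip);
    }

    @Override
    public int hashCode() {
        return 31 * id + zip.hashCode();
    }

    // Serialize with standard Java serialization.
    static byte[] toBinary(Serializable o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        ZipKey a = new ZipKey(1, "12345");
        ZipKey b = new ZipKey(1, "12345");
        System.out.println(a.equals(b));                             // true
        System.out.println(Arrays.equals(toBinary(a), toBinary(b))); // true
    }
}
```

Because there are no setters and no fields left out of equals, two keys that compare equal also serialize to the same bytes.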
What I’ve learned during a winter of running
Motivated (or should I say disturbed) by my lack of fitness, I decided to take up running last fall. I had done weights in the past but never running, for two reasons. First, I used to live in Florida, where it is uncomfortable to walk outside during most of the year, let alone run. Second, I was bored to tears on the treadmill, so that didn’t go anywhere.
In Boston however I found the environment much more conducive to running, given that there are sidewalks and bike paths, not to mention it feels like half of the people here run. Also, the weather in the fall is very comfortable to run in (which probably explains why the NYC marathon is in November.)
Winter running, on the other hand, came with its own set of challenges. During those months I learned some lessons that I’m passing along.
At first I tried running with a thick cotton sweater. This turned out to be a bad idea, because the sweater (being a warm sweater as advertised) prevented heat from escaping my body. This, paradoxically, causes you to freeze your ass off since you now have a pool of sweat on your skin. The best practice is to wear material that wicks sweat away, including the top layer. Even on the coldest of days (my personal low was 15F) I would get comfortable after about a mile.
The biggest challenge for me was not the air temperature (at least after warming up); it was ice. On icy days my runs were much slower. On really icy days I would just skip the run. I know some hardcore runners who would throw on spikes or Yaktrax, but I didn’t take it to that level. A related challenge is when it isn’t quite cold enough to freeze deeper pools of water. I had the distinctly unpleasant experience of stepping into a 1-inch puddle on the bike path after wrongly assuming it was frozen over. That day my run was definitely shorter than it would have been otherwise.
For the given temperatures, I really was wearing very thin layers of clothing. This means that I had to keep moving in order to stay warm. I never slowed down to a walk because that would have been far more unpleasant than dealing with the fatigue. Running on the road in general has that advantage over the treadmill – with the treadmill you can quit anytime, whereas on the road you have to make it back home sometime, so you may as well run the distance.
Due to things being really busy lately my mileage has been decreasing. I’m hoping to turn this around soon, especially since the warm spring weather is bringing out so many runners.
So how has the running affected my fitness? It used to be that my heart and lungs were ready to break out of my chest after chasing down a bus. After taking up running, I was not only able to perform this exercise without losing my breath, but I was able to do it while carrying my 40 lb son!
Springtime for Coherence
As I type this I am 35,000 feet over Alaska (no I can’t see Putin’s backyard from here) en route to QCon Tokyo. Brian was scheduled to talk but was unable to make it so I’m stepping in to present a talk on spanning multiple data grids across disparate data centers. (You can see Brian describing this talk at QCon SF here.) In addition to this presentation, I’m looking forward to sampling the food and culture in Japan. As a big city guy I’ve always wanted to visit Tokyo so I’m quite fortunate to have the chance.
As they say, when it rains it pours. I just came back from the New York SIG where I talked about real world challenges in customer deployments and passed along lessons that we learned in solving these problems. After leaving Tokyo, I will be joining Cameron and Noah in Toronto for the inaugural SIG where I will deliver the same presentation I gave in NY. Cameron will talk about the past and future of Coherence, and Noah will give an update on the latest innovations in the Coherence Incubator project.
Has everyone received their copy of the Coherence book yet? One of my favorite parts of the book is the very beginning where Aleks describes what it takes to build scalable applications. In fact a good portion of the first chapter doesn’t even mention Coherence; it just talks about aspects of scalability that developers and architects of high scale systems should be familiar with. A copy of this book was raffled at the NY SIG, and more copies will be given away at the Bay SIG and Toronto SIG.
What’s happening in the world of Coherence?
It has been a while since I’ve posted, so I figured it would be a good time to give an update on what is happening in Coherence land.
New Coherence Bloggers
We have two new bloggers sharing their experiences with Coherence!
The first is Oracle JDBC expert Pas Apicella, who recently took on Coherence. Upon his introduction to Coherence he immediately proceeded to create a CacheStore example using PL/SQL, followed by an example of using the Oracle JDBC Data Change Notification mechanism to push updates from the database to a cache.
Additionally we have Coherence architect Andy Nguyen debuting with a detailed description of a sophisticated distributed bulk loading technique he’s employed on several customer projects.
Coherence Book in March
After many months of blood, sweat, and blisters from too much typing, Aleksandar Seovic has completed the highly anticipated Coherence book published by Packt. Having worked closely with Aleks on reviews and contributions, I believe this book will be a terrific resource for developers and architects who need to write scalable applications. Both experienced users of Coherence and new users will find relevant and useful content. Cameron Purdy recently interviewed Aleks about the book; the interview can be downloaded as an MP3.
User Group Meetings
The UK SIG in London, coming up on February 26th, is the last Coherence user group meeting for the winter. The spring events are currently being planned; stay tuned for details!
Also coming up on February 24th is the first Boston SUG meeting of the year. Although the topic won’t be Coherence this time, it will be of interest for developers and architects interested in scalable systems. We’ll be meeting up for drinks and snacks at Bertucci’s afterwards. And I’ll be there if anyone wants to chat about Coherence or any other topic!