Archive for the ‘Development’ Category
What’s happening in the world of Coherence?
It has been a while since I’ve posted, so I figured it would be a good time to give an update on what is happening in Coherence land.
New Coherence Bloggers
We have two new bloggers sharing their experiences with Coherence!
The first is Oracle JDBC expert Pas Apicella, who recently took on Coherence. Right after his introduction to Coherence he created a CacheStore example using PL/SQL, followed by an example of using the Oracle JDBC Data Change Notification mechanism to push updates from the database to a cache.
Additionally we have Coherence architect Andy Nguyen debuting with a detailed description of a sophisticated distributed bulk loading technique he’s employed on several customer projects.
Coherence Book in March
After many months of blood, sweat, and blisters from too much typing, Aleksandar Seovic has completed the highly anticipated Coherence book published by Packt. Having worked closely with Aleks on reviews and contributions, I believe this book will be a terrific resource for developers and architects who need to write scalable applications. Both experienced and new Coherence users will find relevant and useful content. Cameron Purdy recently interviewed Aleks about the book; the interview can be downloaded as an MP3.
User Group Meetings
The UK SIG in London on February 26th is the last Coherence user group meeting of the winter. The spring events are currently being planned; stay tuned for details!
Also coming up on February 24th is the first Boston SUG meeting of the year. Although the topic won’t be Coherence this time, it will be of interest for developers and architects interested in scalable systems. We’ll be meeting up for drinks and snacks at Bertucci’s afterwards. And I’ll be there if anyone wants to chat about Coherence or any other topic!
Oracle Open World: The most Coherence content under one roof!
This year I’ll be at Oracle Open World for the first time. I hear that this conference dwarfs JavaOne in size, which is hard to imagine given how large JavaOne is. For those of you attending who are interested in Coherence, we have over two dozen sessions to choose from, making this the biggest Coherence event ever.
The Coherence content is available on OTN. (There’s also a PDF version that is nicer for printing out.) It includes sessions on future direction, integration with other products in the Oracle lineup, developer workshops, and customer panels describing their use of Coherence.
I’ll be talking about using Coherence to scale out external data stores (including relational databases.) This is mostly the same content that will be covered during the NY SIG.
Next NY Coherence SIG on October 1st
The next NY Coherence SIG is on October 1st (two weeks from today) and it promises to be a great event. If you follow my blog, you may remember that I introduced two of the speakers a few months ago when they started blogging.
The first talk will be by Aleksandar Seovic who has an extensive Coherence resume. In addition to implementing POF in .NET, he is also the author of an upcoming book on Coherence. In his spare time (insert tongue in cheek) he runs S4HC, a consulting company specializing in Spring, Coherence, and other technologies.
If you’re in Tampa, you can also check out his upcoming talk at Tampa JUG on September 29th.
We will also have Mark Falco, one of our rock star engineers who concentrates on our network protocol (TCMP), C++, and other areas. Mark is usually the first (and last) person I reach out to when I have questions about Coherence networking – so if you have any questions of your own be sure to bring them. He will talk to us about TCMP and how to optimize your machines and network for optimum performance.
Finally, yours truly will talk about configuring Coherence to work with an external data source (usually a relational database). I’ll describe in detail how each of the external connectivity features works (including many features you’ve probably never heard of), along with best practices and good old-fashioned war stories. (Shout out to Rob for helping with the OmniGraffle diagrams!) This is the same talk that I will present in mid-October at Oracle Open World in San Francisco. I’ll provide more detail on this later; for now you can check out the Application Grid lineup, which includes WebLogic Server, Coherence, JRockit, Tuxedo and Enterprise Manager.
NFJS Boston Day 3
Coverage of the last day of No Fluff Just Stuff (albeit a few days late):
Spring DM and OSGi
Craig Walls provided an extensive overview (and defense) of OSGi. OSGi is a framework for managing library dependencies. It enables the installation, configuration, and updating of modules in a live program without JVM restarts. Multiple versions of a class and/or libraries can coexist in a container. Each module has a defined lifecycle and dependencies between modules can be defined.
Here is a list of OSGi implementations:
Open Source
Commercial
- Makewave Knopflerfish Pro
- ProSyst mBedded Server
- Samsung OSGi R4 Solution
- HitachiSoft SuperJ Engine Framework
Craig provided a live demonstration of the OPS4J Pax shell running under Eclipse Equinox and Apache Felix. Unfortunately we ran out of time and we didn’t get to see much of the Spring DM server in action.
java.next
Stuart Halloway gave an overview of the latest and greatest languages available for the JVM. Like some of the other NFJS speakers he has very strong opinions, especially when it comes to the use of Java. I believe the quote was something along the lines of “every time you start a greenfield project with Java, God kills a kitten.” (Incidentally, Ted Neward believes that using Java arrays instead of collections will lead to the same fate for said kitten.)
Straight from Stuart’s slides: pros and cons for your consideration:
Clojure Pros
- Functional
- Multimethods
- Concurrency
- Lisp
- A la carte
Clojure Cons
- Youngest java.next language
Groovy Pros
- Easiest to learn
- Easiest bi-di interop
- More committed to reusing Java libs
Groovy Cons
- Worst Java baggage
- No concurrency/multicore story
JRuby Pros
- Biggest community
- Commercial support: EngineYard
- Rails
- multiple platforms
JRuby Cons
- No concurrency/multicore story
Scala Pros
- Functional
- High performance
- Pattern matching
- Actor model
- Hybrid object/functional (could also be a con)
Scala Cons
- Hardest to learn
The general theme on these new generation languages is:
- Dynamic typing (Scala is not as dynamic as the others, but offers more flexibility than Java)
- No checked exceptions
- Reasonable defaults
- Convention over configuration
- YAGNI
NFJS Boston Day 2
Here are my highlights of No Fluff Just Stuff Day 2:
Garbage Collector Friendly Programming
Garbage collection is an interesting topic to me for several reasons, the main reason being that poor GC performance is very harmful in distributed environments. When you have a peer to peer system such as Coherence, any one node can directly communicate with any other node at any point in time to service a request. If a node that needs to service several requests is in a long GC pause, it isn’t just that node that is affected. Every JVM that is waiting for a response from that node also experiences high latency, thus causing a cascading effect. (More on what to do about this later.)
Brian Goetz described the evolution of GC in Java, starting with the single-threaded mark-and-sweep algorithms and leading up to the modern generational collector. What they have in common is the tracking of GC roots (which include static variables and local variables on thread stacks) and the traversal of object references starting at these roots. The implementation of the generational collector is (roughly) as follows:
- New objects are allocated in the young generation space
- When a minor GC occurs, objects that are still reachable are copied to the survivor space. The remaining objects are discarded.
- Eventually, objects that live long enough in the survivor space are moved to the old generation.
- If a minor collection fails to free up enough space in the young generation area, a full collection (which is much more expensive) will occur in the old generation space.
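To see the young and old generation collectors at work on a running JVM, the standard java.lang.management API reports how many collections each collector has performed. The following is a minimal sketch, not something from the talk: the class name GcActivity is made up for illustration, and the collector names printed depend on the JVM and the GC algorithm in use. The JIT may also optimize away some of the throwaway allocations, so the counts are only indicative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
class GcActivity {

    /** Returns one line per collector: name, collection count, accumulated time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time may be -1 if the collector does not report them.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the young generation collector has work to do.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] shortLived = new byte[128];
            checksum += shortLived.length;
        }
        System.out.println("allocated bytes: " + checksum);
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector you would typically expect the young generation collector to show many more collections than the old generation one.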
An implementation detail is that the JVM must track references from the old generation to the young generation in order to know which GC roots to traverse when performing a minor collection. This means that the more “old” objects there are pointing to “new” objects, the more work the collector has to do. In a practical sense, this means that allocating new objects is preferred to reference field updates.
It took a while to wrap my head around that last statement, so let me attempt to demonstrate. Let’s say I have a map that contains objects with many fields. If I wanted to update some of those fields I could do it like this:
    Map<Key, MyObject> map = ...;
    MyObject o = map.get(key);
    o.setField1(f1);
    o.setField3(f3);
    o.setField5(f5);
Or I could do it like this:
    Map<Key, MyObject> map = ...;
    MyObject oOld = map.get(key);
    // Copy the unchanged fields from the old object and replace the entry
    MyObject oNew = new MyObject(f1, oOld.getField2(), f3, oOld.getField4(), f5);
    map.put(key, oNew);
The first example updates three reference fields (assuming the object in the map is in the old generation), whereas the second is updating one (the reference held by Map.Entry) – and as an added bonus the second implementation is thread safe (assuming that MyObject is immutable and the Map implementation is also thread safe.) If anyone has a better (or more correct) example of this concept, I’d be happy to see it!
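For a version of the second approach that compiles and runs, here is a small sketch. The Customer and CustomerCache classes and their fields are hypothetical stand-ins for MyObject and the map above; the only library behavior relied on is that ConcurrentHashMap.computeIfPresent (Java 8 and later) performs the replacement atomically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical immutable value type; all fields are references and never change. */
final class Customer {
    private final String name;
    private final String email;
    private final String city;

    Customer(String name, String email, String city) {
        this.name = name;
        this.email = email;
        this.city = city;
    }

    String getName() { return name; }
    String getEmail() { return email; }
    String getCity() { return city; }

    /** Returns a new instance with one field changed instead of mutating this one. */
    Customer withEmail(String newEmail) {
        return new Customer(name, newEmail, city);
    }
}

/** Hypothetical cache wrapper around a thread-safe map. */
class CustomerCache {
    private final Map<String, Customer> customers = new ConcurrentHashMap<>();

    void put(String id, Customer c) { customers.put(id, c); }

    Customer get(String id) { return customers.get(id); }

    /** Swaps in a modified copy; returns the new value, or null if the id is absent. */
    Customer updateEmail(String id, String newEmail) {
        // Only the map entry's reference changes; the old Customer is left untouched.
        return customers.computeIfPresent(id, (key, old) -> old.withEmail(newEmail));
    }
}
```

Readers holding the old Customer keep seeing a consistent snapshot, which is where the thread safety comes from.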
Many other concepts were covered, including why you should use finally instead of finalizers to clean up, weak references, soft references, and tracking down memory leaks. Capturing heap dumps to track memory usage is a technique I recommend to customers; it works much better than guessing where unexpected memory allocation is coming from. I especially recommend configuring the JVM to generate a heap dump when an OutOfMemoryError is thrown. My favorite tool for reading heap dumps is Eclipse MAT. Heap histograms are also a nice lightweight approach to analyzing memory problems.
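As a reminder of what cleanup in finally looks like, here is a minimal sketch. TrackedResource and FinallyCleanup are hypothetical classes written for this example; a real resource would be a stream, socket, or connection.

```java
import java.io.Closeable;

/** Hypothetical resource that records whether it has been released. */
class TrackedResource implements Closeable {
    private boolean closed;

    String read() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        return "data";
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() { return closed; }
}

class FinallyCleanup {

    /** Uses the resource and releases it in finally, even if processing fails. */
    static String useAndRelease(TrackedResource resource, boolean fail) {
        try {
            String data = resource.read();
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            return data;
        } finally {
            // Runs on both the normal and the exceptional path, at a known point in time.
            resource.close();
        }
    }
}
```

A finalizer, by contrast, runs only if and when the collector gets around to the object. On the heap dump side, the HotSpot flag is -XX:+HeapDumpOnOutOfMemoryError, and jmap -histo followed by the process id prints a heap histogram.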
When customers ask for suggestions on GC tuning, my recommendation is to keep it simple: fixed size 1 GB heaps (on the Sun VM), and don’t use more than 75% of the heap. I usually don’t recommend any specific tuning parameters, as the GC algorithms are constantly improving and any exotic flags that may have worked in older JVMs (assuming they helped in the first place) may not work so well in newer JVMs. The best advice I can give is to not fill up the heap as this will cause more frequent full collections.
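A rough way to keep an eye on the 75% guideline from inside the JVM is to compare used heap against the maximum heap. This is only a sketch: the HeapHeadroom class is made up for illustration, and the number it reports includes garbage that has not been collected yet, so it overstates live data between collections. A fixed-size 1 GB heap is set with -Xms1g -Xmx1g.

```java
// Hypothetical helper class, named for illustration only.
class HeapHeadroom {

    /** Fraction of the maximum heap currently occupied (live objects plus uncollected garbage). */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** True when the given usage is above the threshold, e.g. 0.75 for 75%. */
    static boolean overThreshold(double usedFraction, double threshold) {
        return usedFraction > threshold;
    }

    public static void main(String[] args) {
        double used = usedFraction();
        System.out.printf("heap used: %.1f%%%n", used * 100);
        if (overThreshold(used, 0.75)) {
            System.out.println("warning: heap usage is above 75%");
        }
    }
}
```

For a more reliable reading, sample the value right after a full collection or look at the old generation occupancy in the GC logs.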
Inside the Modern JVM
NFJS tends to cover languages in the JVM other than Java (such as Groovy, Scala, JRuby, etc.) This is a testament to the strength and viability of the modern JVM. Brian covered some of the advancements and (quite frankly) rocket science that goes into the JVM, HotSpot in particular. No matter what happens to Java (which isn’t going away any time soon), the JVM will be around for a very long time.
In a nutshell: why is Java, a supposedly interpreted language, faster than C++ in many cases? The answer is that the JVM determines which optimizations to make at run time instead of compile time, which is the opposite approach of C++ and other native languages. Optimization at runtime is far more effective, since the JVM has hard statistics of real world usage to draw on as opposed to the speculation and guessing that happens when everything has to be compiled to machine code before execution.
The overall theme of this talk (and the previous ones) is to write simple clean code – the runtime recognizes common usage patterns in Java and is built to optimize these patterns.
Java Collections
Ted Neward gave an engaging and entertaining talk on the Java Collections API. To be honest I was familiar with most of the material, but he is a fun speaker to watch, in spite of the fact that he gave me a good ribbing for showing up to his talk after it had already started! He is quite biased against arrays and towards collections, which made me think back to a web/remote services API I designed a few years ago. I exclusively used arrays as the collection type for this API for two reasons:
- To make SOAP/cross language interoperability much simpler (least common denominator – every language does arrays)
- There were no generics at the time; using arrays instead of generics in the interfaces meant that I could explicitly define the type of the array
The second item is not as important anymore now that we have generics, so I’m inclined to agree that arrays should be used sparingly nowadays.
Another interesting point is that iteration over collections should be done using a functor instead of a plain iterator. For example:
    List<Name> names = ...;
    MyListOps.apply(new MyApplyFn<Name>() {
        public void apply(Name n) {
            // use n
        }
    }, names);
This allows the possibility of processing the collection in multiple threads.
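To make that concrete, here is one way MyListOps and MyApplyFn could be written. Both are names from the snippet above, not classes from the JDK or from Ted’s talk, and the applyParallel method is an assumption about how a multi-threaded variant might look. The point is that the caller’s code stays the same while the helper decides how to walk the collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical functor interface: one callback per element. */
interface MyApplyFn<T> {
    void apply(T item);
}

/** Hypothetical helper that owns the iteration strategy. */
class MyListOps {

    /** Sequential version: applies fn to each element in list order. */
    static <T> void apply(MyApplyFn<T> fn, List<T> items) {
        for (T item : items) {
            fn.apply(item);
        }
    }

    /** Parallel version: one task per element on a fixed pool; fn must be thread safe. */
    static <T> void applyParallel(final MyApplyFn<T> fn, List<T> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (final T item : items) {
                tasks.add(() -> {
                    fn.apply(item);
                    return null;
                });
            }
            // invokeAll blocks until every task has finished.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // surfaces any exception thrown by fn
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while applying function", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("function failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a plain iterator the loop lives in the caller, so there is no single place to swap in the parallel version.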
What’s coming in Java 7
This was Ted Neward’s next talk which was just as interesting and opinionated. Here are the highlights:
- The release is targeted for early 2010
- There is no official JSR for Java 7
- Most of the information on what is going into Java 7 can be found on the blog of Joe Darcy of Sun.
- Alex Miller has a huge page on his blog detailing what is in and what is out. This is information that he is aggregating from around the web.
One of the most compelling additions to Java 7 is JSR 292, which introduces the bytecode invokedynamic. The implications of this addition are narrated by Charles Nutter of JRuby. There are other syntactic conveniences making it in; however it will not include closures (a fairly controversial topic.)
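The invokedynamic instruction itself cannot be written directly in Java source, but JSR 292 also comes with a method handle API (java.lang.invoke, as it shipped in Java 7) that gives a feel for the idea: a call target is looked up by name and type at run time and then invoked without reflection’s Method.invoke. The sketch below is illustrative only; the DynamicCall class is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical helper class, named for illustration only.
class DynamicCall {

    /** Looks up a no-argument String method returning int by name at run time and calls it. */
    static int callIntMethod(String methodName, String target) {
        try {
            MethodHandle handle = MethodHandles.lookup()
                    .findVirtual(String.class, methodName, MethodType.methodType(int.class));
            // The receiver is passed as the first argument of the handle.
            return (int) handle.invoke(target);
        } catch (Throwable t) {
            throw new IllegalStateException("lookup or invocation failed: " + methodName, t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callIntMethod("length", "Coherence"));
    }
}
```

Dynamic language runtimes such as JRuby use the same machinery to link call sites whose targets are only known once the program is running.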