~pperalta

Thoughts on software development and other stuff

Migrating to my new MacBook Pro

without comments

I switched to the Mac and OS X soon after joining Tangosol. I’ve been very happy with OS X as it has the best of both worlds: the support of a mainstream OS (for peripherals such as the iPhone) along with the power of the Unix shell, and it looks good to boot. After running on a MacBook Pro for 2 1/2 years, I’ve been fortunate enough to receive a new 15″ MBP at work. I was happy with the old MBP; the biggest (insurmountable) problem was that I had maxed out at 3GB of RAM. Trying to run IDEA, Firefox, Parallels, and a bunch of other apps in 3GB doesn’t work very well.

As is always the case when getting new hardware, I had the daunting task of moving everything over to the new machine and setting it up to my liking. In an attempt to save myself a week of work, I decided to try out restoring my account and apps via Migration Assistant and Time Machine. I was a bit hesitant as I previously did an upgrade to Leopard from Tiger (instead of a fresh install, as I was then also trying to save time) but I figured that I didn’t have much to lose. If it didn’t work, I could just reinstall OS X on the new Mac and start over.

I’ve been running on the new machine for a few days now, and so far things seem to be working well. I did run into a few glitches that I was able to overcome:

  • Reinstalls of Parallels and the Cisco VPN client were required; I suspect this is because these apps install kernel extensions of some sort.
  • Java 6 was nowhere to be found, which I thought was strange (doesn’t Java 6 ship OOTB with Leopard?) I installed the Java package from the DVD and also ran Software Update; this included a Java update that restored Java 6.
  • The biggest challenge I encountered: when bringing up the screensaver options in the System Preferences app, it would spin up to 100% CPU and require a hard kill from Activity Monitor. I noticed messages in the Console about a deprecated API, so my first attempt to fix the issue was to follow the instructions in this thread. That removed the warnings from the Console, but the lockups continued. After reading this thread I decided to try installing iLife Support 9.0.3, which ended up fixing the problem for good.

So far I have to say that the effort has been worth it. Fixing the glitches above took far less time than reinstalling everything, and all of my preferences and settings migrated over with no problem (saved passwords via keychain, OS and network settings, etc.)

Written by Patrick Peralta

July 5th, 2009 at 6:40 pm

Posted in General

Remake of “Pelham 123”

without comments

My wife and I went to see “The Taking of Pelham 123” today. Since I own the original on DVD I was quite interested to check out this new version, especially after having read this review by Randy Kennedy.

Like most of Mr. Scott’s movies, “Pelham 1 2 3,” which is set to open June 12, is a finely calibrated box-office machine with plenty of fireworks and star power. But for anyone who has spent time past a turnstile, one of its chief attractions will also undoubtedly be how devoted it is to the look and feel of the subway and how well it pulls it off. In early reviews since the trailer was released on the Internet, even the most bloodthirsty critics — New York City subway buffs — have had a few nice things to say.

Like most fellow geeks who like to laugh and point out blatant inaccuracies in film and TV, I was interested to see for myself how well they would portray the system, especially given how well it was done in the 1974 edition. I was actually quite pleased with it; it seems that the MTA provided a lot of background to make many parts of the story quite plausible. The thing that piqued my interest was the escape scene where they ended up in the Waldorf-Astoria. It turns out that there indeed was a platform at Grand Central for this hotel; this sort of thing was not all that uncommon back then. However, the station where they filmed the escape scene was labeled “Roosevelt,” which is nowhere near Manhattan. I’m thinking that it must have been in the abandoned section of the Roosevelt Ave station in Queens.

Another interesting plot point is sending the runaway train to Coney Island instead of South Ferry. Since Coney Island is outdoors, that would make the filming more interesting, but I’m not sure how a 6 could get switched onto BMT tracks to take it to Coney Island. At a minimum it would have to be switched to the 4/5 express tracks to send it over to Brooklyn in the first place. I’m thinking the train would have derailed long before making it there in any event. ;)

The ’70s edition had more interaction between the hijackers and the passengers; there is a lot more character development in that story line. This edition is clearly all about the stars, Travolta and Denzel, which also works well.

The intro for the 2009 version features the high-pitched whine that anyone who’s ever ridden the 6 will recognize; I thought that was a nice touch. However, the original has the best opening credits score of any movie! I’ve embedded it below for your listening pleasure.

Written by Patrick Peralta

June 21st, 2009 at 10:06 pm

Posted in New York

Next Coherence SIG on June 24th

with one comment

The Summer edition of the Oracle Coherence NY SIG is next week. (It feels strange saying “summer edition” as we’ve had highs in the 60s since the start of June.) It promises to be a real treat for users of Coherence*Extend. Timur Fanshteyn has experience using Extend with .NET clients and will share things that he’s done (including a LINQ interface to Coherence.) Jason Howes will cover the internals of Extend, including the internal message API, the TCP/IP infrastructure, the ProxyService, and how it all fits in with the rest of Coherence (distributed caching, remote invocation, etc.) Finally, Noah Arliss will give us an update on the Coherence Incubator. He has been working alongside Brian Oliver for the past few months on new functionality, so he’ll be able to provide a good perspective on the progress being made.

If you’re on Twitter, watch for the #nysig tag. And of course check out (SIG organizer) Craig Blitz’s blog for more detail.

Written by Patrick Peralta

June 17th, 2009 at 8:38 am

Posted in Development

An Introduction to Data Grids for Database Developers

with 3 comments

A little over a month ago I attended Collaborate 09 in Orlando. While having lunch one day I was lucky enough to run into The Oracle Nerd. He provided a good description of our encounter (hint: I’m the “engineer on the Coherence team” he mentions.) I first encountered his blog via this thread, which turned out to be his first exposure to data grids.

Expanding on that, we both agreed that a writeup on data grids for DBAs (or, as he prefers to be called, a [lower case] dba) would be useful. Little did he know what an awful procrastinator I am. :) I’m going to limit the scope of this introduction to applications that use relational databases, as opposed to addressing applications that don’t use any kind of relational database or grid-oriented apps (that will have to be a topic for another time.)

Smarter Caching

An obvious (or maybe not so obvious, depending on who you ask) first step in scaling a database application is to cache as much as you can. This is fairly easy to do if you have a single app server hitting a database. It becomes more interesting, however, as you add more app servers to the mix. For instance:

  • Is it OK if the caches on your app servers are out of sync?
  • What happens if one of the app servers wants to update an item in the cache?
  • How do you minimize the number of database hits to refresh the cache?
  • What if you don’t have enough memory on the app server to cache everything?

This is where a data grid can come in handy. Each of these items is easily addressed by Coherence:

The view of a Coherence cache will always be consistent across all nodes. If an app server updates the cache, all nodes will have instant access to that data. In fact, servers can register to receive notifications when data does change.
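
As a rough sketch of what registering for those notifications looks like in the Coherence Java API (the “orders” cache name and the listener body here are made up for illustration):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.MapEvent;
    import com.tangosol.util.MultiplexingMapListener;

    public class CacheChangeLogger {
        public static void main(String[] args) {
            // "orders" is a hypothetical cache name used for illustration.
            NamedCache orders = CacheFactory.getCache("orders");

            // Any member (or client) that registers a listener is notified
            // when an entry is inserted, updated, or deleted in the grid.
            orders.addMapListener(new MultiplexingMapListener() {
                protected void onMapEvent(MapEvent evt) {
                    System.out.println("key " + evt.getKey()
                            + " changed from " + evt.getOldValue()
                            + " to " + evt.getNewValue());
                }
            });
        }
    }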

Most caching in app servers uses a pattern we call “cache aside,” meaning that it is up to the application to

  • check to see if the data is cached
  • if not, then load it from the database, place it into the cache, and return the item to the caller

A better approach is to use the “read through” pattern, meaning that it is up to the cache to load the data from the database upon a cache miss. The benefits of this approach are

  • application code is much simpler; it assumes that all data can be read through the cache API
  • if multiple threads (in a single JVM or across multiple JVMs) access the same item that is not in the cache, a single thread will read through to the database to load that item

This is a big win for the database; instead of answering the same question repeatedly, the database can answer the question once and all app servers benefit.
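
To make the contrast concrete, here is a rough Java sketch. The Person and PersonDao types and the “people” cache name are hypothetical, and in practice the loader would be registered via the cache configuration rather than in code; the point is that the read-through version leaves all database access to the cache.

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.net.cache.CacheLoader;

    public class PersonCaching {

        // Hypothetical DAO that issues the actual SELECT against the database.
        private final PersonDao dao = new PersonDao();

        // Cache aside: every caller repeats the check / miss / load / put dance.
        public Person getPersonCacheAside(Long id) {
            NamedCache cache = CacheFactory.getCache("people");
            Person p = (Person) cache.get(id);
            if (p == null) {
                p = dao.findById(id);
                cache.put(id, p);
            }
            return p;
        }

        // Read through: the application just calls get(); on a miss the cache
        // invokes the configured CacheLoader, stores the result, and returns it.
        public Person getPersonReadThrough(Long id) {
            NamedCache cache = CacheFactory.getCache("people");
            return (Person) cache.get(id);
        }

        // The CacheLoader that the cache configuration would point at.
        public static class PersonCacheLoader implements CacheLoader {
            private final PersonDao dao = new PersonDao();

            public Object load(Object key) {
                return dao.findById((Long) key);
            }

            public Map loadAll(Collection keys) {
                Map results = new HashMap();
                for (Object key : keys) {
                    results.put(key, load(key));
                }
                return results;
            }
        }
    }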

If expiry is desired for a cache, Coherence can be configured to perform “refresh ahead” on a cache. For example, if a cache is configured to expire after 5 minutes, you can configure a refresh ahead value (say 3 minutes) to determine when the value will be reloaded from the database. If an item is >3 minutes old (but not yet expired) and a thread requests that value, the currently cached value will be returned, and the value will be asynchronously refreshed in the background.

If more storage capacity is required, all you have to do is add servers. Generally there is no extra configuration required.

All in all, it provides a very sophisticated cache that can drastically reduce the number of SELECTs issued against a database.

Scaling Writes

This next data grid feature may be a bit more controversial for database developers. In the previous example, we can assume that the database is still the system of record (a.k.a. the source of Truth.) For situations where we always want the database to hold the Truth, data grids can have caches configured to use the “write through” topology. This means that updates made to the cache will be synchronously written to the database, just like any other database app. However, if you have dozens of app servers with dozens of threads each writing to the database simultaneously, scalability will definitely be a concern. In this case, the cache can be configured to write the updates to the database asynchronously; this is known as the “write behind” topology. Here are some of the objections that I’ve heard (and my response):

  • What happens if I lose a server? Coherence maintains a backup of each entry in the cache, and it keeps track of items that have not been flushed to the database. If a JVM or a machine is lost, the data (and the fact that it still needs to be flushed) will not be lost.
  • Are there any ordering guarantees? What about referential integrity? Let’s say you had a cache for Person and a cache for Address. If Address has a foreign key dependency on Person, then write behind is not a good fit. There is no guarantee that Person will make it out to the database before Address. In this case you’d have to combine the two into an object, and the write behind routine would know to insert Person before Address.

This may change someday, but the reality of write behind today is that the database write should never fail (short of the database itself failing, in which case the item can simply be requeued until the database comes back up.) Since the write happens after the caller has already moved on, there is no easy way to report a failure such as a constraint violation back to the application. Read-only caches can generally be retrofitted into an existing application easily; the same is not always true for write behind.

However, it is a very powerful tool in the data grid toolbox for scaling database applications.
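
For reference, write through and write behind are both typically implemented by plugging a CacheStore into the cache configuration; whether it is invoked synchronously or after a delay is a configuration choice rather than a code change. Below is a minimal sketch, again using a hypothetical PersonDao to issue the actual SQL.

    import com.tangosol.net.cache.AbstractCacheStore;

    // A CacheStore that the cache configuration would point at. With no write
    // delay configured, store() is called synchronously (write through); with
    // a write delay, it is called asynchronously and in batches (write behind).
    public class PersonCacheStore extends AbstractCacheStore {

        // Hypothetical DAO; in a real application this issues the SQL.
        private final PersonDao dao = new PersonDao();

        public Object load(Object key) {
            return dao.findById((Long) key);       // SELECT on a cache miss
        }

        public void store(Object key, Object value) {
            dao.save((Long) key, (Person) value);  // INSERT or UPDATE
        }

        public void erase(Object key) {
            dao.delete((Long) key);                // DELETE on cache removal
        }
    }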

Transient Data That You Don’t Want To Lose

Sometimes an application has transient data that does not need to be stored in a database, but it ends up being stored in the database anyway (because you don’t want to lose it.) A good example of this is HTTP sessions. You don’t want to lose a session if a web server goes down, but the database just seems like overkill for this. The scalability of the application will be limited, not to mention the scripts that will have to be written to clear out expired sessions.

For the specific case of HTTP sessions, Coherence*Web provides a solution to store sessions in Coherence caches. This is an OOTB solution; it will work with just about any J2EE-compliant web application. It also works across many popular web containers, both open source and proprietary (Oracle and non-Oracle.)

Hopefully this is a good broad introduction to data grids for database developers. Comments or follow up questions are welcome.

Written by Patrick Peralta

June 13th, 2009 at 11:16 pm

Two New Coherence Blogs

with one comment

I’m pleased to announce two new Coherence related blogs:

First is the blog of Mark Falco, who is responsible for (among many things) our new C++ client and a bunch of the plumbing behind Coherence (including TCMP.)

Next is Aleksandar Seovic, whose claim to fame includes contributions to Spring .NET and the implementation of POF in .NET.

Be sure to add these to your RSS feed; I’m looking forward to even more great Coherence related content!

Written by Patrick Peralta

May 27th, 2009 at 8:55 am

Posted in Development