Saturday, 28 July 2007

Good frameworks

    You know a framework is good when you can't see its use in your code.
    I just finished a little data transfer utility using Apache Camel for the transport mechanism, Spring for the plumbing and then Terracotta to cluster the work on the server.
    I am using StAX on the client side to parse a large XML document and send it item by item over to the server. On the server I then use the Master/Worker pattern to distribute the received items. Terracotta transparently distributes the work items for me onto several JVMs.
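    A minimal sketch of the item-by-item parsing idea, using the standard javax.xml.stream API. The element name "item", the class name ItemStreamer and the Consumer callback standing in for the transport are all assumptions for illustration; the post does not show the real document format or the sending code.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: "item" and the sink callback are illustrative assumptions.
class ItemStreamer {

    // Pulls parser events one at a time and hands each <item> to the sink as
    // soon as it has been read, so the whole document is never held in memory.
    static int streamItems(Reader xml, Consumer<String> sink) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag
                    // (only valid for text-only elements).
                    sink.accept(reader.getElementText());
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<items><item>a</item><item>b</item><item>c</item></items>";
        List<String> sent = new ArrayList<>();
        int n = streamItems(new StringReader(xml), sent::add);
        System.out.println(n + " items: " + sent);
    }
}
```

    In a real client the sink would hand each item to the transport instead of collecting it in a list.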
    When an item is queued, the server sends feedback to the client so the client can throttle the upload, making sure neither client nor server ever has too much data to keep in memory.
    I was very impressed with Terracotta. Once I had the framework ready, it took me only about an hour to put the Terracotta configuration in place to cluster the JVMs. The only reason it took that long was that it was my first time using it. There is also a Terracotta Eclipse plugin which allows you to run the distributed app within the IDE. It even tells you about issues and offers to correct the configuration for you.
    Back to the starting comment: the nicest thing is that I did not have to code to Spring, Camel or Terracotta, yet all of these frameworks fitted in with my application!
    For more info on Terracotta and the Master/Worker pattern read: How to Build a POJO-based Data Grid using Open Terracotta
    I will finish the post with a quote from Paul Arden I found on Jonas's blog which fits well into the main theme of my blog:
    "If you can't solve a problem, it's because you are playing by the rules."


    Joao Cerdeira said...


    Can you tell more about your application?

    I'm working on clustering a solution close to yours, clustering ActiveMQ. What is the transport layer that you use in Camel?

    Very nice Post :)

    João Cerdeira

    Ingo said...

    João, thanks for your nice comment!
    I currently use TCP as the transport layer. However, I placed Camel into the mix to be able to change this easily. BTW, Camel is a subproject of ActiveMQ and works very nicely with it (it is probably its preferred mechanism!)
    Regarding the application, there are currently a couple of applications I'm involved in for which this approach will fit.
    This load mechanism could be integrated into the main application but is designed so that it can be deployed side by side to allow different uptimes and maintenance cycles.
    Basically we have an external application creating a set of data to be imported (or integrated, as there's some more logic involved than purely saving it) into our application. They will then call an upload application sitting on their side of the fence, written by us, to send the data to our application.
    The bottleneck will be the actual business logic and persistence of the messages.
    As a result I implemented a single-threaded sender and receiver which place the data into a queue, from which the multi-threaded workers pick the messages up and digest them.
    I based the master/worker implementation on Jonas Boner's Datagrid example. However, I changed several classes and implemented a 'throttled' version so that no layer ever holds too many messages at once.
    On the server I changed the work queue to use a bounded queue. I also changed the workers to use the ExecutorCompletionService so I can listen for messages being completed and throttle the input placed into the thread pool accordingly.
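    One way the completion-driven throttling could look, as a sketch only. ThrottledMaster, runThrottled and the maxInFlight parameter are hypothetical names, and the tasks come from a plain list here rather than from the bounded work queue; only ExecutorCompletionService itself is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not the code from the application discussed here.
class ThrottledMaster {

    // Runs all tasks on a fixed pool but never has more than maxInFlight
    // submitted and uncollected at once; each completion frees one slot.
    static <T> List<T> runThrottled(List<Callable<T>> tasks, int threads, int maxInFlight)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<T> completion = new ExecutorCompletionService<>(pool);
        List<T> results = new ArrayList<>();
        int inFlight = 0;
        try {
            for (Callable<T> task : tasks) {
                if (inFlight == maxInFlight) {
                    // Block until some task finishes before submitting another.
                    results.add(completion.take().get());
                    inFlight--;
                }
                completion.submit(task);
                inFlight++;
            }
            // Collect whatever is still running.
            while (inFlight > 0) {
                results.add(completion.take().get());
                inFlight--;
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

    Results come back in completion order, not submission order. For the queue itself, java.util.concurrent.ArrayBlockingQueue is one standard bounded implementation whose put() blocks while the queue is full.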
    To throttle the sender, I return confirmation messages from the receiver whenever a message is placed into the queue. This way the sender knows when to parse and send the next piece of data to keep the work queue on the server full.
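    The confirmation-based throttling can be pictured as a window of unacknowledged items. The sketch below uses a java.util.concurrent.Semaphore for that window; the class AckWindow, its method names and the window size are assumptions for illustration, and the actual wire protocol is not shown here.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: names and window size are illustrative assumptions.
class AckWindow {
    private final Semaphore permits;

    AckWindow(int maxUnacknowledged) {
        this.permits = new Semaphore(maxUnacknowledged);
    }

    // Called by the sender before it parses and sends the next item;
    // blocks while the maximum number of items is still unconfirmed.
    void beforeSend() throws InterruptedException {
        permits.acquire();
    }

    // Non-blocking variant: returns false if the window is currently full.
    boolean tryBeforeSend() {
        return permits.tryAcquire();
    }

    // Called when a confirmation arrives saying an item was queued.
    void onAck() {
        permits.release();
    }

    // Number of items that may still be sent without waiting.
    int freeSlots() {
        return permits.availablePermits();
    }
}
```

    With a window of, say, 2, the sender can have two items on their way; a third call to beforeSend() waits until onAck() has been called once.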
    Terracotta will cluster the application from the work queue onwards. The 'worker' applications I start up have no receivers of their own. All they do is pick pieces up from the queue and digest them. As the logic is implemented inside the work classes, the workers are generic enough to pick up and process work items from different use cases, e.g. you could cluster other areas of your application without having to change the workers.
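    A single-JVM sketch of what "generic" workers could look like: the worker only knows a small Work interface, and each use case supplies its own implementation. Work, GenericWorker and the STOP marker are hypothetical names, and no Terracotta API is involved; in the clustered setup described above the queue would be the shared structure.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical interface: each use case puts its own logic behind execute().
interface Work {
    void execute();
}

// Hypothetical worker: it knows nothing about what a work item does.
class GenericWorker implements Runnable {
    // Marker item that tells a worker to stop (compared by identity,
    // so this shutdown trick is only meant for this single-JVM sketch).
    static final Work STOP = () -> { };

    private final BlockingQueue<Work> queue;

    GenericWorker(BlockingQueue<Work> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            // take() blocks until an item is available.
            for (Work work = queue.take(); work != STOP; work = queue.take()) {
                work.execute();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and exit
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Work> queue = new LinkedBlockingQueue<>();
        queue.add(() -> System.out.println("importing a record"));
        queue.add(() -> System.out.println("some other use case"));
        queue.add(STOP);
        new GenericWorker(queue).run();
    }
}
```

    Adding a new use case then means adding a new Work implementation; the worker class stays untouched.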

    dan said...

    Hi Ingo, it looks like your link to Apache Camel doesn't work and the link to Terracotta points to the Spring website.

    However, I couldn't agree more about the value of frameworks that don't get in the way. It's a juggling act keeping focused on the task at hand while adding frameworks to the code-base in hope of a substantial pay-off. It's great that you managed to stitch three of them together that play nice.

    Ingo said...

    Hi Dan
    Thanks for your comment. The links should be working now.
