Saturday 28 July 2007

Good frameworks

    You know a framework is good when you can't see its use in your code.
    I just finished a little data transfer utility using Apache Camel for the transport mechanism, Spring for the plumbing and then Terracotta to cluster the work on the server.
    I am using StAX on the client side to parse a large XML document and send it item by item over to the server. On the server I then use the Master/Worker pattern to distribute the received items. Terracotta transparently distributes the work items for me onto several JVMs.
    When an item is queued the server sends feedback to the client, so the client can throttle the item upload to make sure neither client nor server ever has too much data to keep in memory.
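The throttling idea can be sketched in a single JVM with plain java.util.concurrent. This is only an illustration under assumptions, not the utility described above: the class name ThrottledMasterWorker and its run method are hypothetical, no Camel or Terracotta API is used, and a bounded queue stands in for the server-to-client feedback. The master blocks whenever the queue is full, so neither side ever holds more than `capacity` pending items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical single-JVM sketch of Master/Worker with back-pressure.
class ThrottledMasterWorker {
    // Unique sentinel, compared by identity so no real item can match it.
    private static final String STOP = new String("STOP");

    /** Queues every item, lets the workers drain the queue, returns how many items were processed. */
    static int run(List<String> items, int workers, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger processed = new AtomicInteger();
        List<Thread> pool = new ArrayList<>();

        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String item = queue.take();      // waits until work is available
                        if (item == STOP) {
                            return;                      // sentinel: this worker is done
                        }
                        processed.incrementAndGet();     // stand-in for the real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            pool.add(worker);
        }

        try {
            for (String item : items) {
                queue.put(item);                         // blocks while the queue is full: the throttle
            }
            for (int i = 0; i < workers; i++) {
                queue.put(STOP);                         // one sentinel per worker
            }
            for (Thread worker : pool) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while distributing work", e);
        }
        return processed.get();
    }
}
```

Calling ThrottledMasterWorker.run(items, 2, 2) processes every item with two workers while never queuing more than two items at once. In the clustered setup the queue would be the shared structure, but how that is configured is outside this sketch.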
    I was very impressed with Terracotta. Once I had the framework ready it took me only about an hour to put the Terracotta configuration in to cluster the JVMs. The only reason it took so long was that it was my first time using it. There is also a Terracotta Eclipse plugin which allows you to run the distributed app within the IDE. It even tells you about issues and offers to correct the configuration for you.
    Back to the starting comment: the nicest thing is that I did not have to code to any of Spring, Camel or Terracotta, yet all of these frameworks fitted in with my application!
    For more info on Terracotta and the Master/Worker pattern read: How to Build a POJO-based Data Grid using Open Terracotta
    I will finish the post with a quote from Paul Arden I found on Jonas's blog which fits well into the main theme of my blog:
    "If you can't solve a problem, it's because you are playing by the rules."

    Saturday 21 July 2007

    Lego, Reusability and components

      One of the biggest holy grails of software development is the attempt to create reusable components/applications.
      The open source industry has done very well achieving this with infrastructural frameworks such as Spring and Hibernate.
      However, there are not many success stories when we look at business applications, and most companies end up with a series of branches of their software for different clients.
      These are normally created by taking the implementation of an application that is closest to the new customer's requirements and branching it off.
      After not very long the two branches will be too different from each other to even attempt to maintain them as one product. Any good intentions the company had at the start of the project to merge the two branches back together after the new client went live are generally not only quickly forgotten but never mentioned again.
      The next time the company rewrites this business application or creates a brand new product, one of the first requirements to surface will be to make this particular application reusable (just as was intended last time around).
      The general approach to this is that a set of business analysts will try to analyse every requirement that their existing or future clients have or are likely to have, and to create a design that covers all of them.
      If you are lucky this process might even entail looking at the existing applications and trying to find out what went wrong. This is however the exception rather than the norm, as such an exercise risks embarrassing the people who were in charge of the last failed attempt and who are likely to be the very people who would now be in charge of instigating such an exercise ...
      So far so good and depending on the experience of the analysts we might end up with a decent design. It's from here on that things will start to go downhill:
      First of all, timelines will get tight. As the pressure to deliver the first implementation to the first customer grows, the tendency to make compromises (obviously only temporary compromises ...) grows as well. These compromises will be twofold:
      1. Functionality that is not initially required or was just included to please potential future customers will be 'postponed'.
      2. Mechanisms intended to make future enhancements easier will be replaced by 'temporary' shortcuts.
      All going reasonably well we will now go live with our first customer. At the same time the sales team will have, hopefully(!?), identified the next clients for the application.
      The new customer's requirements will not only include some of the functionality we dropped but most likely will also entail new functionality no one had thought of in the initial analysis phase.
      To make things worse the initial customer, having gone live now, will discover all the bugs which were missed during the acceptance phase, and they will discover another set of required functionality that crystallised as they started to use the application in earnest.
      These bugs and enhancements will be very urgent and the team is likely to chase its own tail to deliver them. One thing the team will most likely not want to do is risk whatever stability there is by introducing the set of requirements for the new client at the same time. So we branch ...

      So what are the lessons?
      • We will never analyse all future requirements upfront
      • We most likely will only implement the functionality really required
      • We are unlikely to risk a live implementation to add functionality required by another customer
      Now let's look at the most successful component set available: Lego
      One of the reasons it is as successful as it is, is that the individual building blocks are very small and generic. You do not have a single building block that resembles a whole house or car but many small ones with which you can build either or something totally different.
      This does not mean however that you can't reuse the blueprints of the houses you have built. You can now rapidly rebuild any of these if someone else wants the same house (the themed Lego kit). But even when you do this you are able to change whichever detail you want for this new house. Most of these kits have instructions for several designs included.
      The other feature making Lego as reusable as it is comes from its composition model.
      You will not need any glue or cement to build your structures. At the same time the resulting structure will look as if it were one piece rather than many. You can however take any of these structures and combine them seamlessly again.
      Lastly, due to the lack of the glue you are also able to remove some of the blocks again at a later stage and replace them with some others or reuse them for a different construct.
      I know we are not in the business of supplying toys.
      The lesson to learn however is that as long as we try to create large reusable blocks which will fit all we will fail.
      And if we do not create a mechanism for our blocks to seamlessly integrate whilst allowing for these integrations to be changeable we will also fail.
      I will try to detail in my coming posts some of the mechanisms which we can use to build our applications like Lego constructs, i.e. how we can use small generic building blocks, combine them seamlessly and create large applications out of them.

      Tuesday 17 July 2007

      Flame wars and NLP


        I'm constantly amazed at the flame wars fought in the forums:
        REST vs SOAP
        XML vs JSON
        Java vs .NET
        ...
        Richard Bandler and John Grinder's books on NLP have a very refreshing approach to such subjects in respect to communications:
        1) Communication is not about the content of what you say but about the response you elicit.
        2) If something doesn't work - do something else.

        I think these principles are the same for programming:
        1) It's not about using the correct approach, it's about getting the correct result.
        2) If a framework or technology does not solve your problem - choose another one.

        For most of the above flame wars I can come up with several use cases for either option. This means a decision on a 'correct' approach can only be made in the context of a particular use-case.

        Monday 16 July 2007

        Checked or Unchecked Exceptions - a compromise?

          The general thought process in the above question is around the ability of the calling code to handle the reason for the exception (Update: Maybe not so general after all, see: internal-and-external-exceptions).
          If we can, the exception should be checked, otherwise not.
          This still leaves the bubbling issue. Let's say I catch a StaleObject exception in my data access code. I can deal with this - I tell the user someone updated the item and ask him to amend the latest version instead.
          However, the likelihood is that there are a lot of function calls, and hence catch blocks, between the code offering the solution to the user and the data access code discovering the problem.
          I therefore introduced the ExpectedRuntimeException into my application. This allows me to ignore this kind of exception up to the service level. There I can easily deal with it in a different way to any other RuntimeException.
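A minimal sketch of how this could look follows. Only the name ExpectedRuntimeException comes from the post; StaleObjectException, LoanDao, LoanWorkflow, LoanService and the version check are hypothetical stand-ins. The point is that the intermediate layer needs no catch block or throws clause, while the service level handles the expected case and lets every other RuntimeException propagate.

```java
// Marker base class for failures the application knows how to recover from.
class ExpectedRuntimeException extends RuntimeException {
    ExpectedRuntimeException(String message) {
        super(message);
    }
}

// Hypothetical expected failure raised by the data access code.
class StaleObjectException extends ExpectedRuntimeException {
    StaleObjectException(String message) {
        super(message);
    }
}

// Hypothetical data access layer: detects the problem.
class LoanDao {
    void update(long id, int versionSeenByUser, int versionInDatabase) {
        if (versionSeenByUser != versionInDatabase) {
            throw new StaleObjectException("loan " + id + " was changed by someone else");
        }
    }
}

// Hypothetical intermediate layer: no try/catch and no throws clause needed.
class LoanWorkflow {
    private final LoanDao dao = new LoanDao();

    void amend(long id, int versionSeenByUser, int versionInDatabase) {
        dao.update(id, versionSeenByUser, versionInDatabase);
    }
}

// Service level: the one place where expected failures are turned into user feedback.
class LoanService {
    private final LoanWorkflow workflow = new LoanWorkflow();

    String save(long id, int versionSeenByUser, int versionInDatabase) {
        try {
            workflow.amend(id, versionSeenByUser, versionInDatabase);
            return "saved";
        } catch (ExpectedRuntimeException e) {
            // Recoverable: ask the user to amend the latest version instead.
            return "Please reload: " + e.getMessage();
        }
        // Any other RuntimeException is not caught here and keeps bubbling up.
    }
}
```

With this shape, new LoanService().save(1L, 3, 3) returns "saved", while a version mismatch returns a message starting with "Please reload".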

          Saturday 14 July 2007

          POO or reasons for an Anemic Domain Model


            For my first post I want to question one of the most prominent Golden Hammers and put an anti pattern back into my toolbox.
            Object Orientated Design has long been accepted as the 'correct' approach to software development.
            There are very different approaches available and especially functional programming has been getting some prominence recently.
            For this post however I won't go so far as to question the whole approach but will attempt to put one of the 'anti-patterns', the Anemic Domain Model, back into the toolbox as one of the right tools for the right job.
            The term Anemic Domain Model was coined and made an anti pattern by Martin Fowler in 2003 (AnemicDomainModel).
            As I don't really want to talk about an anti pattern and as any modern approach should be represented by a three letter acronym I will use a far more positive title for it here and call it POO (Process Orientated Objects).
            The idea here is that the domain objects are demoted to pure state-holders which have no behaviour themselves. Behaviour and business rules are extracted into processes. I call these processes rather than services as the term service implies a coarse grained and exposed structure. A service might therefore utilise a series of processes, which themselves might utilise child processes.
            The processes hence become the puppet players and the domain objects are the puppets.
            Let's see why I would want to do so.
            My role in my company is to create and maintain software products for our customers.
            As customers do, our customers would like to have their cake and eat it. They expect to receive established products. At the same time they want these products to follow their specific processes and tie in with their existing infrastructure and legacy applications.
            My employer is not very different. The company wants to have a single maintenance and hence source branch for a product whilst not having to say no to any specific wishes of a client.
            This would all still be fine if we were talking about simple extensions to existing functionality or the odd custom field.
            However, we are writing financial software for large international finance houses. Although from a high level these all want the same functionality (let's say a loan administration system), once you zoom into their precise requirements, no two companies want the same business rules.
            Worse, even the same company is likely to have disparate rules for the individual sectors they are working in.
            Let's look at a simple loan contract. Depending on client and sector the differences can be described in two main directions:
            • The details captured
              If it's a mortgage I want to record the property details, for a car loan the details of the vehicle and for a personal loan I have no collateral at all. Depending on the sector I might or might not have to care about VAT (Sales tax).
            • The functionality
              Interest might be calculated daily, monthly or yearly. I might need compound interest and I might or might not charge interest on outstanding fees.
            OO gives us a method to deal with these variances: Inheritance.
            However, inheritance is one directional (at least in the world I live in, i.e. Java). If I have behaviour and state encapsulated in the same object then I can only extend both at the same time.
            As a result I quickly start to branch and reimplement similar functionality twice:
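The branching can be illustrated with a small sketch. All class names here are hypothetical and the interest formula is a simplified placeholder: once the hierarchy has been extended along the state axis (mortgage vs car loan), the same behaviour has to be written once per branch.

```java
// State and behaviour live together, so both must be extended together.
abstract class RichLoan {
    double principal;
    double annualRate;

    abstract double interestFor(int days);
}

// First axis of variation: the details captured.
abstract class RichMortgage extends RichLoan {
    String propertyAddress;
}

abstract class RichCarLoan extends RichLoan {
    String vehicleRegistration;
}

// Second axis of variation: the functionality. It has to be added per branch.
class DailyInterestMortgage extends RichMortgage {
    double interestFor(int days) {
        return principal * annualRate * days / 365.0;
    }
}

class DailyInterestCarLoan extends RichCarLoan {
    // Same formula again: it cannot be inherited from DailyInterestMortgage.
    double interestFor(int days) {
        return principal * annualRate * days / 365.0;
    }
}
```

Adding a monthly compound variant would need two more classes, one per kind of loan, and so on for every further rule.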


            If I split behaviour out of the domain objects on the other hand I can have these two concerns evolve separately:

            The usefulness of this approach increases as we add more functionality. Each of these functions will live in their own process hierarchy which allows any combination of methods to be applied without the need to branch.
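As a rough sketch of the split, assuming hypothetical names (LoanContract, InterestProcess and the two implementations) and simplified formulas: the domain objects only hold state, and the interest rules form their own hierarchy that can be applied to any kind of contract.

```java
// Pure state holders: the domain objects carry data and nothing else.
class LoanContract {
    double principal;
    double annualRate;
}

class Mortgage extends LoanContract {
    String propertyAddress;
}

class CarLoan extends LoanContract {
    String vehicleRegistration;
}

// Behaviour lives in its own hierarchy and works on any LoanContract.
interface InterestProcess {
    double interestFor(LoanContract loan, int days);
}

class DailySimpleInterest implements InterestProcess {
    public double interestFor(LoanContract loan, int days) {
        return loan.principal * loan.annualRate * days / 365.0;
    }
}

class MonthlyCompoundInterest implements InterestProcess {
    public double interestFor(LoanContract loan, int days) {
        int months = days / 30; // simplification: whole 30-day months only
        return loan.principal * (Math.pow(1 + loan.annualRate / 12.0, months) - 1);
    }
}
```

Either process can now be combined with either contract type without a new subclass: a Mortgage can be passed to DailySimpleInterest for one client and to MonthlyCompoundInterest for another.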

            I would like to finish off by dealing with the reasons why the POO approach is poo poo'ed. Looking at Martin Fowler's original blog I can however see only one concrete critique that does not just state that it isn't OO as it was intended:
            "In essence the problem with anemic domain models is that they incur all of the costs of a domain model, without yielding any of the benefits. The primary cost is the awkwardness of mapping to a database, which typically results in a whole layer of O/R mapping."
            This might have been a real issue in 2003. Transparent persistence and decent ORM tools are the norm now with technologies such as Hibernate and JPA.
            I actually use AndroMDA to create my domain objects and mappings for me from my UML diagrams which makes the whole process very easy and quick.
            So I shall reformulate this to:
            In essence the benefit of POO is that I get a lot of the benefits of the OO approach, the benefits of ORM tools and transparent persistence, whilst keeping the concerns of state and behaviour logically separated.

            Final Disclaimer:
            If anyone reads this blog as an argument that POO is the correct approach and that real OO is not, they have misunderstood me. Both are valid approaches and the skill is to decide which approach fits a particular requirement.
            I use POO to create myself a construction kit from which to build specific implementations. If you do not have such a desire/requirement, proper OO will be the neater and better approach.




            Hi Blog

            Considering the time I spend on a regular basis reading through other people's blogs (mostly IT - I am that sad) I decided it's time to join this sphere.
            The name of this blog comes from my deep distrust of 'Golden Hammer' approaches.
            These golden hammer approaches are not an invention of the IT industry but are a very common pest in it!
            The 'Golden Toolbox' is therefore a collection of several potent approaches where we can select the best fit for a particular problem.