Friday, 4 September 2015

Graph database : Plot yourself to supremacy

I would like to start this blog by wishing ‘Happy Janmashtami’ to all the readers (if there are any). I hoped this blog would be Part Deux of what we discussed in the previous post. But, in the meantime, I got the chance to try my hand at something new and exciting and hence thought of sharing it. Don’t worry, we will do part deux some other time.

This blog briefly discusses a new database, called ‘Graph Database’. Before we dig into it, let’s revisit the history of databases.

It all started with storing data on punched cards and in flat files, which offered several access methods, each with its limitations. We then gradually moved to IBM mainframes (and their Information Management System). Sometime in the 1970s, Larry Ellison read a paper published by IBM on a database programming language, and that led to a revolutionary kind of database management system, the RDBMS. Fast forward to the 2000s: people started experiencing difficulties in maintaining large sets of data as well as in designing schemas to support modern web application structures, resulting in new database technologies like MongoDB and graph databases.

Have you ever come across any requirement where you had to store hierarchical structures (e.g. application menus, organization structures or parent-child relationships)? I bet most of you have. In such situations, (because our brains are trained to think like SQL schemas only), we generally go by either of these approaches (let’s say we want to persist a tree structure with parent-child relationship):
  • Create a self-referencing foreign key on the same table, called parent id, and fetch the records by parent id.
  • Create a one-to-many relationship between 2 tables, with the second table holding the mappings of parent id to child ids.
To be fair, either of these would work fine. Oracle even provides hierarchical query support to retrieve the whole structure in a single query, and I am sure we can write loops/queries for other databases as well.
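To make the first approach concrete, here is a minimal in-memory sketch of what “fetch the records by parent id” boils down to. It uses no database at all; the Row class, its fields and the method names are made up for illustration, and it assumes the rows form a proper tree (no cycles).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for rows of a table with a self-referencing parent id.
class ParentIdTree {

    // One row: id, parentId (null for a root) and a display name. All names are illustrative.
    static final class Row {
        final int id;
        final Integer parentId;
        final String name;

        Row(int id, Integer parentId, String name) {
            this.id = id;
            this.parentId = parentId;
            this.name = name;
        }
    }

    // Groups rows by their parent id so that children can be looked up directly.
    static Map<Integer, List<Row>> indexByParent(List<Row> rows) {
        Map<Integer, List<Row>> childrenOf = new HashMap<>();
        for (Row row : rows) {
            if (row.parentId != null) {
                childrenOf.computeIfAbsent(row.parentId, k -> new ArrayList<>()).add(row);
            }
        }
        return childrenOf;
    }

    // Collects the names of every descendant of rootId, depth first.
    static List<String> subTree(int rootId, Map<Integer, List<Row>> childrenOf) {
        List<String> names = new ArrayList<>();
        for (Row child : childrenOf.getOrDefault(rootId, new ArrayList<>())) {
            names.add(child.name);
            names.addAll(subTree(child.id, childrenOf));
        }
        return names;
    }
}
```

Against a real table, each level of this recursion would typically be one more query (or one hierarchical query), which is exactly the bookkeeping a graph database takes off your hands.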

However, how amazing would it be if you didn’t need to worry about any of this at all! You would be like, “Here’s my list/set/map of objects, each having another set of children, and another set of grandchildren and so on... Go and persist the whole list.” ... “Done? Now, fetch me the sub-tree of child-2 of parent-3.” And boom! Your sub-tree is ready. It’s cool, isn’t it?

A graph database does exactly that. It first persists the graph and then lets you query by node name, relation type, distance from root and whatever else you can think of. There are plenty of graph database projects (here is the full list), each having its own set of features. Among these, I have chosen neo4j to discuss as it is commercial, easy to implement and has Spring Data support. Let’s discuss an example of the same.

I have used organisational structure as an example. Let’s say an organisation has a structure like this: Director -> Manager -> Leader -> Member. The structure has 4 hierarchies (all being one to many i.e. a director has more than one manager under him, a manager has more than one leader under him and so on). Have a look at the model classes (code is uploaded here). Each has a list of objects called teamMembers. You will see some annotations on class members; we are not going to discuss those (documentation here) as most of them are self explanatory.

I have put 2 Unit tests under test folder, one to populate the graph and other to retrieve the data. Have a look at ‘PopulateEmployeeStructureIntegrationTest.java’, it creates the structure and calls save with director object. The point to note here is, calling save on director object alone creates all sub nodes and builds the graph automatically (magic!). This creates 2 directors; having 2 managers, 2 leaders and 2 members each.
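The reason a single save on the director can be enough is that every other node is reachable from it through teamMembers. Below is a plain-Java sketch of that reachability, with no neo4j or Spring involved. The Employee class and both methods are hypothetical stand-ins (they are not the model classes from the uploaded code), and the sketch assumes two reports per person at every level.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch, no neo4j or Spring involved. Employee is a hypothetical
// stand-in for the model classes; only the teamMembers idea comes from the post.
class OrgChart {

    static final class Employee {
        final String name;
        final List<Employee> teamMembers = new ArrayList<>();

        Employee(String name) {
            this.name = name;
        }
    }

    // Builds a node with two reports per level until levelsBelow reaches zero.
    static Employee build(String name, int levelsBelow) {
        Employee employee = new Employee(name);
        if (levelsBelow > 0) {
            for (int i = 1; i <= 2; i++) {
                employee.teamMembers.add(build(name + "." + i, levelsBelow - 1));
            }
        }
        return employee;
    }

    // Counts the root plus everything reachable through teamMembers.
    static int countReachable(Employee root) {
        int count = 1;
        for (Employee member : root.teamMembers) {
            count += countReachable(member);
        }
        return count;
    }
}
```

With three levels below the director (manager, leader, member), OrgChart.countReachable(OrgChart.build("director", 3)) walks 1 + 2 + 4 + 8 = 15 nodes, all starting from the one root object.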

Now, it’s time to query the data. Have a look at ‘EmployeeStructureQueryTest.java’; it calls findByName to find an employee (findByXxx calls are a feature of Spring Data; if you are not aware of Spring Data then read this). We are passing the director’s name as the employee name and voila, it brings up the whole graph! Imagine achieving the same with a series of SQL queries and setting references from the results. Nightmare.

Before running this example (or, to make this example run successfully), go through below steps:
  • Install and run neo4j (download community version from here).
  • Add ‘neo4j-cypher-dsl-1.8.jar’ (get it from here) to your local maven repository. It is not present in mvn central and hence, may create a problem.
Once neo4j is installed and the population script is run, the neo4j UI will show a tabular structure as well as a semi-interactive graphical representation of all the nodes. We can even query the graph using neo4j’s Cypher query language (examples here).

That’s it for today. For those wondering about commercial usage of graph database, here is the bonus reading. Hope you enjoy playing around with graph database as much as I did. If something doesn’t work, let me know in comments and I will get back to you. Till then..

Friday, 14 August 2015

Coding techniques : Part Un

I recently read that “A good programmer can be 10X more productive than a mediocre one”. I don’t really know how true it is; neither do I know how they arrived at the 10X multiplier. Anyway, we are not here to discuss who is a good programmer and who isn’t. We are here to discuss some tips and tricks on java programming which can be helpful for many of us in our daily life (finally, I am doing justice to the blog title ‘Programming Paradigms’).
I don’t really know why I wrote the above paragraph, guess I just wanted to write something for introduction. Anyway, let’s discuss something which we are here to discuss.


Following are the examples of how a minor precaution/carelessness can save/kill you. I bet we see such code (mentioned in the below examples) at least once a day. We generally ignore it because a) it is not something which I wrote or b) the system works fine, why should I change it. My advice here would be not to ignore such instances as fixing those would take just 5 minutes compared to 5 hours when it is raised as a production bug, comes back to bite you in the arse and you have to deliver a hotfix on Saturday morning. Let’s start with the examples of do’s and don’ts now.

  1. Avoid repetitive logging of same exception:
    I am sure you must have come across code like this:

    public void method1(){
         try{
             method2();
         }catch(Exception e){
             //Handle exception
         }
    }

    public void method2() throws Exception{
         try{
             method3();
         }catch(Exception e){
             log.error("Error in method 2", e);
             throw e;
         }
    }

    public void method3() throws Exception{
          try{
              //Some code
          }catch(Exception e){
              log.error("Error in method 3", e);
              throw e;
          }
    }

    Although this code may work just fine, it’s not something you will like. E.g. if any exception is thrown from method3, it is logged twice before reaching the actual handler code (which may log it one more time to complete the hat trick), filling the log files with unnecessary stack traces and making debugging difficult.

    Best practice here would be to remove try-catch blocks from method2 and method3 (they can have try-finally pair if they are acquiring/releasing resources) and place logging and exception handling in method1 only.
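As a rough sketch of that advice, the version below logs exactly once, in method1. The static list is only a stand-in for a real logger (an assumption made here so the number of log entries is easy to inspect), and method3 throws unconditionally just to simulate a failure.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "log once, at the handler" idea. The List stands in for a real
// logger so the number of log entries is easy to inspect; it is an assumption.
class LogOnce {

    static final List<String> LOG = new ArrayList<>();

    static void method1() {
        try {
            method2();
        } catch (Exception e) {
            // The only place where the exception is logged and handled.
            LOG.add("Error in method 1: " + e.getMessage());
        }
    }

    // No try-catch here: the exception simply propagates to the caller.
    static void method2() throws Exception {
        method3();
    }

    // Simulated failure; again no try-catch and no logging at this level.
    static void method3() throws Exception {
        throw new Exception("something failed in method 3");
    }
}
```

After one call to LogOnce.method1(), the list holds a single entry, and with a real logger the stack trace would still point at method3 as the origin.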

  2. Don’t swallow exceptions in catch:
    This scenario is a variant of what we just discussed. Have a look at the below code:

    public void method2(){
         try{
             String result = method3();
             //do further processing
         }catch(Exception e){
             //Handler code
         }
    }

    public String method3(){
         String result = null;
         try{
             //Some code
         }catch(Exception e){
             log.error("Error in method 3", e);
         }
         return result;
    }

    Here, method3() catches the exception and just logs it. method2() has no idea of what happened in method3() and will continue its normal flow, possibly running into a NullPointerException of its own when it uses the null result.

    Again, best practice here for method3() would be to avoid dealing with exception and let it propagate to the caller.
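A minimal sketch of the propagating version could look like this. The input parameter and the IllegalArgumentException are assumptions added here to stand in for whatever “Some code” does and throws.

```java
// Sketch of example 2 with method3() letting the failure propagate.
// The input parameter and IllegalArgumentException are illustrative assumptions.
class PropagateException {

    static String method2(String input) {
        try {
            String result = method3(input);
            // Reached only when method3() succeeded, so result is never null here.
            return result.trim();
        } catch (IllegalArgumentException e) {
            // Handler code: the caller now knows that method3() failed, and why.
            return "handled: " + e.getMessage();
        }
    }

    static String method3(String input) {
        // No catch block: nothing is swallowed and nothing comes back as null.
        if (input.isEmpty()) {
            throw new IllegalArgumentException("empty input");
        }
        return input;
    }
}
```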

  3. For String equality check (i.e. using equals() and equalsIgnoreCase()), always keep the constant value on the left hand side:
    Consider the below scenario:

    public void process(){
         String success = "Success";
         String failure = "failure";
         String status = getStatus();
         if(success.equalsIgnoreCase(status)){
               //...
         }else if(status.equalsIgnoreCase(failure)){//Don't do this
               //...
         }
    }

    Both the comparisons would work perfectly fine as long as getStatus() returns a non-null status. However, we shouldn’t rely on it returning a non-null status every time (what if it is designed like method3() of our example 2? You never know!). As both a.equals(null) and a.equalsIgnoreCase(null) return false for every non-null String a, while null.equals(...) throws a NullPointerException, it is always best to keep constants on the left hand side.
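The difference is easy to see side by side. In this sketch getStatus() is a hypothetical stub that returns null to simulate a misbehaving callee; the method names are made up for illustration.

```java
// Sketch of the "constant on the left" rule; getStatus() is a hypothetical
// stub that returns null to simulate a misbehaving callee.
class StatusCheck {

    static final String SUCCESS = "Success";

    static String getStatus() {
        return null;
    }

    static boolean isSuccess(String status) {
        // Constant on the left: a null status simply yields false.
        return SUCCESS.equalsIgnoreCase(status);
    }

    static boolean isSuccessUnsafe(String status) {
        // Variable on the left: throws NullPointerException when status is null.
        return status.equalsIgnoreCase(SUCCESS);
    }
}
```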

  4. Don’t return null collection from any method:
    It is always best to return an empty collection rather than null while writing any method which returns a collection. While developing a method, we don’t know who else will be using that method, they may be calling Collection.size() on returned object. Consider the example below:

    public List<String> getNames(){
         List<String> names = new ArrayList<String>();
         try{
              //...
         }catch(Exception e){
              names = null;//Don't do this
         }
         return names;
    }

    public void processNames(){
         List<String> names = getNames();
         log.debug("Found " + names.size() + " names to process");//Boom!
    }

    As we can see, processNames() always assumes that the list will never be null. Instead of forcing null checks for all the caller methods (after they see NullPointerException in production logs), it is better to return an empty list beforehand.
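Here is a small sketch of the empty-list variant. It mirrors the shape of getNames() above; the simulateFailure flag and the sample names are assumptions added only so that both paths can be exercised.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of getNames() returning an empty list instead of null on failure.
// The simulateFailure flag and the sample names are illustrative assumptions.
class NameLookup {

    static List<String> getNames(boolean simulateFailure) {
        List<String> names = new ArrayList<>();
        try {
            if (simulateFailure) {
                throw new IllegalStateException("lookup failed");
            }
            names.add("Alice");
            names.add("Bob");
        } catch (IllegalStateException e) {
            // An empty, immutable list: callers can still call size() or iterate.
            return Collections.emptyList();
        }
        return names;
    }

    static int countNames(boolean simulateFailure) {
        // No null check needed, whichever path getNames() took.
        return getNames(simulateFailure).size();
    }
}
```

One thing to keep in mind: Collections.emptyList() is immutable, so callers that want to add to the result would need their own copy.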

  5. Don’t reinvent the wheel, use open source libraries:
    Continuing from point 4, let’s say you are calling such a method (which returns a collection) at multiple places and you are not sure whether the result will always be non-null. In this case, to compute the size (or to treat a null list as an empty list), you will end up writing something like this:

    public int getSize(List<String> list){
         if(null == list){
              return 0;
         }else{
              return list.size();
         }
    }

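    A little bit of googling before writing would tell you that there’s already a library (Apache Commons Collections) which contains hundreds of such methods to handle null and empty collections (more on that here). With its help, the above method shrinks to the following (note that size() returns 0 for a null argument in Commons Collections 4.x; the older 3.x versions throw an IllegalArgumentException instead):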

    CollectionUtils.size(list);

    This is just an example. There are plenty of such libraries and methods to ease the coding effort. I would say that one should always google at least once before writing any utility as there are high chances that the utility already exists.
I think this much is enough for today. Maybe in future, I will do part deux of this and include more examples (or maybe not).
Till then.

Friday, 3 July 2015

Campaign against if-else

Hello there, writing again after a long time. Going by the rate at which the intervals between my blog posts are increasing, I won’t be surprised if my blog ends up having a longer period than Halley’s Comet, and maybe scientists will develop some complex space-time equations to find out when the next blog will be published (on a scale of light years). The constant used in these equations will be called Folderol's constant. Anyway, let’s keep these orbital calculations aside for now; we will surely discuss them sometime in the future (before Comet Hyakutake comes back).

Today’s blog is not about any specific technology or framework. It explains some habits which, apart from being useful in your day to day java programming, will also make the code cleaner, more readable and easier to maintain. As the blog title says, it explains some ways to skilfully avoid ‘if..else’ statements in your program. Please note that I am taking nothing away from the usefulness/importance of this control structure (remember: if-else is a control structure and not a loop! ('We need to change this if loop', how many times did you say this in discussions?)) and neither do I hate if-else. However, I believe excessive use of it makes your code look ugly and complex. And, by excessive use, I mean more than 2 consecutive/nested if statements. Have a look at the below code:

if(a > 0){
      if(b > 0){
            if(b > 1){
                  System.out.println("Have you ever thought about re-evaluating your life?");
            }
      }else if(c > 0){
           
      }
}

The code is not only amateur but also difficult to debug/maintain. Imagine having to put another check in this structure as a part of change request, nightmarish, isn’t it? Below are the examples of how to deal with such situations.

1. Enhanced for loop: Null pointer exception

Here, I am assuming that everyone knows about the enhanced for loop (more info here). Along with all the flexibility it provides, there is also a catch associated with it. If the array/collection being iterated is null, it throws a big fat NullPointerException. There are instances where the array/collection being supplied is retrieved at runtime and you don’t know whether it will be null. So, a simple way to prevent that is:

List<String> names = getNames();
if(null != names){
      for(String name : names){

      }
}

As much as I love using the enhanced for loop, I don’t like an annoying if statement on top of every loop. I know we can’t get rid of the null check altogether, so I move it out of sight with the following approach:

/**
 * Returns empty list if supplied list is null
**/
private static <T> List<T> safeList(List<T> input) {
    if (input == null) {
        return Collections.emptyList();
    } else {
        return input;
    }
}

private void myMethod(){
      List<String> names = null;
      for(String name : safeList(names)){
           
      }
}

Voila! We have just changed the place of the if statement. But the code now looks clean and if-free, doesn’t it? Below is an example of how to use it in case of an array.


/**
 * Returns empty array if supplied array is null
**/
@SuppressWarnings("unchecked")
private <T> T[] safeArray(T[] input, Class<T> clazz) {
    if (input == null) {
        return (T[])Array.newInstance(clazz, 0);
    } else {
        return input;
    }
}
     
private void myMethod(){
      String[] names = null;
      for(String name : safeArray(names, String.class)){
           
      }
}

As arrays need type information, the method takes the class type as an argument.

2. Multiple checks of instanceof operator

Consider a scenario where you have a reference of a super class and you are getting an object of one of its sub classes at runtime. Now, you want to perform a specific action depending upon the type of the sub class. Let’s assume that we have a reference of the Number class and it gets assigned an object of any class which ‘Is a’ Number. Depending on the object, we want to perform a specific action (before you all jump on me with method overriding, let me clarify that specific action here means calling a specific method of the impl class). This is what your code will look like:


private void myMethod(){
      Number number = getNumber();
      if(number instanceof Integer){
         //cast it and do some action 
      }else if(number instanceof Double){
         //cast it and do some action  
      }else if(number instanceof BigDecimal){
          //cast it and do some action 
      }
      //And so on..

}

Imagine having more impl classes here. We will end up with a long sequence of if-else blocks (which will of course, boil my blood). We can’t even use switch as it doesn’t go hand in hand with instanceof operator. What to do here then? Below is the answer:

private enum NumberType {
    Integer,
    Double,
    BigDecimal,
    Unknown;
}

/**
 * Returns number type based on object passed
**/
private NumberType getNumberType(Number number) {
    if (number == null) {
        //number.getClass() would throw a NullPointerException
        return NumberType.Unknown;
    }
    try {
        return NumberType.valueOf(number.getClass().getSimpleName());
    } catch (IllegalArgumentException ex) {
        //No enum constant with this class name
        return NumberType.Unknown;
    }
}

private void myMethod(){
      Number number = getNumber();
      switch (getNumberType(number)){
      case Integer:
            //cast it and do some action
            break;
      case Double:
            //cast it and do some action
            break;
      }
}

Let me explain this in brief:

We know switch works with enums. So, we are converting the instance into an enum, depending upon its class type. This needs an enum to be created with all possible class types (i.e. the values being the simple class names); NumberType serves the purpose here. Next, we need a method which converts the object into the enum type (using the class name); getNumberType() helps us with that. And finally, a switch case distinguishes between the different types and decides which action to perform.
So, we have successfully converted a lengthy if-else structure into more readable switch case. It’s good, isn’t it? Also, if we want to support more types in future then, all we need to do is to add a new enum type and a case in switch.

3. Case insensitive string comparison

Well, I am not going to provide an example for this. Java 7 included support for strings in switch case. However, there are cases where we need case insensitive string comparison, which tempts us to fall back on old school if-else blocks with String’s equalsIgnoreCase() method. Wait! There is another way to stick to the switch structure. How about providing a lower case version of the string to switch (e.g. switch(name.toLowerCase())) and creating all cases with lower case strings? It kind of solves the purpose, doesn’t it? This is self explanatory and hence no example.
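For anyone who would still like to see it spelled out, here is a minimal sketch. The class, the method and the status values are made up for illustration. Two details go beyond the text above: the null guard is there because switching on a null String throws a NullPointerException, and Locale.ROOT is passed so that the lower-casing does not depend on the machine’s default locale.

```java
import java.util.Locale;

// Sketch of a case insensitive switch on strings; names and values are illustrative.
class CaseInsensitiveSwitch {

    static String describe(String status) {
        if (status == null) {
            // switch on a null String would throw a NullPointerException
            return "unknown";
        }
        // Locale.ROOT keeps the lower-casing independent of the default locale.
        switch (status.toLowerCase(Locale.ROOT)) {
            case "success":
                return "done";
            case "failure":
                return "failed";
            default:
                return "unknown";
        }
    }
}
```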

4. Multiple Arguments in a method

This is not related to if-else structure but as we are already discussing code and design level stuff, I thought of including it. Have you ever come across a scenario where you need to modify the method (i.e. to add an argument) and it already contains say 6 arguments? What will you do in this case? Add 7th argument? Not if you are a true programmer.

There are two questions which arise here. (i) How do we decide whether a method has too many parameters? (ii) What is the optimal number of parameters a method should have?
Answer to (i) is, when you start googling about alternative ways to reduce method parameters, it probably has too many. And (ii) many programming experts claim that 3 to 4 is an optimal number.

Coming back to our original scenario, what to do if you want to add a parameter to a method which already has plenty? A pattern called ‘Builder Design Pattern’ comes to the rescue here. It mainly consists of creating a wrapper class containing all the parameters, creating an object of that class and supplying that object to a method (variation of DTO pattern). If we go in details then we may end up with another blog, so I will share a link of the pattern and example. Click here for more details.
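As a rough illustration of the idea (not a replacement for the linked write-up), the sketch below wraps what would otherwise be a growing parameter list into one object created through a builder. ReportRequest, ReportService and all their fields are hypothetical names invented for this example.

```java
// Sketch of a parameter wrapper created through a builder.
// ReportRequest, ReportService and their fields are hypothetical.
class ReportRequest {

    private final String title;
    private final int pageSize;
    private final boolean includeSummary;

    private ReportRequest(Builder builder) {
        this.title = builder.title;
        this.pageSize = builder.pageSize;
        this.includeSummary = builder.includeSummary;
    }

    static Builder builder(String title) {
        return new Builder(title);
    }

    String getTitle() { return title; }
    int getPageSize() { return pageSize; }
    boolean isIncludeSummary() { return includeSummary; }

    static final class Builder {
        private final String title;          // required
        private int pageSize = 20;           // optional, with a default
        private boolean includeSummary;      // optional, defaults to false

        private Builder(String title) {
            this.title = title;
        }

        Builder pageSize(int pageSize) {
            this.pageSize = pageSize;
            return this;
        }

        Builder includeSummary(boolean includeSummary) {
            this.includeSummary = includeSummary;
            return this;
        }

        ReportRequest build() {
            return new ReportRequest(this);
        }
    }
}

class ReportService {
    // One argument instead of three; a new option needs no signature change.
    static String render(ReportRequest request) {
        return request.getTitle() + " [pageSize=" + request.getPageSize()
                + ", summary=" + request.isIncludeSummary() + "]";
    }
}
```

A call then reads ReportService.render(ReportRequest.builder("Q3").pageSize(50).build()), and adding a 7th option later only touches ReportRequest, not every caller.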

5. Alternatives of nested if-else statements

At last, we will discuss the solution to the problem described first (quite a stacky way to discuss). One alternative here is to divide the code into multiple methods (like isValid(), isGreater(), isBetween()) and call those methods, e.g. isBetween() may take a number and a range and return true if the number falls within that range. If all the conditions are on an object of the same class then another alternative is to write a criteria/search method (e.g. search(), filter() etc.) and pass the object to check if it meets certain criteria.
Another approach is to use the Validator Pattern to validate objects/parameters. Have a look at the example here. If you want to go a level deeper, then you can implement the Strategy Pattern as well. Any of these approaches will make your code a lot cleaner.
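To give the first alternative some shape, here is a small sketch with named checks and early returns instead of nesting. The class name, the range 2 to 10 and the returned messages are arbitrary choices for illustration; only the isBetween() idea comes from the text above.

```java
// Sketch of replacing nested ifs with small named checks and early returns.
// The range and the messages are arbitrary, illustrative choices.
class RangeChecks {

    // True when value lies in [low, high], both ends inclusive.
    static boolean isBetween(int value, int low, int high) {
        return value >= low && value <= high;
    }

    static boolean isPositive(int value) {
        return value > 0;
    }

    // Flat version of a nested check: each condition has a name and its own exit.
    static String classify(int a, int b) {
        if (!isPositive(a)) {
            return "a is not positive";
        }
        if (!isBetween(b, 2, 10)) {
            return "b is out of range";
        }
        return "ok";
    }
}
```

Adding one more check as part of a change request now means adding one more small method and one more flat if, rather than finding the right level of nesting.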

So, this is your lot for today. Hope it will help you at some point in your programming life. I will try to be back soon, before the arrival of Halley’s Comet at least.

~ Au revoir