Friday, 4 October 2013

Tortoise SVN Pre-commit hooks: Hook yourself to safety

Writing something after a long time; well, not that long, but relatively long, I should say. This has been a fantastic week for me, both footballistically and technically. Footballistically, Arsenal won back-to-back matches this week. In the midweek match, they played some scintillating football: crisp passing, swift movement, darting runs, insane finishing and a gorgeous understanding between the players. It was like watching porn on a football pitch. The reaction of Arsène Wenger (the manager) after the first goal was priceless! Let me tell you, he deserves each and every bit of it. The effort he put in to create such a team was immense. He built the whole empire from scratch and now he is enjoying the fruits. What a guy he is! His talent is unparalleled. (*deviating from topic alarm*) Let me come out of this fantasy and get to what I am here to discuss. A change of topic is good sometimes, by the way.

I said this has been a good week for me technically as well. I developed a solution to a very specific and rather rare problem, and it turned out to be very useful. Let me give some background first.

We use TortoiseSVN as our version control client. The repository is hosted at our client's (the customer's) end. Hence, whenever you commit something to the branch, it goes straight overseas and the client can see the changes. We use the JUnit framework for unit testing, so the test classes go overseas too. Test classes generally contain a lot of crap termed ‘test data’: database URLs, passwords, file paths, encryption keys, etc. Before sending these classes overseas, we have to clear all this data.

There is an English saying: ‘To err is human’. All the members of my team are humans (FYI!). Well, I am not sure about one or two (including myself), but the rest are all human. Hence, we often forget to delete the test data before committing the files. Last week, for the umpteenth time, we received an email from the client saying that our classes contained sensitive data. It was a typical client escalation. They said that in spite of telling us the same thing 12672 times, they were still seeing sensitive data being committed. This spurred a flurry of meetings and discussions on our side, and we decided to introduce something that would prevent a developer from committing a file if it contains any sensitive information.

I googled a lot and found an approach. TortoiseSVN lets you run your own scripts on various events such as Pre Commit, Post Commit, Pre Update, Post Update, etc. These scripts are called hook scripts. The most suitable event for me was the pre-commit event. What I did was this: I created a Java program which searches for a particular text in each and every file being committed. If a match is found, it throws an exception, so the program exits with an error and the commit stops. There are 2 types of hooks: server-side hooks and client-side hooks. Server-side hooks are installed on the server and run for all commits, whereas client-side hooks need to be installed on each and every client machine. We did not have an SVN server of our own (the repository sits with the client), so server-side hooks were of no use to us. We decided to go with client-side hooks.

Let me explain how the hook gets its input. When we tick the checkboxes against the file names and click on commit, TortoiseSVN creates a .tmp file in the user's Temp folder (assuming a Windows machine). Each line of this .tmp file is the full path of one file being committed (i.e. if 4 files are being committed, the .tmp file has 4 lines). TortoiseSVN passes the path of this .tmp file to the hook script as an argument (other arguments are passed too, but we don’t need them). So the program I wrote reads this file line by line and then reads the file present at each path. If it encounters a keyword (like password), it throws an exception, which makes the commit fail. I created a batch script (.bat file) which calls the jar file containing my code. Following are the steps to attach a hook script:

  • Open the Settings window of TortoiseSVN.
  • Click on ‘Hook Scripts’. It will open the hook scripts menu.

  • In Hook Type, select pre commit.
  • In the Working copy path, select the folder where you have checked out the branch.
  • In the Command Line to Execute, select the script which you want to execute (.bat file or .exe file).
  • Check the two checkboxes in the dialog (‘Wait for the script to finish’ and ‘Hide the script while running’). Click on ‘OK’ and ‘Apply’, and voilà! Your hook script is ready.
The only situation in which I think this script will cause a problem is when you commit a large number of files, each of them large. It will take some time to check all the files. But I guess the delay is any day better than receiving vitriol from the client, resulting in a kick up the backside.

I have attached a sample batch file and jar file which search for the string “password” in all files with the extensions .java, .jsp, .js, .xml, .config, .properties and .txt. You just need to change the path in the .bat file which executes the jar file. You can download it from here.
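
For readers who would rather see the idea than open the jar, here is a minimal sketch of such a check in Java. It is not the attached code: the class name PreCommitCheck, the method names and the exit codes are made up for illustration, and it assumes the list file is UTF-8 text with one path per line. It signals failure with a non-zero exit code instead of an uncaught exception, which has the same effect on the commit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of a client-side pre-commit check (not the attached jar).
public class PreCommitCheck {
    // Keyword and extensions taken from the description of the sample.
    static final String KEYWORD = "password";
    static final List<String> EXTENSIONS = Arrays.asList(
            ".java", ".jsp", ".js", ".xml", ".config", ".properties", ".txt");

    static boolean hasCheckedExtension(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    /** Reads the list file and returns one "path:lineNumber" entry per offending line. */
    static List<String> findViolations(Path listFile) throws IOException {
        List<String> violations = new ArrayList<>();
        // Assumption: the list file is UTF-8, one path per line.
        for (String entry : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String trimmed = entry.trim();
            if (trimmed.startsWith("\uFEFF")) {
                trimmed = trimmed.substring(1); // defensive: drop a byte-order mark if present
            }
            if (trimmed.isEmpty() || !hasCheckedExtension(trimmed)) {
                continue;
            }
            Path file = Paths.get(trimmed);
            if (!Files.isRegularFile(file)) {
                continue; // deleted files and folders have nothing to scan
            }
            // ISO-8859-1 maps every byte to a character, so odd encodings never abort the read.
            List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
            for (int i = 0; i < lines.size(); i++) {
                if (lines.get(i).toLowerCase(Locale.ROOT).contains(KEYWORD)) {
                    violations.add(file + ":" + (i + 1));
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Expected the path of the list file as the first argument");
            System.exit(2);
        }
        List<String> violations = findViolations(Paths.get(args[0]));
        if (!violations.isEmpty()) {
            for (String v : violations) {
                System.err.println("Sensitive keyword found at " + v);
            }
            System.exit(1); // a non-zero exit code is what makes the commit fail
        }
    }
}
```

The .bat file then only has to run this class (or a jar built from it) and hand over the first argument it receives from TortoiseSVN.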

A special thanks to my colleague Niket Patel for helping me out on this. Cheers mate.

That's yer lot for this time. See ya.

2 comments:

  1. Excellent article. I have a question though. What if my file legitimately contains the word password, as a variable name or as part of a method name? Is there an option to skip these?

  2. The script I have created performs a strict search: nothing in the file may contain the word 'password'. If we want to skip method or variable names that contain these keywords, we can modify the script so that it only searches for the keyword between quotes, or after the equals sign (in the case of property files). We can use a regular expression search for such requirements. Java provides good support for regular expressions.
    However, in the case of xml, config and txt files, things can be tricky, as they might not have quotes or an equals sign. What we can do in this case is use different search patterns for different extensions. That can be done with minimal changes to the code.
    I would still prefer to go with the strict search. I think the chances of a variable or method name containing password are small.
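
    A rough sketch of what the per-extension search could look like, assuming Java. The class name KeywordPatterns and the method isSuspicious are hypothetical, and the rules are deliberately naive: double-quoted literals only for source files, anything after the first equals sign for property files, and the plain strict search for everything else.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a different search rule per file extension.
public class KeywordPatterns {
    static final String KEYWORD = "password";

    // A double-quoted string literal, allowing backslash escapes inside it.
    static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"");

    // The keyword anywhere after an equals sign on the same line.
    static final Pattern AFTER_EQUALS =
            Pattern.compile("=.*" + KEYWORD, Pattern.CASE_INSENSITIVE);

    static boolean containsKeyword(String text) {
        return text.toLowerCase(Locale.ROOT).contains(KEYWORD);
    }

    /** Decides whether a single line should block the commit, based on the file name. */
    static boolean isSuspicious(String fileName, String line) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".java") || name.endsWith(".jsp") || name.endsWith(".js")) {
            // Walk the literals left to right so text between two literals is never matched.
            Matcher m = STRING_LITERAL.matcher(line);
            while (m.find()) {
                if (containsKeyword(m.group())) {
                    return true;
                }
            }
            return false;
        }
        if (name.endsWith(".properties")) {
            return AFTER_EQUALS.matcher(line).find();
        }
        // xml, config, txt and anything else: fall back to the strict search.
        return containsKeyword(line);
    }
}
```

    With these rules, a line such as `String password = read();` in a .java file passes, while a literal that contains the word is still caught. Note that a property line like `db.password=secret` also passes, because the keyword sits before the equals sign; that is one more reason I lean towards the strict search.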
