Monday, 27 January 2014

Code Quality–Different ways to measure it


In software industry the word quality holds different meaning to different people. For a developer it means different, for a tester it means different, for the management it means different, for the client it means different. The perspective of each person is different. The tools used by each person to measure quality are different. The methods used to measure the matrix are also different. In todays post I am going to illustrate the quality metrics for software developers.

Reasons for quality check

There is lot of material written on the net for the need to do quality checks in software. The oblivious need it to reduce the number of bugs and improve the reliability and efficiency of the system. Most of the times when we say quality check, the first thing that comes to our mind is the test team doing some manual or regression tests. That is the functional quality of the product or the system. But what I am interested in is the quality of the code that is being written by the developers. How do we measure the quality of the code? Here are some of the ways used for measuring code quality.

Method 1 : Code reviews / Peer Reviews

One of the oldest form of quality check is the Peer Review. Once the functionality is developed by one developer, it is reviewed by someone else in the team. Typically this is done by a senior developer who is aware of the functional requirement as well as technically sound to find out any deficiencies in the code. In some cases it is done by the technical lead or project lead. The idea here is to get the second opinion and try and improve the code.
If your team follows Agile or Extreme Programming (XP) methodology for development, peer reviews can be reduced a lot by using a technique called Pair Programming. Most of the code is already getting reviewed by the person you are pairing with during the development phase itself.
There are benefits of this method as well as some disadvantages. One of the advantage is that it can help reduce bugs because some of the things can be caught during the review which could have led to bugs. It can also help different people approach the same problem in different perspectives and choose the best solution. It also helps in evaluating a developers performance in annual appraisal Smile
Lets look at some of the disadvantages. The method is rendered ineffective if the person doing the review is biased and is not open to new ideas or latest trends in the technology. Let me take an example of the biased approach. Assume an individual is used to particular style of coding. While reviewing some other developers code he might expect the other developers to follow the same coding style. As an example consider the case of using a ternary operator in a simple if else statement or using the if else in its expanded state. Some people have their own preferences with the way they define the code structure. They want all the variables, constants, methods to be organized into regions. While others don’t prefer this style of coding.
I remember once one of my ex-colleague told me his experience with his technical lead. My ex-colleague had implemented a piece of functionality in C# using LINQ and Lambdas. During the code reviews his tech lead asked him to rewrite the code without using LINQ and Lambdas because the Tech Lead did not know anything about LINQ and Lambdas. I am sure most of us would have gone through similar situation at some point of time in our developer lifecycle. I know lot of instances where the code was rejected because a variable was not named to the liking of the reviewer.
The point I am trying to make here is that manual code reviews are left to the discretion of the individual. What feels right for one person might not be the case for someone else. They are time consuming and in reality can delay the delivery of the software if there is lot of rework needed to fix the reviewers comments. If these manual code reviews are used purely to improve the quality of the code they are fine. But I know in many organizations they are used as a measure to rate the developers during annual reviews. To me it sounds absurd.
Long ago I had a personal experience when I was working as a Technical Lead on one of the project. The project lead asked me to review each line of code and create an excel sheet with the name of the developer who was responsible for writing the piece of code which was supposed to be refactored. I thought it was an utter waste of time. First of all, reviewing each line of code written by your team mate, means you don’t show trust in their ability. And even if I had so much time in the world to do this nonsense thing, I knew where it would have ended. It would have come up during the annual reviews filtered by the name of the developer and with some pivot table applied to it to prove how useless the developer was. Another thing that stopped me from doing it was that the developers on the team would have been prompted to fix only the issues found in their code and ignore other piece of code which is totally against the principle of Collective Code Ownership.
My preferred way of dealing with such issues is to come up with a well defined coding standard that is agreed by all the team members.And even after that if someone violates the coding standards I would prefer to talk to the individual to try and understand the reasons behind doing so instead of creating an Excel sheet which doesn’t hold any relevance after a short period of time. Having a coding standard might solve majority of the issues but there will always be the edge cases which can go either way. How can we overcome such issues?

Method 2 : Use Standard tools for code analysis

One way I have used previously to address these issues is to use the industry standards and best practices. That way you are at least following the footsteps of people who are bound to influence the majority of people. If your team is using Microsoft platform and tools for development then you can take advantage of some of the tools provided by Microsoft. Some tools are built into the Visual Studio itself while others are available as additional add ins or plugins. There are tools which can be used for checking the consistency of the code structure. Two such tools which come instantly to my mind are FxCop and StyleCop. FxCop is used for managed code analysis and StyleCop for static code analysis.
There are some rules which are common to both and some which are unique to each of them. One of the example where FxCop can be helpful is to ensure the Framework guidelines like Abstract class should not have a constructor. Similarly StyleCop can be used to ensure that there are no multiple lines between method definitions in a file. Both these tools can be integrated with theContinuous Integration (CI) build.
These tools are good for bringing in the consistency in the code structure. We can ensure that someone is not coming up with their own naming style when rest of the team is following a particular set of rules. Is it good enough? Lets assume you have integrated both FxCop as well as StyleCop in your build process, there are no errors reported after the build is completed. Does that mean the code is quality code. Definitely not. Just having a code that passes the FxCop and StyleCop rules does not ensure that it is good code. There has to be additional measures to ensure the quality of code.

Method 3 : Use Code Coverage

I assume most of the teams are using Test Driven Development (TDD) or at least plan to do so in future. If not then you can skip this section. When we are developing software using TDD, we can make use of Code Coverage feature to measure the percentage of code covered as part of unit tests. Again this feature can be integrated into the build process. Generally minimum 80% code coverage is considered as a standard. Now a days I find teams striving to achieve 100% code coverage. This link here demonstrates how to get started with Code Coverage in Visual Studio.
The Code Coverage feature is available only for Ultimate and Premium editions of Visual Studio. If you are using other editions of Visual Studio, you can use other free tools which are counterparts of the Code Coverage. I have used TestDriven.Net which used to integrate even with VS2003 IDE. If I am not wrong then Code Coverage was introduced with Visual Studio since 2005 release. If you are still one of those people using VS 2003, I would recommend using TestDriven.Net. We had integrated it with our build in couple of projects that I had worked in my previous organization.
There is also an Open Source alternative which I haven’t used personally but have heard from few people called PartCover. I can’t comment much on this as I haven’t had any experience in using it. But if you are interested you can give it a try.
There are some critical issues related to code coverage. Using code coverage as a measure of quality can be quite misleading. Let me share my own experience here. In one of our project we were using Code Coverage to decide the build quality. If the coverage was below a certain percentage the CI server would fail the build. This was to ensure that all the newly built software was properly unit tested. But as always there are people who know how to get around in certain situations. The project lead of that particular project did not understand the importance of unit tests and the value add that we got from them. He assigned a junior programmer to write unit test for all the classes which were not covered. The junior programmer worked alone on writing those unit tests due to tight deadlines. He ensured most of the classes were covered to ensure the build was green. But the tests that he had written were completely useless to say the least. Most of the tests did not have any asserts to verify the expected functionality. In essence all he did was to make sure the unit tests executed the production code without really verifying anything.
This is an abuse of TDD. If your team does not believe in the principle of TDD there is no point writing unit tests just to pass the build. Its a waste of time and effort and most annoyingly a maintenance headache. Other developers in the team had to spend hell lot of time to fix those tests when the issue was identified. By the time it was identified the person who wrote those tests was already out of the organization.
My suggestion is that if you are using TDD make sure you write test which are meaningful and help you validate the code. What's the point if you have 100s of test and none of them verify anything in the end. Now assume you have one of the best code coverage statistics for all your projects, does that mean you have good quality code? I am not entirely convinced. Let me illustrate why. You can have the best coverage, but the developer might have written code which is 1000 lines of executable code in a single class file. There could be methods which are 500 lines of executable code. This scenario was quite common with developers who started doing Object Oriented development after spending years in developing procedural code. Believe me those are not the only creatures writing such code, I have seen such methods and classes myself written by people who have worked with only Object oriented languages right from day one of their career. Its bit sad that people don’t really make use of Object Oriented features to come up with code which is structured correctly.
You might ask me what’s bad with such code? The answer is not so difficult. Its a maintenance nightmare. You might know what is happening inside the class or method while developing it, imagine what will happen when you have to revisit the code after 6 months, may be 1 year later or even 2 years later. Spare a thought for the developer who is going to refactor that piece of code after you have moved onto a different project or a different company. So how do we take care of this situation?

Method 4 : Use industry standards such as Cyclomatic Complexity

Long ago I had written couple of post related to DIfferent Metrics available in Visual Studio and how to use Code Metrics. You might want to read them before continuing further.
One of the commonly used matrix to gauge the quality is Cyclomatic Complexity. It is the measure of the decision points a function has to evaluate. Decision points are measured by the number of logical statements such as an if statements, switch statements, for loops, while statements etc. At a class level its the sum of Cyclomatic Complexity of all the methods and similarly at the Module or Assembly level it is the aggregate of the Cyclomatic Complexity of all its child elements. Cyclomatic Complexity can tell us approximately how many unit tests will be required to fully cover all the code paths within a function. Different teams have different acceptance levels for Cyclomatic Complexity. In general the recommended value is 40. If this value is higher, we should try to refactor the method to make it more readable and maintainable. It is far more easier to understand a method which is 10 to 15 lines then go through a page full of code and figure out what is going on.
There are no hard and fast rules related to Lines of Code (LoC) per method, class etc. But a general guidelines is to make the methods smaller and self explanatory. If we follow that, most of our methods should have between 8 to 15 lines of code. At a class level this should generally be around 100 to 150 lines. You can argue that there can be classes with more than 200 lines none of which have methods more than 15 lines. This does happen in real life. You’ll find that the Cyclomatic Complexity for such classes will be higher compared to the smaller classes.
Can we rely entirely on Cyclomatic Complexity as our preferred measure of code quality? Not entirely. Because we’ll come across situation where we cannot split methods and classes beyond certain extent and their complexity is on the higher side. So how can we arrive at a reliable measure. In Visual Studio Code Matrix report we can also find Maintainability Index which indicates how easier or difficult it would be to maintain a particular method or a class or an Assembly. Along with maintainability index we can also use another metrics called Depth of Inheritance. It is a measure of how many levels of inheritance are involved. All these matrix can be helpful in deciding the quality of the code. I remember once refactoring a very lengthy method with Cyclomatic Complexity of more than 300 into a smaller one with less than 50 Complexity points. These metrics are available from within the Visual Studio development environment. Using these metrics should be a good starting point to build quality code.
Once the whole team becomes comfortable with using these tools and techniques we can go a step forward and automate this using some dedicated tool like NDepend which can give us numerous other quality metrics including Cyclomatic Complexity. NDepend supports a language called Code Query Language (CQL) which can be used to query runtime code attributes and define custom rules. For example we can define a rule in NDepend which says a warning or an error should be reported if a method has more than 50 lines of code and has Cyclomatic Complexity is more than 50. As of today NDepend can measure up to 82 different metrics related to the code quality. All these comes with a cost. NDepend is a commercial product and you need to pay for the licences. Having used the trial version I think its worth investing in NDepend for long term benefits.

Conclusion

As we saw, the code quality can be measured in different ways. Each measure can be used to validate different aspect of the code. We can never rely on one single tool or technique to measure code quality. In my opinion we should use a combination of two or more metrics so that the short comings of one technique might be covered by the other. The biggest advantage of using a standard metrics like Cyclomatic Complexity is that it is not left to the discretion and interpretation of one or two individuals. You’ll get the same result if we measure the metrics on the same piece of code 100 times or even 1000 times.
We cannot rely 100 percentage on the tools like we saw in the example I mentioned above in case of unit test code coverage. We need to get the balance right between the manual and the automated tools and techniques. There also needs to be right mix between the tools that we use. These tools and techniques are there to help you write better code. If there are 10 tools available in the market we shouldn’t be using each and every one of them. If a particular tool slows you and your team down instead of helping you its better to look for the alternatives. I am sure there are more tools and techniques out there to improve the quality of code code and I would like to learn more about them.

No comments:

Post a Comment