Sunday, April 27, 2014

Tracking Progress of a Rails Rewrite

I inherited a challenging codebase about 14 months ago. The code had no tests, and there were methods 100-200 lines long, filled with conditionals and iterators. It was obvious: we needed a rewrite. OK, we were not thinking about The Big Rewrite, but we knew the code had to be cleaned up. I had the full support of the leadership team, but I was asked for one thing: provide a metric to track progress.

A few months went by and I did not have an answer. I couldn’t tell how quickly we could get the cleanup done. Was it going to be six months, one year, maybe two? I did not know what to say.

My goal was to extract business logic from Rails controllers and models and put it into lib/services. I wanted to structure the code into stateless, reusable actions with LightService, decoupling it from Rails as much as possible. I trusted these services because their test coverage was fairly high. I knew that no matter what business logic I had to implement, the small service classes would be able to take anything I threw at them.

The app used the now-unsupported Data Mapper library, which we wanted to move away from in favor of Active Record. We put all our AR classes in the app/models/ar directory under the "AR" namespace. We kept these models clean and well tested. Now we trusted everything under lib/services and app/models/ar.

We also started to move functionality into new RESTful controllers that had very little logic in them. There were only a handful of these controllers, maybe half a dozen. I realized we were not doing much cleanup of the actions in the existing controllers. Instead, we were shifting the responsibility to new controllers and services that we trusted.

One of my favorite Ruby code analysis tools is flog. Flog gives you complexity points based on "ABC" metrics: assignments, branches and conditionals. You can run flog against your entire controller and model code and get a total complexity score. If you don't have much logic in the views or JavaScript, that number is your application's complexity.

Comparing the total lines of trusted and untrusted code can provide a number, but I am not sure how much I could trust it. 10 lines of really crappy code loaded with iterators and conditionals and 10 lines of clean, readable and tested code just do not compare one-to-one. Why not look at this problem from the complexity side? 5 points of flog complexity can sit on a single line of crappy code, but they can be spread across 4 different lines of a method in the cleaned-up code. The amount of coded logic should be expressed in complexity points and not in the number of lines of code.

I realized I could actually trust the flog complexity points to compare trusted and untrusted code. I easily calculated the total controller complexity. By subtracting the trusted controller complexity from the total, I had the untrusted complexity for controllers. I did the same for models. I put all the complexity from lib/services in the trusted bucket. Dividing the total trusted complexity by the total untrusted complexity provided the trusted/untrusted code ratio.
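The bucketing described above can be sketched in a few lines of Ruby. This is only an illustration, not the script I used: it assumes you already have a hash of file paths and complexity scores (collected from flog output, for example), the `trusted?` and `bucket_totals` helpers are hypothetical, and the controller name in the hand-maintained list is made up. It sorts each file directly into a bucket, which gives the same result as subtracting the trusted complexity from the total.

```ruby
# Directories named in the post whose contents are all considered trusted.
TRUSTED_PREFIXES = ["lib/services/", "app/models/ar/"].freeze

# Trusted controllers have to be listed by hand; this file name is made up.
TRUSTED_CONTROLLERS = ["app/controllers/orders_controller.rb"].freeze

# Returns true when a file path belongs to the trusted bucket.
def trusted?(path)
  TRUSTED_CONTROLLERS.include?(path) ||
    TRUSTED_PREFIXES.any? { |prefix| path.start_with?(prefix) }
end

# scores: Hash of file path => complexity score.
# Returns the summed complexity of the trusted and untrusted buckets.
def bucket_totals(scores)
  totals = { trusted: 0.0, untrusted: 0.0 }
  scores.each do |path, score|
    bucket = trusted?(path) ? :trusted : :untrusted
    totals[bucket] += score
  end
  totals
end
```

Keeping the trusted controllers in an explicit list is a deliberate choice in this sketch: unlike lib/services and app/models/ar, they live next to the untrusted ones, so a directory prefix cannot tell them apart.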

Here is an example of how the calculation worked:

Total controller complexity: 8945
Trusted controller complexity: 489
Untrusted controller complexity (8945 - 489): 8456

Total model complexity: 1498
Trusted model complexity: 249
Untrusted model complexity (1498 - 249): 1249

Trusted services complexity: 845

Untrusted total complexity (8456 + 1249): 9705
Trusted total complexity (489 + 249 + 845): 1583

Trusted/Untrusted code ratio: 1583 / 9705 * 100 = 16.3%

The numbers in the example above are made up; they don't reflect the code of my current employer.
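The same arithmetic is easy to script so it can be rerun every month. Here is a minimal Ruby sketch using the made-up numbers from the example; the `trust_ratio` method and its keyword arguments are hypothetical names chosen for illustration.

```ruby
# Computes the trusted/untrusted ratio from per-bucket complexity totals.
# controllers and models are hashes with :total and :trusted scores;
# everything under lib/services counts as trusted.
def trust_ratio(controllers:, models:, services_trusted:)
  trusted = controllers[:trusted] + models[:trusted] + services_trusted
  untrusted = (controllers[:total] - controllers[:trusted]) +
              (models[:total] - models[:trusted])

  {
    trusted: trusted,
    untrusted: untrusted,
    ratio_percent: (trusted.to_f / untrusted * 100).round(1)
  }
end

result = trust_ratio(
  controllers: { total: 8945, trusted: 489 },
  models: { total: 1498, trusted: 249 },
  services_trusted: 845
)

puts "Trusted: #{result[:trusted]}"        # 1583
puts "Untrusted: #{result[:untrusted]}"    # 9705
puts "Ratio: #{result[:ratio_percent]}%"   # 16.3%
```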

A month later you can do the same calculation, and let's say the Trusted/Untrusted code ratio is now 19.5%. Well, look at that, roughly 3% of your code just gained trust! That new 3% of the code is easy to change as it's small, passes all the tests, communicates intent and has no duplication.

At a pace of 3% more trusted code per month, you will need more than 2 years to clean up the existing codebase, unless you can accelerate the cleanup somehow.
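That estimate is simple enough to put in code. The sketch below is a rough projection under two loud assumptions that are mine, not measured facts: progress stays linear at the same monthly gain, and "done" means the percentage reaching 100. The `months_to_target` method is a hypothetical helper.

```ruby
# Projects how many whole months are left until the target percentage,
# assuming the same gain every month (a linear, best-guess model).
def months_to_target(current_percent:, monthly_gain:, target_percent: 100.0)
  raise ArgumentError, "monthly_gain must be positive" unless monthly_gain > 0

  remaining = target_percent - current_percent
  return 0 if remaining <= 0

  (remaining / monthly_gain).ceil
end

months = months_to_target(current_percent: 19.5, monthly_gain: 3.0)
puts "#{months} months to go"  # 27 months, a bit over 2 years
```

Rerunning this each month with the latest ratio and the observed gain shows whether the cleanup is speeding up or slowing down.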