Evaluating and Managing Legacy Code Quality with Simple Metrics
The article explains how to assess a project's code quality with four simple metrics, compare error density between codebases, manage legacy code with the Scout Camp principle, and choose reasonable thresholds for metrics such as cyclomatic complexity. Together these give a practical approach for teams practicing continuous delivery.
After a long hiatus from writing on the public account, the author apologizes to fans of "Continuous Delivery 2.0" and promises to teach a simple method for dealing with "ancestral" (legacy) code.
1. Assess your project's code quality. The author previously performed a code‑quality evaluation by comparing a company's product code with Google open‑source code, using a straightforward method that anyone can replicate.
The four chosen indicators are:
(1) Cyclomatic complexity of functions or methods;
(2) Total lines of a single function or method;
(3) Code duplication rate within a single file;
(4) Fan‑in/fan‑out ratio of classes (for Java code).
After setting appropriate thresholds for each metric, scan both codebases, which should be of comparable size. Compare the results by "error density" rather than by absolute counts, where an "error" is a threshold violation: for example, the average number of errors per hundred lines of code.
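The error-density comparison can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the function name `error_density` and the scan figures are hypothetical, and an "error" is assumed to mean one threshold violation reported by a scanner.

```python
def error_density(violations: int, lines_of_code: int, per: int = 100) -> float:
    """Return the number of threshold violations per `per` lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return violations * per / lines_of_code


# Hypothetical scan results, invented for illustration only.
product_density = error_density(violations=480, lines_of_code=120_000)
reference_density = error_density(violations=150, lines_of_code=100_000)

print(f"product:   {product_density:.2f} errors per 100 lines")
print(f"reference: {reference_density:.2f} errors per 100 lines")
```

Normalizing by size is what makes the two numbers comparable: a larger codebase will almost always have more violations in absolute terms, even if its quality is better.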
Because the product is continuously developed, statistics collected at different points in time allow calculation of how many new errors appear per additional hundred lines of code, revealing the relative code‑quality level of two software systems.
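One way to compute this incremental figure is to compare two snapshots of the same codebase. The sketch below assumes each snapshot records only total lines and total violations; the `Snapshot` type, the function name, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Scan totals for one codebase at one point in time (hypothetical shape)."""
    lines_of_code: int
    violations: int


def incremental_density(earlier: Snapshot, later: Snapshot, per: int = 100) -> float:
    """Net new violations per `per` lines of code added between two snapshots."""
    added_lines = later.lines_of_code - earlier.lines_of_code
    if added_lines <= 0:
        raise ValueError("the later snapshot must contain more lines than the earlier one")
    new_violations = later.violations - earlier.violations
    return new_violations * per / added_lines


# Hypothetical snapshots, invented for illustration only.
january = Snapshot(lines_of_code=120_000, violations=480)
june = Snapshot(lines_of_code=130_000, violations=530)

growth_density = incremental_density(january, june)
print(f"{growth_density:.2f} new errors per 100 added lines")
```

Because the inputs are net totals, deleted code and fixed violations are folded into the result. A negative value means violations fell even though the codebase grew.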
2. How to manage code quality. Once the four metrics and their thresholds are agreed upon, you can start managing code quality. For extensive legacy code that cannot be fully refactored, the author recommends applying the "Scout Camp principle", better known in English as the Boy Scout Rule: leave the code a little cleaner than you found it (see Chapter 9.2.1, page 139 of "Continuous Delivery 2.0").
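The article does not describe how the principle is enforced. One possible reading, sketched below, is a check that rejects a change if any file ends up with more threshold violations than it had before the change. The function name and the per-file violation counts are hypothetical.

```python
from typing import Dict, Tuple


def scout_camp_regressions(
    before: Dict[str, int], after: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return files whose violation count increased, as {path: (before, after)}.

    Files missing from `before` are treated as new files with zero violations.
    """
    regressions = {}
    for path, count_after in after.items():
        count_before = before.get(path, 0)
        if count_after > count_before:
            regressions[path] = (count_before, count_after)
    return regressions


# Hypothetical per-file violation counts before and after a change.
before_change = {"billing.py": 3, "orders.py": 0}
after_change = {"billing.py": 2, "orders.py": 1, "reports.py": 0}

regressions = scout_camp_regressions(before_change, after_change)
for path, (old, new) in regressions.items():
    print(f"{path}: violations rose from {old} to {new}")
```

A check like this leaves existing legacy violations alone, so the team is not forced into a full refactoring, while still preventing each change from making things worse.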
3. Choosing reasonable thresholds. You can define your own thresholds for each metric as you see fit. For example, the author shows a chosen threshold for cyclomatic complexity in the figure below.
In theory, the cyclomatic complexity reported for a function can be read as the number of test cases that function needs, because it equals the number of linearly independent execution paths through the function's statements.
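To make the link between complexity and paths concrete, the sketch below approximates cyclomatic complexity as one plus the number of decision points, using Python's standard `ast` module. It is a rough approximation rather than a reference implementation: real scanners differ in which constructs they count, and the sample function `classify` is hypothetical.

```python
import ast

# Constructs that each add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" holds three operands and adds two decision points.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    if n == 0 or n > 100:
        return "edge"
    return "normal"
'''

sample_complexity = cyclomatic_complexity(SAMPLE)
print(sample_complexity)
```

For the sample, the two `if` statements and the single `or` give three decision points, so the result is 4. That matches four natural test inputs: a negative number, zero, a number above 100, and an ordinary value in between.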
That concludes the article.
Continuous Delivery 2.0
Tech and case studies on organizational management, team management, and engineering efficiency