Why Metrics Fail: Historical Lessons, Industry Examples, and Common Pitfalls in R&D Efficiency Measurement
The article examines why measurement systems often backfire by recounting historical tax‑related mis‑metrics, modern corporate examples like Haidilao, and a series of fundamental mistakes in software R&D efficiency metrics, urging a shift from metric‑driven thinking to purpose‑driven measurement.
The author, a senior technical expert at Tencent, reflects on a previous piece about a delivery rider trapped by a system and introduces today’s focus: the challenges of measuring software R&D efficiency.
1. Historical cases of measurement failure
In 17th‑century England, the government replaced the "hearth tax"—which required inspectors to enter homes—with a "window tax" that could be assessed from the street. Homeowners responded by bricking up windows and cutting skylights instead, and residents reportedly even suffered eyesight problems from the darkened rooms, while the tax authority collected little revenue.
2. Contemporary cases of measurement failure
At the restaurant chain Haidilao, strict performance metrics pushed wait staff into intrusive over‑service—handing eyeglass cloths to glasses‑wearers, topping up drinks whenever a cup fell below three‑quarters full, and wrapping diners’ phones in plastic bags—ultimately degrading the customer experience the metrics were meant to protect.
These examples illustrate how poorly designed metrics can generate unintended, harmful behaviors.
3. Root causes of measurement failure
The author argues that the core issue is not a lack of tools or methods, but a mindset that still relies on industrial‑era scientific management, which is ill‑suited for the modern “byte‑economy” of software development.
4. Common pitfalls in R&D efficiency measurement
4.1 Using easily obtainable quantitative indicators – Easy‑to‑collect metrics such as lines of code or work hours carry little signal, while harder‑to‑collect data such as delivered user value or NPS provide far richer insight.
4.2 Attempting single‑dimensional measurement – Complex phenomena require multi‑dimensional views such as radar charts; any single metric can mislead. For example, low defect counts must be weighed against delivered story points, and long overtime hours need context from delivery volume, code impact, and defect rates.
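The multi‑dimensional idea can be sketched in a few lines: rather than ranking on one number, normalize several metrics against a baseline range and read them together, as one would on a radar chart. All metric names and baseline ranges below are hypothetical illustrations, not anything from the article.

```python
# Minimal sketch: combine several R&D metrics into one multi-dimensional
# profile instead of judging a team on any single number.
# Metric names and baseline ranges are hypothetical.

def normalize(value, low, high):
    """Map a raw metric into [0, 1] relative to a baseline range."""
    if high == low:
        return 0.0
    return max(0.0, min(1.0, (value - low) / (high - low)))

def radar_profile(raw, baselines):
    """Return a 0-1 score per dimension; meant to be read together, never alone."""
    return {dim: normalize(raw[dim], *baselines[dim]) for dim in baselines}

# Hypothetical team snapshot: few defects looks great in isolation...
raw = {"defects_fixed": 2, "story_points": 5, "lead_time_days": 20}
baselines = {
    "defects_fixed": (0, 10),
    "story_points": (0, 40),
    "lead_time_days": (0, 30),
}
profile = radar_profile(raw, baselines)
# ...but the low story-point score reveals that little was delivered at all.
```

Reading the whole profile at once is what prevents the "low defects, but nothing shipped" misreading the section warns about.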
4.3 Relying on manual data entry – Manual entry introduces bias and time‑distortion; automated collection (e.g., Tencent’s dual‑stream model) preserves data integrity.
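The article does not detail the dual‑stream model, but the automated‑collection principle can be illustrated: derive a metric like cycle time from machine‑recorded pipeline timestamps rather than self‑reported numbers. The event names and work‑item IDs below are invented for the sketch.

```python
# Sketch: compute cycle time automatically from timestamped pipeline events,
# so the numbers come from the system of record and cannot be massaged by hand.
# Event names and work-item IDs are hypothetical.
from datetime import datetime

events = [
    ("ITEM-1", "started",  "2024-03-01T09:00:00"),
    ("ITEM-1", "deployed", "2024-03-04T17:00:00"),
    ("ITEM-2", "started",  "2024-03-02T10:00:00"),
    ("ITEM-2", "deployed", "2024-03-03T10:00:00"),
]

def cycle_times(events):
    """Hours from 'started' to 'deployed' per work item, from recorded timestamps."""
    stamps = {}
    for item, kind, ts in events:
        stamps.setdefault(item, {})[kind] = datetime.fromisoformat(ts)
    return {
        item: (s["deployed"] - s["started"]).total_seconds() / 3600
        for item, s in stamps.items()
        if {"started", "deployed"} <= s.keys()
    }
```

Because the timestamps are emitted by the delivery pipeline itself, the metric is free of the recall bias and rounding that plague manually entered work logs.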
4.4 Tying metrics to individual KPIs – When metrics become personal performance targets, engineers may game the system, undermining true quality.
4.5 Treating metrics as goals – Metrics should serve goals, not replace them; otherwise they can pull effort in the wrong direction.
4.6 “Star‑following” metrics – Blindly copying practices like OKR from successful companies without contextual fit leads to wasted effort and internal competition.
4.7 Blindly building a metrics data platform – Large‑scale metric platforms are costly and often ineffective without clear improvement objectives; a focused, insight‑driven approach yields better returns.
In conclusion, the author encourages readers to redesign measurement systems that truly serve business objectives, avoid the listed traps, and continuously adapt the mindset to the evolving digital landscape.