Semantic Debt: A Detour into Technical Debt
Those of us who work in data know semantic debt when we see it. But we don’t explain the problem well to the people who hire us to fix it. This series of blog posts will help focus some attention on a problem we see pretty much everywhere.
You’ll also see the term data debt used to describe what happens when an enterprise doesn’t pay attention to its data. I don’t much like “data” as an adjective because the word can point to a lot of things. I call this problem “semantic debt” because it happens in the enterprise semantic layer. That’s something more people can locate, even when it’s just spreadsheets.
Surprisingly enough, if you check a search engine for semantic debt or even data debt, you’ll get very little back. That’s because, for the most part, problems in the semantic layer are assumed to be a minor kind of “technical debt,” as software engineers understand it. We’ve been automating data collection and storage for 50 years, and you’d hope we had better words for the kinds of things that go wrong after automation.
We’ll explain what technical debt is in this post.
Wikipedia defines technical debt as:
It seems obvious you can measure the cost of doing the right thing versus the expedient thing in terms of people’s time. At the end of your planning cycle, when decisions are being made about the next set of deliveries, you define technical debt as the delta in personnel cost, in developer-weeks, between the right solution and the expedient one. That personnel cost estimate is the minimum tech debt you create with the expedient solution. This is usually how an experienced Agile team will try to manage the metric.
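Here is a minimal sketch of that arithmetic in Python. The function name and the week counts are made up for illustration; they are not from any real planning session.

```python
def minimum_tech_debt_weeks(right_solution_weeks, expedient_solution_weeks):
    """Return the delta, in developer-weeks, between the right solution
    and the expedient one. That delta is the minimum tech debt booked.

    If the expedient solution is not actually cheaper, no debt is booked.
    """
    delta = float(right_solution_weeks) - float(expedient_solution_weeks)
    return max(0.0, delta)


# Hypothetical estimates: 12 developer-weeks for the right solution,
# 4 developer-weeks for the expedient one.
debt = minimum_tech_debt_weeks(12, 4)
print(f"Minimum tech debt booked: {debt} developer-weeks")
```

The floor at zero is my assumption: if the "expedient" option costs as much as the right one, there is nothing to book.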
People think tech debt is pernicious, and often it is. Sometimes product owners make expedient decisions they shouldn’t, and the developers and/or architects are right. Sometimes the developers and/or architects are wrong. But sometimes tech debt evaporates all on its own.
In the wild we usually limit the scope of technical debt to identifiable solution(s) we know work better than the actual one, so we aren’t getting too counterfactual. The right solution could have been implemented but wasn't. Some software teams would define tech debt as:
A team can say a system’s codebase is in technical debt when there are actual commits somewhere in the pre-production pipeline that would have made the system work better but that aren’t in production because of some product-owner-approved reason.
You need that precise definition because (a) no product manager wants to be blamed accidentally for tech debt, and (b) sometimes engineers will speculate. So we have to narrow the range of possibilities for how much better the system could be to what we know for sure. But that also means we can ask the developers how long it would take to put that code into production, and get an actual dollar-cost measurement of tech debt.
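As a rough sketch of that dollar-cost measurement, assume each held-back commit comes with a developer estimate of the weeks needed to ship it, and assume a flat weekly cost per developer. The class, the commit descriptions, and the $4,000 rate below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PendingCommit:
    """A commit that exists in the pre-production pipeline but was held back."""
    description: str
    weeks_to_ship: float  # developer estimate, in developer-weeks


def tech_debt_dollars(pending_commits, weekly_cost_per_developer):
    """Sum the estimated effort to ship every held-back commit and
    convert it to dollars at a flat weekly rate (an assumption)."""
    total_weeks = sum(commit.weeks_to_ship for commit in pending_commits)
    return total_weeks * weekly_cost_per_developer


# Hypothetical backlog of known-better code that is not in production.
pending = [
    PendingCommit("replace nightly batch export", 1.5),
    PendingCommit("add retry logic to loader", 0.5),
]

print(tech_debt_dollars(pending, 4000.0))
```

Only commits that actually exist go into the list, which is what keeps the number from drifting into speculation.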
But sometimes the expedient decision turns out to be the right one, and sometimes for the right reasons too. That’s the primary problem with counterfactual estimates like “tech debt”: They aren’t actual facts. We think the choice we made is a bad one, but we don’t actually know for sure yet.
Consider the possible future where you have to pay much more to retire the right solution, because you invested deeply and that locked you into a future you now don’t want to be in. Suppose in 2019 you booked tech debt by deferring the facilities management software upgrade you needed to manage your sprawling corporate campus. Are you still in debt in 2023 once you sell off the empty facilities? Sometimes tech debt evaporates on its own because the business changes faster than the software needs to.
Often enough, too, someone comes along with brand new software that completely reshapes your work, like the spreadsheet did for accountants or the CRM for customer service. Or, in our practice, the replication engine. Suppose that in 2019 you estimated six months of expensive engineering time to build a custom pipeline-and-dashboard combo for your product owner. The personnel time cost is now likely closer to two weeks if you use modern tools and an experienced data engineer.
Eventually, as we get better at knowing it when we see it, we realize that tech debt has to accumulate and persist over time before it’s genuinely negative. You need more than one estimate to work out whether the debt is real. So you can’t necessarily spot the exact point at which a system starts to accumulate pernicious technical debt. While in practice it only takes one decision to create technical debt, it takes several decisions, each adding inevitable rework cost, before you begin to see the real pernicious debt, the kind you regret.
The logic of tech debt is thus something like:
For any system S there is an $R, the “rework cost to improve the codebase,” and a $V, the “value provided by the current codebase,” and technical debt happens when $R > $V.
There are some limitations. First, we have to make sure we have an identifiable solution that would work better; we can’t speculate. And second, sometimes tech debt evaporates all on its own.
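Written as a check, that logic might look something like the sketch below. The function name and the dollar figures are illustrative assumptions, and both inputs are themselves estimates.

```python
def is_in_technical_debt(rework_cost, current_value):
    """Return True when $R (rework cost to improve the codebase)
    exceeds $V (value provided by the current codebase)."""
    return rework_cost > current_value


# Hypothetical figures, in dollars.
print(is_in_technical_debt(rework_cost=120_000, current_value=80_000))
print(is_in_technical_debt(rework_cost=30_000, current_value=80_000))
```

Because $R and $V both move over time, the answer for the same system can change from one planning cycle to the next.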
In my next post, I’ll walk through why, if technical debt = “rework cost,” then semantic debt isn’t technical debt.