Assumptions are a fundamental part of software development. In some cases we can make those assumptions explicit with actions such as defining data types or adding assertions. In other cases, those assumptions are implicit and must be validated and maintained without the help of the software itself.
For instance, consider an application that starts with just US users. The application has a users table in the database with a phone_number field. This field is used throughout the application and, because every user so far has been in the US, the code always assumes that it holds a US phone number. This assumption may be embedded in places such as calls to SMS messaging services and in the UI. You might see smsService.sendText(user.phone_number, "US", "hello ...") as well as places in the UI where the phone number is formatted by prefixing the +1 country code and using the US format (XXX) XXX-XXXX.
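As a minimal sketch of what making this assumption explicit could look like, the US-only assumption can be carried in a type and checked by an assertion at the boundary. The names UsPhoneNumber, asUsPhoneNumber, and formatUsPhoneNumber are hypothetical and exist only for illustration, and the 10-digit check is a simplification rather than full phone number validation.

```typescript
// Hypothetical branded type: a string that has been checked to look like a US number.
type UsPhoneNumber = string & { readonly __brand: "UsPhoneNumber" };

// Assertion at the boundary: rejects anything that is not 10 digits after
// stripping punctuation. This is a simplification, not real validation.
function asUsPhoneNumber(raw: string): UsPhoneNumber {
  const digits = raw.replace(/\D/g, "");
  if (digits.length !== 10) {
    throw new Error(`Expected a 10-digit US phone number, got "${raw}"`);
  }
  return digits as UsPhoneNumber;
}

// Formatting code can only be called with a value that passed the check,
// so the US assumption is visible in the signature.
function formatUsPhoneNumber(phone: UsPhoneNumber): string {
  return `+1 (${phone.slice(0, 3)}) ${phone.slice(3, 6)}-${phone.slice(6)}`;
}
```

With something like this in place, every function that relies on the assumption names it in its parameter type, so a search for UsPhoneNumber finds the places that depend on it.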
Some time later, the application expands to handle Polish users as well. Now every place that assumes the phone number is a US number needs to be updated, but finding all of those places is non-trivial. Most modern IDEs can find all references to a field on a given type, which would be useful, but the results would also include usages that need no country-specific change. They also wouldn’t include usages where the field is passed to another object or where the backend sends the field to the frontend. We could do a code-wide search for phone_number, but again this would return every usage of the field rather than just the ones making the assumption. This would be even less feasible if the field had instead been called just number, since the search would then match any piece of code or comment containing the text number.
This type of problem could be solved by having a registry of assumptions and the areas where they’re used. Such a registry would make the above example much easier and would also tell us what assumptions we are making in the application. One straightforward way to do this would be to leave an assumption tag in a code comment, such as // @assumption:phone-number-us. This would certainly help with the refactoring from the above example, but it wouldn’t help in cases where the assumption is relied on outside the code base. The assumption might be made in places where we can’t simply leave a comment with the tag, such as in the database schema or in the configuration of external services. In those cases, we would need a registry of assumptions that exists outside the code itself, for example in a documentation system such as Confluence.
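To make the comment-tag idea concrete, here is a rough sketch of how such tags could be collected into a registry. It assumes tags follow the form @assumption:<id>, with ids made of lowercase letters, digits, and hyphens. The function collectAssumptions, the AssumptionUsage shape, and the sample file names are all hypothetical, and the file contents are passed in as strings to keep the example self-contained.

```typescript
// Hypothetical record of one place where an assumption is relied on.
interface AssumptionUsage {
  assumption: string;
  file: string;
  line: number; // 1-based line number of the tag
}

// Scans file contents for "@assumption:<id>" tags and groups usages by id.
// `files` maps a file path to that file's text.
function collectAssumptions(
  files: Record<string, string>
): Record<string, AssumptionUsage[]> {
  const registry: Record<string, AssumptionUsage[]> = {};
  for (const file of Object.keys(files)) {
    const lines = files[file].split("\n");
    lines.forEach((text, index) => {
      // A fresh pattern per line, so the global regex state never carries over.
      const tagPattern = /@assumption:([a-z0-9-]+)/g;
      let match: RegExpExecArray | null;
      while ((match = tagPattern.exec(text)) !== null) {
        const id = match[1];
        if (!registry[id]) {
          registry[id] = [];
        }
        registry[id].push({ assumption: id, file: file, line: index + 1 });
      }
    });
  }
  return registry;
}

// Illustrative input only; these files do not exist anywhere.
const sampleFiles: Record<string, string> = {
  "sms.ts":
    'smsService.sendText(user.phone_number, "US", "hello ..."); // @assumption:phone-number-us',
  "profile.ts":
    "// @assumption:phone-number-us\nconst label = formatPhone(user.phone_number);",
};
const sampleRegistry = collectAssumptions(sampleFiles);
```

A scan like this only sees what lives in source files. Entries for the database schema or external service configuration would still have to be added by hand, which is where the maintenance cost comes in.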
While there could certainly be significant value in having such a registry of assumptions, there would also be a non-trivial cost to maintaining it and plenty of potential pitfalls. From personal experience, it also seems like the time lost from having no direct way of finding where assumptions are made is probably less than the time it would take to actually track assumptions. Still, it seems possible that a strategy could exist with a cost-to-value ratio that justifies assumption tracking.
I’m curious to know what thinking has gone into this and whether there are any services or methodologies in use. Googling led me to this “Assumption Management in Software Development” paper from Carnegie Mellon, but the approach it offers seems to be XML embedded in code comments, which would be both too verbose and limited to tracking assumptions as they exist in the code base alone.