Every once in a while the value of a coding practice like unit testing is reinforced, yesterday was one of those days so I wanted to share with my future self in case I ever start to doubt. Also, the set of technical practices that I first learned as XP (Extreme Programming), are not as common as I used to think, so this is my small contribution to improving that.
The Problem
The code was using an ID received from an external source as an internal unique identifier. The identifier was assumed to be unique across objects of the same type and consistent for the same object across all users. In other words if user one accessed item A from the external source and user two also accessed item A they would both receive the same ID. This seemed like a reasonable assumption for two reasons. First, the ID in question was a UUID, therefore likely completely unique across any scope. Second, the documentation stated that the ID was consistent across sessions.
My first mistake is that I trusted the documentation and didn’t verify independently that it was true. Based on this misplaced confidence I used the ID to connect the items from the external source with extra information maintained internally and accessed by different users. After much code was written, databases were created, REST services authored, etc. the perfidy of the documentation was exposed. The ID was different for each user that accessed the external source.
To fix that mistake all of the code now using the external ID would have to use a different ID instead. The easy change would be to just change the initialization of the ID to use a different value. Unfortunately, even though the external ID didn’t fit the internal use it was still needed to refer back to the original external source. That meant that instead of just initializing with a new value the code needed to track both an internal and external ID for each item.
First Defence – Don’t Use Primitive Types
One lesson that I have, slowly, learned over time is to avoid using primitive types in my code whenever there is meaning. The ID from the external source is not a string, or a UUID, instead it should be given a type that defines how it is used internally, not what type it is externally. Since I was using the external ID to refer to objects of type Foo, it became a FooID. Everywhere it was used in the code it was passed as a FooID instead of a string, right down to the DB layer where it was converted from/to that type.
FooId already meant the stable internal identifier for the information and was used that way throughout the code. For that reason it made sense to change its initialization to a different value, something that was stable across users. That meant that a new type was needed to keep track of the external ID, FooReferenceId. Here’s where the value of using explicit types shows. By changing the method signature of the methods that now needed the external ID to FooReferenceId instead of FooId, the compiler found most of the problems for me. Note that I use, and strongly prefer, statically typed languages, YMMV in a dynamic language.
Second Defence – Test Coverage
Over the life of this project I’ve been doing a pretty decent job of test driven development, keeping the behavioural coverage high. After making the compiler suggested changes I ran the test cases and several tests failed. Mostly it was test data problems where the test was providing an external ID when an internal ID was expected. To catch these types of failures don’t forget to use decent random data for your test data. Using “1” or a copy of the same UUID for every ID in the test data would cause false positives in subsequent tests.
In at least one case though I’d copied and pasted a string conversion and ended up using the same DB field for both the internal and reference ID. The test case failure made that obvious so it was simple to fix. The value of tests to catch simple errors like that shouldn’t be underestimated. Finding something like that in the full running application can be surprisingly hard.
After fixing the test data and the copy/paste error the tests were green and I felt pretty confident.
Test Coverage Expanded
The first run of the resulting build was pretty good but there was one failure. The REST call to get the item from the external source was failing because it was still using the internal ID. Types didn’t help there because that was one of the points where the internal value had to be converted to a string for use in the REST call. Now that I’d seen an example of a failure that the tests didn’t catch I added a new test case that covered it and ran it to ensure that it failed. Then I made the change to use the new FooReferenceId value in the REST call and re-ran the test case. With it passing I was now protected against future mistakes in that area.
Taking the time to improve tests each time a problem crops up makes future problems less likely and easier to fix.
Retrospective
Overall it was a pretty good outcome. I made a rookie data analysis mistake but at least I covered my tracks well with decent data typing and good test coverage. The code came back together quickly and the coverage is better than before the change. I even found a couple of decent refactoring opportunities and got rid of forty or so lines of code along the way. I’ll give myself a solid B on that effort.
Don’t forget to reinforce the good to yourself as part of your retrospective. Most people’s brains are more than happy to dwell on their failures, consciously reviewing the positives can help reduce the erosion of self.