How Software Development is like Veterinary Medicine

15 February 2015

My dad is a veterinarian at Three Chopt Animal Clinic in the Richmond area and has been for over 35 years. When I was growing up, I observed quite a lot of veterinary procedures, far more than the average person would. But I never seriously considered going into the field; the most common reason (usually given about me) is my general squeamishness around blood. I happen to love quotes and, like most fields, the medical field is rife with them. One of my dogs was unfortunately diagnosed this week with an incurable liver tumor, so his time with us will be short. When I was talking to my dad about the situation, he used a surgical axiom, and it struck me how applicable it was to software development, particularly debugging and bug triage, which is consuming me as we try to wrap up the new identity platform integration. In that spirit, this post will look at some common medical idioms and how they also apply to the seemingly unrelated field of software development.

A chance to cut is a chance to cure

For surgeons, this axiom is exactly what it sounds like: giving the surgeon a chance to perform the procedure is giving the patient a chance at a cure. Similarly, given a problem, most software developers leap to a code solution. When time is tight and shipping dates are on the line, I think most of us believe that relaxing code freeze and giving us a chance to do what we do best will save the day. And sometimes that’s true. In my current project, at least half a dozen times and counting, I’ve had to pull out what seemed like a miracle to work around the vendor’s design or a bug in the platform, or just do something to help move us along. Most of the time, I asked to be given the chance to write some code in contravention of code freeze or other similar process guidelines. And that’s worked out for us. But sometimes the cure does not require surgery or more code. And sometimes the situation cannot be salvaged. I would be remiss if I didn’t add the flip-side axiom: a chance to cut is a chance to kill. As developers, there is always the chance that we will introduce a bug, so when a situation is outside the bounds of normal processes we have to weigh that risk against the benefit of writing more code.

All bleeding eventually stops

I’m famous at work for saying this one in reference to struggling projects. I heard it from my dad, who learned it from his boss when he started practicing veterinary medicine. This phrase is a bit of gallows humor: if a patient is bleeding, either a veterinarian will control the bleeding and save the patient, or the patient will bleed out (at which point it will stop). This phrase always reminds me that a tough situation will always come to an end. Sometimes that means we’ll salvage the situation; sometimes it will end with an unfortunate result. But all we can do is focus on trying to save it while accepting that things might ultimately break against us.

If it's worth taking out, it's worth turning in

In medicine this refers to the practice of turning in to pathology any mass that is removed from a patient, for follow-up analysis of cause and effect. In this day and age, when unit testing is an almost universally accepted practice, I liken this saying to writing unit tests in response to bug reports. The most common question I get from developers at work is either “How can I start writing unit tests?” or “I’m in a code-base, where should I start writing unit tests?” I usually ask “Are there bugs you’re fixing?” Bugs to me are the easiest way to start writing unit tests. If you get a bug report, write a test that proves the bug exists. Then all you’ve got to do is write the code to make the test pass. That’s it. I’m training a guy on my team as a .NET developer. I got asked to look into a situation where data on-screen in an internal LOB application was being duplicated (but in the database it was fine). This code was a few years old, so it took quite a bit of effort to write a test to prove the bug in the code (a number of hours, in fact). But once I did that, I told my colleague “All you’ve got to do now is make the test pass.” I could have solved the problem in 5 minutes once I had the failing test, but I knew it would be a big win for him to make an actual production change in the code-base. I typically look askance at someone who tells me they fixed a bug but didn’t check in any unit tests: how do they know it’s solved? How do they know they didn’t introduce a new one? So just as submitting excised masses to pathology is a routine practice, developers should routinely submit a test with every bug fix.
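To make the "write the failing test first" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the order/notes data shape, and the cause of the duplication are invented for illustration and are not the actual LOB application code. The buggy version emits one row per matching note, so an order with two notes shows up twice on screen even though the underlying data is fine.

```python
import unittest


def rows_for_display_buggy(orders, notes):
    """Hypothetical buggy builder: emits one row per (order, note) pair."""
    rows = []
    for order in orders:
        matching = [n for n in notes if n["order_id"] == order["id"]]
        for note in matching or [None]:
            rows.append({
                "order_id": order["id"],
                "notes": [note["text"]] if note else [],
            })
    return rows


def rows_for_display(orders, notes):
    """Hypothetical fixed builder: exactly one row per order."""
    notes_by_order = {}
    for note in notes:
        notes_by_order.setdefault(note["order_id"], []).append(note["text"])
    return [
        {"order_id": order["id"], "notes": notes_by_order.get(order["id"], [])}
        for order in orders
    ]


class DuplicateRowsRegressionTest(unittest.TestCase):
    """Written from the bug report; it fails against the buggy builder."""

    def test_order_with_two_notes_is_shown_once(self):
        orders = [{"id": 1}, {"id": 2}]
        notes = [
            {"order_id": 1, "text": "first"},
            {"order_id": 1, "text": "second"},
        ]
        rows = rows_for_display(orders, notes)
        self.assertEqual([row["order_id"] for row in rows], [1, 2])


if __name__ == "__main__":
    unittest.main()
```

Pointing the test at rows_for_display_buggy reproduces the report; pointing it at the fixed function is the "make the test pass" step, and the test then stays in the code-base as the pathology report for that bug.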

Age is not a disease

In medicine, advanced age alone does not automatically rule out certain treatments. Yes, age can be, and is, a factor, but it’s not the only one. Similarly, just because a code-base has been around a long time, that does not mean it needs to be replaced with a newer one in a more contemporary technology. At one company where I worked, I supported all the systems for document collection. Most of these were Windows services that read and loaded documents received on feeds we purchased, such as EDGAR documents from the SEC. One of our most important ones was called PR Loader, which loaded press releases from a feed supplied by Acquire Media. Press releases were of the utmost importance to us for two reasons. One: press releases by SEC regulation are where companies break news, and it was worth big money to us (and our clients) to have access to them as fast as possible. And two: we supplied investor relations solutions (i.e. a company’s IR website) to some clients, and it was a matter of regulation (and subject to fines) if press releases didn’t reach their site at a specific time, and these same PRs came to us on the Acquire feed (don’t ask why they weren’t uploaded directly). And the PR Loader was written in a very old language that they told me was called X++ (which I’d never heard of). The loader had been written more than 10 years previously and hadn’t been changed in about 3 years when my team was doing a project that called for modifying it. We batted around the idea of rewriting it in C#, but after analysis, we found the changes we needed could be added as a separate module without touching the core logic. And since management did not feel comfortable taking the risk of rewriting it (or the time it would take plus the testing), it was faster to get a developer up to speed on the foreign language and have the one remaining developer (who at the time was a senior IT director) do a thorough code review. There is a lot of value inherent in legacy codebases. The cost of producing them originally has been amortized over a long period and represents tremendous value to the enterprise: that should not be casually discarded just because of its age.

Never be first to use a new treatment, or last

Most developers do not work at companies whose goal is to be on the bleeding edge of technology. The cost of being on the bleeding edge is that you will do a lot of bleeding, and for most enterprises that is simply not necessary. For most enterprises, if there is little documentation or knowledge out on the web on a topic, it should really give pause as to whether adopting that technology as part of a project is a sound business decision. Perhaps it is, but often the lessons learned from it will be painful and expensive. On the other hand, being too late to pick up a new technology puts you far behind the competition. If my company were still producing ASMX web services, it should really give us pause given the plethora of easier-to-use, widely accepted options out there. The real moral of this lesson: consider the full ramifications of adopting a technology, and make sure you factor in the support and knowledge available out in the world about it.

One CT scan is worth a thousand neurologists

Speculation is free and easy. And in many ways, fun. But seldom does it gain us what we really need to solve a difficult situation: data. In the course of implementing the new cloud SaaS identity platform, its performance has at times not been what we expected or can accept. And the vendor has given us a lot of stories about why it is the way it is or that it will be faster in production, or even that it’s faster than we think. And there were a lot of meetings to discuss everyone’s opinions on the matter. But I did something simple: I got us data. I added event logging on top of the client proxies that simply times all calls to the vendor’s services and writes them (plus the method call name) into the event log. From there it was easy to download the event logs and write an application to parse out the times and calculate standard descriptive statistics on them. And they were eye-opening. So much so that our IT management passed them to the vendor’s management with a simple question: how do you explain these numbers? And they simply couldn’t. Thus began weeks of load testing and performance optimizations on their part with the data my login application could gather to show whether it was getting better or not. So when there’s argument and speculation, collect some data and at least argue about something rigorous.
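For a sense of how little code "getting data" takes, here is a minimal sketch in Python. It is not the code from the project: the real version wrote to the Windows event log from the client proxies, whereas this one keeps timings in memory, and the names timed, summarize, and authenticate are hypothetical stand-ins. It times every call to a wrapped function, keyed by method name, and then computes standard descriptive statistics per method.

```python
import statistics
import time
from collections import defaultdict
from functools import wraps

# Elapsed seconds for every call, keyed by method name
# (an in-memory stand-in for the event log).
timings = defaultdict(list)


def timed(func):
    """Record how long each call to func takes, even if it raises."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


def summarize(samples_by_call):
    """Compute descriptive statistics for each method's timings."""
    summary = {}
    for name, samples in samples_by_call.items():
        summary[name] = {
            "count": len(samples),
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            # stdev needs at least two samples
            "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
            "max": max(samples),
        }
    return summary


@timed
def authenticate(user):
    """Hypothetical stand-in for a call to the vendor's service."""
    time.sleep(0.01)
    return True


if __name__ == "__main__":
    for _ in range(5):
        authenticate("alice")
    for name, stats in summarize(timings).items():
        print(name, stats)
```

Recording in a finally block means failed calls are timed too, which matters when the slow calls are also the ones that time out.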

There are plenty more medical sayings I could include here and maybe I’ll take another run at this one day, but hopefully you’ve enjoyed this only-slightly-joking comparison between the fields of medicine and software development.