Monday, March 28, 2011

The Operation was a Success

It's likely that you're pretty bad at diagnostics, i.e., figuring out what's wrong when something doesn't work the way you want it to.

Your diagnostic skills come into play whenever your car won't start or the television plays without sound. You use your diagnostic skills to understand why you're not quite feeling yourself today or why you gained a couple of pounds last week. You use your diagnostic skills to determine why your child isn't doing well in school or what caused your relationship to lose its spark. You use diagnostics to tune a guitar or to determine what to wear or to edit a story.

Being able to quickly and accurately diagnose problems is one of the most critical skills that you can acquire. It can improve your relationships, save you gazillions on healthcare costs, win you promotions, make you money, and save you tons of time. And yet, you're probably terrible at it.

One of the reasons you're terrible at diagnostics is that you rarely ever diagnose. When something doesn't work do you stick with it until you figure it out, or are you quick to call someone or to google an answer? Once you find a solution, do you simply reapply it the next time the problem occurs or a similar problem occurs, or do you take time to figure out why it happened again? Most likely your diagnostic skills have been supplanted by look-up-and-remember skills.

Look-up-and-remember works much of the time, because things that break tend to break in the same way repeatedly and because, in a browser-on-every-phone, socially-networked world, access to experts is just a sweep-of-the-thumb away. However, there are many cases where look-up-and-remember doesn't work so well.

Cause and Effect
One of the hardest things to learn in diagnosing systems is:
oftentimes, the thing that's not working
is not the thing that is broken.
The corollary is:
the thing that is broken
may appear to be working just fine.
For example, if a child is struggling with math, we often do things to help her with math. However, the cause of her challenges with math may have nothing to do with math. It could be that her eyesight needs correcting. It could be that her math class is the one with the kid who picks on her incessantly. It could be that math class occurs just before lunch when her blood sugar is crashing or just after lunch when she's hyped up on preservatives, sugar and caffeine.

More often than not, the problem you see is the side effect of something that appears to be unrelated or something that in isolation appears to be working just fine.

It's a System Thing
Last week, I was working with a group of engineers trying to solve a curious problem with a large database system. The system has several software components that operate independently of one another. One program uploads new data to the database. A second program retrieves the data and generates reports. A third extracts the data, analyzes it and stores the analysis in yet another system.
In a large system all the parts can work
and yet the system can fail.

The problem was that whenever the analysis program connected to the database to retrieve new data, the database told it that there was none. Looking at the database directly, one could see lots of new data. However, the analysis program didn't. After a week in which no one had made any progress in diagnosing what was wrong, I decided to figure it out. So, I started asking questions.

Everyone assured me that the system had been functioning just fine for months and that the problem was a new phenomenon. Each of the engineers told me that "his" program was working the way it was supposed to work, proudly demonstrating that it indeed did what it should. And yet, the system wasn't working. So, I donned my virtual deerstalker and started doing a little diagnostic detective work.

The first thought that occurred to me was that if the components were working individually, the failure must result from some interaction when they run concurrently (at the same time). So I asked, "What happens when you run everything at once?"

Silence and blank stares that shouted, "Huh?"

I explained, "Diagnosing a system by trying components sequentially is like diagnosing traffic jams by allowing one car on the road at a time. Each car does just fine. The problem occurs when you bring them all together."

Apparently this hadn't occurred to the guys working the problem. So we launched two programs at about the same time and voilà, the second one immediately managed to disconnect the first one.
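To make the traffic-jam point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `SingleSessionDatabase` class, the shared account name, and the one-session-per-account rule are assumptions invented for illustration, not the actual cause in the system described above. It only shows how two programs can each pass a test run on their own and still knock each other over when they overlap.

```python
class SingleSessionDatabase:
    """Hypothetical database that allows one live session per account."""

    def __init__(self):
        self.rows = []
        self.active = {}  # account name -> id of the session currently allowed

    def connect(self, account, session_id):
        # Assumption: a new login silently replaces any earlier session.
        self.active[account] = session_id

    def _check(self, account, session_id):
        if self.active.get(account) != session_id:
            raise ConnectionError(f"{session_id} was disconnected")

    def insert(self, account, session_id, row):
        self._check(account, session_id)
        self.rows.append(row)

    def fetch_all(self, account, session_id):
        self._check(account, session_id)
        return list(self.rows)


def run_one_at_a_time():
    """Each program runs to completion before the next one starts."""
    db = SingleSessionDatabase()
    db.connect("shared", "uploader")
    db.insert("shared", "uploader", "row-1")
    db.connect("shared", "analyzer")
    return db.fetch_all("shared", "analyzer")


def run_together():
    """The analyzer logs in while the uploader is still working."""
    db = SingleSessionDatabase()
    db.connect("shared", "uploader")
    db.connect("shared", "analyzer")  # replaces the uploader's session
    try:
        db.insert("shared", "uploader", "row-1")
    except ConnectionError as exc:
        return str(exc)
    return "ok"
```

Tested one car at a time, `run_one_at_a_time()` returns the uploaded row and everyone can proudly demonstrate a working program. Only `run_together()` reveals the failure, which is why the test has to include the interaction and not just the parts.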

Was It Really Working?
Here's another important diagnostic tool:
Verify that the system ever worked!
I can't tell you how many times I've helped people save what hair remained by asking the question, "Are you sure this ever worked?" and following up an affirmative answer with, "How do you know that?"

Oftentimes systems that appear to be working are not. You never really know how well your accounting and budgeting are working until you have limited funds. You can't tell how great your sales team is if you have a product that everyone wants. You can't tell how well your marketing is working if you have a sales team that is knocking down doors and taking names. You can't tell how well your reserve gas tank works if you've never actually run out of gas.

Oftentimes we think things are working when in fact they've never really been tried.

In the case of last week's database system, I decided to check the log files of one of the applications I'd been assured had been working to see whether or not it really had. When I got on the system and looked at the files, lo and behold, they weren't there. It turned out that rather than appending new information to the files, the software had been overwriting old data with new data. Of course, this told me four things:
  1. The software had never been working properly.
  2. Even if it had been working, there was no evidence.
  3. No one had been checking the log files to verify the software was working.
  4. The guys who'd assured me that it was working were unreliable.
That little tidbit changed everything.
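The difference between appending and overwriting is often a single character. Here's a small sketch in Python, assuming a log is just a text file written once per run; the function names, file names and messages are made up for illustration and have nothing to do with the actual software in the story.

```python
import os
import tempfile


def write_log(path, message, mode):
    # mode "w" truncates the file before writing; mode "a" adds to the end.
    with open(path, mode) as log:
        log.write(message + "\n")


def read_lines(path):
    with open(path) as log:
        return log.read().splitlines()


def demo():
    """Write three runs to two logs, one opened with "w" and one with "a"."""
    with tempfile.TemporaryDirectory() as folder:
        overwritten = os.path.join(folder, "overwritten.log")
        appended = os.path.join(folder, "appended.log")
        for message in ("run 1 ok", "run 2 ok", "run 3 ok"):
            write_log(overwritten, message, "w")
            write_log(appended, message, "a")
        return read_lines(overwritten), read_lines(appended)
```

After three runs, the overwritten log holds only the last entry while the appended log holds all three. A log that only ever shows the most recent run can't tell you whether anything worked last week.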

Fix It
Whether you want to lose weight or become a better musician or get your car to run on cold mornings or keep your computer from crashing late at night, the first rule in figuring out what's wrong is:
fix what you know is broken.
It may turn out that none of the things you can see are broken is the root cause of your problem. However, the cumulative effect of several things being broken at once can cause problems that no single one would, and oftentimes the tiny, seemingly unrelated thing that is broken turns out to be the basis for all the other problems.

Experts
The problem with many experts is that they're recognized as experts because of what they know and not because of what they can do. Experts tend towards the look-up-and-remember approach and they can be great resources in regard to information. However, they're often terrible at diagnostics, and in particular, in diagnosing something they've never seen before.

What often happens is that the expert will recast the problem into something that she knows how to fix, rather than adjust her tool set to what's really going on.

Diagnostic Monday
So, how about steering clear of Google today, avoiding asking for answers and tuning up those latent diagnostic skills? Take something that you've been trying to figure out for a while, dig out your virtual deerstalker cap and calabash pipe and perform a little diagnostic detective work of your own.

Happy Monday,
Teflon
