Wednesday, June 20, 2012

Bullet Graphs


My last few posts here have been high-level stuff. It's important to think about the big picture. But, it's also important to think about the little picture. So, this week, I want to talk a little bit about just that – little pictures. Graphs, actually. We're making a few changes in how we report some of our results. You'll be seeing these graphs in upcoming projects. So, I wanted to take a little time and describe how they work and what patterns we can see in them. They have a lot more power than it would seem, for something this simple.
The Definition
The bullet graph was developed by Stephen Few. Few is a consultant in the area of business intelligence and data visualization. What he was trying to build was a way to visualize a data set that would play nice with how we see and perceive information and still pack a lot of information into a small space. (The circular dials that are so common in the dashboard metaphor, pretty much, do neither of these things.) His web site (http://www.perceptualedge.com) is well worth some time to read.
But, I digress. Let's talk some more about bullet charts. I've cooked up a few examples of different things that we see in the charts the way that we present them. The data is properly randomized, so we'll get realistic fluctuations, instead of pristine curves.

There are 4 primary elements to a bullet chart.
1. The scale. For the bullet graph to work, there'll always be a numeric scale attached. Usually it starts at 0. But, for our examples today, it runs from 3 to 18.
2. The background. The background on a bullet chart is important. That is where we encode the "good/better/best" sort of information that we need to make sense of it. These graphs are tuned to be readable, even when printed. The light grey band is the SLA. The dark grey band is the goal for that measurement. On this graph, and all of today's examples, we have an SLA of 16 and a goal of 5.
3. The measure we are interested in. That would be the thin, black bar down the middle. Sometimes this bar will be red. We use the black bar for the 95%-ile response time. This is a time when being "outside the box" is not a good thing. In the example above, we would be over SLA.
4. The secondary measure. This is the white diamond – the level bubble – floating in the middle of the black bar. We put the median on the bubble.
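To make those four elements concrete, here is a minimal Python sketch of the numbers behind one bullet graph. It is only an illustration: the sample data is invented, the function names are my own, and the nearest-rank percentile method is an assumption (the charts may well be built with a different percentile rule). Treating a bar that lands exactly on the SLA line as "within SLA" is also just a choice made for the sketch.

```python
import math
import statistics

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def bullet_summary(values, goal, sla):
    """Return the numbers one bullet graph would draw (hypothetical helper)."""
    p95 = percentile_nearest_rank(values, 95)  # the thin black bar
    if p95 <= goal:
        status = "within goal"
    elif p95 <= sla:
        status = "within SLA"
    else:
        status = "over SLA"
    return {
        "median": statistics.median(values),  # the white diamond
        "p95": p95,
        "status": status,
    }

# Invented response times, for illustration only.
sample = [4, 5, 5, 6, 6, 7, 7, 8, 9, 17]
summary = bullet_summary(sample, goal=5, sla=16)
print(summary)
```

The goal of 5 and SLA of 16 match the bands used in today's examples; everything else here is made up.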
That seems simple enough, right? The fact that it is so simple is one of the things that I like about this particular chart. You can process "good", "bad" or "say what?" in a blink, without having to parse monstrous tables of data. One of the great ironies of data visualization is the notion of "pretty". People fret over "why do I need to spend time making a pretty chart?" We don't. Not really. What we need is simple and clear. It just happens that the things our brains scan as "clear" also scan as "pretty". Go figure.
Patterns
Let's take a look at a couple of examples and see what else we can see in these little charts.

Our first chart, Data Set A, is pretty typical in a number of ways. We have our goal and requirement (SLA) bands in there. The 95%-ile bar, though, is right at the requirement. That is something that we would want to investigate further. The other thing to notice is that the median bubble is riding right about the middle of the chart. Let's take a look at a histogram of the data along those same buckets and see how it looks.

Here, I've taken the histogram of the data that went into building the bullet chart and aligned it with the bullet chart itself. The green and red lines just help us to see where the two graphs align. We know that half of the data has to be below the bubble (and the green line) – that's the definition of median. I promised random data, with all of its warts and bumps, and here it is. There are a few bumps in the histogram. But, overall, it has the sort of shape that we would expect from a normal distribution – big hump in the middle, median line bisects the hump, tails on both ends. We have a few items above the 95%-ile, just like we would expect. Overall, this is the sort of pattern that you'd expect to see.
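If you want to line a histogram up against the chart yourself, the bucketing is the only fiddly part. The sketch below is one way to do it, assuming buckets one unit wide across the 3 to 18 scale. The bucket width, the function name and the sample data are all my own assumptions, not the data behind Data Set A.

```python
import statistics

def histogram(values, low=3, high=18, width=1):
    """Count values per bucket across the chart's scale.

    Buckets are [low, low + width), and so on; a value exactly at
    `high` is counted in the last bucket. Values off the scale are ignored.
    """
    n_buckets = (high - low) // width
    counts = [0] * n_buckets
    for v in values:
        if low <= v <= high:
            idx = min(int((v - low) // width), n_buckets - 1)
            counts[idx] += 1
    return [(low + i * width, c) for i, c in enumerate(counts)]

# Invented response times, for illustration only.
sample = [4, 5, 5, 6, 6, 7, 7, 8, 9, 17]
buckets = histogram(sample)
for start, count in buckets:
    print(f"{start:>2} | {'#' * count}")

# Sanity check on the median: at least half the data sits at or below it.
median = statistics.median(sample)
at_or_below = sum(1 for v in sample if v <= median)
print(f"median={median}, at or below: {at_or_below} of {len(sample)}")
```

Printing the buckets as rows of `#` is a crude stand-in for the real histogram, but it is enough to see where the hump sits relative to the median.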
So, let's take a look at a couple of not-so-normal patterns.

We've seen this graph before. Right away, we would want to look into a transaction that looked like this because it is over SLA. But, there is something else interesting here as well. Look at the median bubble. See how it is riding way to the right? That's a pattern that warrants some investigation. So, let's put it beside its frequency distribution graph and see what we see.

Now, we can see something more interesting is going on. To begin with, the bulk of the histogram is piled up on the right. The green line still bisects that graph; it has to. But, our measurements have a definite floor. Nothing measured here was less than 8. Why is that? That sort of pattern is characteristic of some sort of timeout or other failure going on. Find what is causing THAT, and you'll make a big change in this result.
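Both of those signals, the bubble riding to the right and the floor under the measurements, are easy to pull out of the raw numbers. Here is a rough sketch, again with invented data and hypothetical function names. It assumes the same 3 to 18 scale as the charts above.

```python
import statistics

def median_position(values, scale_low=3, scale_high=18):
    """Where the median sits on the chart: 0.0 is the left edge, 1.0 the right edge."""
    median = statistics.median(values)
    return (median - scale_low) / (scale_high - scale_low)

def measurement_floor(values):
    """The smallest value observed; nothing was measured below this."""
    return min(values)

# Invented response times with a floor at 8, for illustration only.
slow = [8, 9, 12, 13, 13, 14, 14, 15, 16, 17]
position = median_position(slow)
floor = measurement_floor(slow)
print(f"median sits {position:.0%} of the way across the scale, floor is {floor}")
```

What counts as "way to the right" is a judgment call. The sketch just reports the position and leaves the threshold to you.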
Let's take a look at one more.

Under normal circumstances, a transaction with this profile would not get investigated. It is below SLA, and the median looks pretty good. Odds are that any reasonable test would turn up much more interesting things to investigate than this. Still, the pattern underlying this graph is something that we see often enough in things that don't meet SLA. So, let's take a look at it and see what we see.

Notice how the curve is bunched up on the left. (A statistician would call this right-skewed, since the long tail runs off to the right.) Sure, half to the left of the green line and half to the right. We know that. But, the big grouping is to the left. The ones to the right are "like too little jam spread over too much bread". When we see this sort of pattern, it always means that there is something fundamentally different about the transactions that fell in the big grouping to the left as opposed to the thin group to the right. What's the difference? What is it that is either making the one group clump, or spreading the other group out? If you can find that answer – and it is usually just one variable – then you really have a handle on this transaction and on how to either improve the transaction or improve the test.
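One way to hunt for that one variable is to split the measurements by a candidate attribute and compare the groups. The sketch below does that with a made-up `cached` flag. The flag, the record layout and the numbers are all hypothetical; the idea is only that a variable which separates the tight clump from the thin tail will show up as two groups with very different medians and spreads.

```python
import statistics
from collections import defaultdict

def spread_by_group(records, key):
    """Summarize response times per value of `key` (a hypothetical candidate variable)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["seconds"])
    return {
        name: {
            "count": len(times),
            "median": statistics.median(times),
            "range": max(times) - min(times),
        }
        for name, times in groups.items()
    }

# Invented records, for illustration only: a tight clump and a thin tail.
records = [
    {"seconds": 4, "cached": True},
    {"seconds": 4, "cached": True},
    {"seconds": 5, "cached": True},
    {"seconds": 5, "cached": True},
    {"seconds": 5, "cached": True},
    {"seconds": 6, "cached": True},
    {"seconds": 8, "cached": False},
    {"seconds": 11, "cached": False},
    {"seconds": 14, "cached": False},
]
by_cache = spread_by_group(records, "cached")
for name, stats in by_cache.items():
    print(name, stats)
```

If the groups come out looking alike, that attribute is not the one; swap in the next candidate and try again.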
