Kosher Visualization

I’m working on my visualization presentation (OOW unconference, October 12, 4PM – don’t miss it!), and one of the topics I keep rethinking is how to present results of research in a visual way, especially when the report or presentation is for non-technical management.

It is perfectly easy to take true data and arrange it on a chart in a way that “proves” whatever it is you want to show. But is it wrong? Is there one true way to display data, with everything else a lie and a distortion?

Let’s look at a handy example: http://www.cunningham.me.uk/wordpress/2007/07/11/how-to-lie-with-statistics-as-shown-by-the-bbc/

In the example (look at the graphs with the red lines), scientists measured a temperature increase of 0.5 degrees over a period of 30 years. The blogger thinks that the first chart lies, because it makes a tiny change look scary by changing the scale of the chart. He then shows the “correct” chart, where the scale is changed to the point that the temperature increase is barely noticeable.

But is it really that straightforward? Another point of view is that a temperature change of 0.5 degrees over 30 years is a huge deal and is indeed scary, and that the graph was scaled to make the correct scientific view more visible. From that point of view, rescaling the graph actually obscures an important truth and misleads the audience.
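To make the effect of the scale concrete, here is a minimal sketch of the arithmetic. The 0.5 degree change comes from the example; the two axis ranges are hypothetical values I picked for illustration, not the ranges used in the linked charts.

```python
def visible_fraction(change, axis_min, axis_max):
    """Return the share of the vertical axis that a change of this size occupies."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    return change / span


change = 0.5  # degrees, taken from the example above

# Hypothetical axis ranges, chosen only to illustrate the point.
tight = visible_fraction(change, 13.5, 14.5)  # axis spans 1 degree
wide = visible_fraction(change, 0.0, 20.0)    # axis spans 20 degrees

print(f"tight axis: {tight:.1%} of the chart height")
print(f"wide axis:  {wide:.1%} of the chart height")
```

The data is identical in both cases. Only the share of the chart that the change occupies differs, and that share is entirely the chart maker’s choice.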

What is the truth? I’m just a simple DBA, I’ve no idea about global warming.

But when I do research about a performance issue and then I write a report about the results of my research, and I use charts to demonstrate the important points in my results – I find it legitimate to scale the graphs in a way that makes the important points as clear as possible. If my graphs don’t demonstrate my points in the clearest way possible then I’m doing a bad job.

However, to keep myself ethical, I follow a few rules about these modifications:

  1. You are 100% sure, to the best of your knowledge and research, that the point you are making is indeed correct. You are not allowed to hide data just because you did not do a very good job at collecting or analyzing it.
  2. You mention the modification in the report or presentation. You make the original data available to anyone who wants to verify your results.
  3. You have very good reasons for the modifications you did and you feel comfortable presenting them to anyone who questions your charts.
  4. You will be extra careful when rescaling data that is displayed as two dimensional shapes, and make sure that the proportions between the rescaled areas indeed reflect the proportions of the data. In 2D, stretching both the width and the height by the same factor multiplies the area by the square of that factor, so small changes get exaggerated.
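The last rule is easier to see with numbers. This is a minimal sketch; the function names are mine and are not taken from any charting tool. It assumes a shape (a circle, an icon) whose width and height are both stretched by the same factor.

```python
import math


def area_ratio_when_both_sides_scaled(value_ratio):
    """Area ratio the reader sees if width AND height are scaled by value_ratio."""
    return value_ratio ** 2


def side_scale_for_honest_area(value_ratio):
    """Factor to apply to each side so that the AREA grows by value_ratio."""
    if value_ratio < 0:
        raise ValueError("value_ratio must not be negative")
    return math.sqrt(value_ratio)


value_ratio = 2.0  # the second data point is twice the first

seen = area_ratio_when_both_sides_scaled(value_ratio)
honest_side = side_scale_for_honest_area(value_ratio)

print(f"naive scaling shows an area {seen:.1f}x larger")
print(f"scale each side by {honest_side:.3f} to show an area {value_ratio:.1f}x larger")
```

So if one value is twice another, scale each side of the shape by the square root of 2, not by 2.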

You’ll notice that my advice is somewhat subjective. That’s because I don’t really see an objective way to differentiate between “highlighting an important truth” and “making mountains out of molehills”. You did the research, you know if a 0.002ms increase in storage network round-trip time is a big deal or not, and you should decide how to display it. Obviously, if you manage to find a clear and unquestionable way to display your results, so much the better.


10 Comments on “Kosher Visualization”

  1. Brett Schroeder says:

    To get some answers to these questions, and to learn just how to “make better graphs”, I suggest reading some of Edward Tufte’s books. Start with The Visual Display of Quantitative Information.

    • prodlife says:

      You know, I have some kind of allergic reaction to Tufte’s book. His examples are amazingly beautiful, which is exactly the problem: it’s too beautiful. I’m just a simple DBA, not a designer.

      I use Cleveland’s visualization book. It is about visualization techniques for scientists. His examples are all from various research areas, and the charts are created with straightforward SPSS, so I find it more practical.

  2. Jorge Camoes says:

    It’s commonly accepted that you shouldn’t change the scale of a column chart (should start at zero) but you can do it with line charts, scatter plots, etc.

    I usually follow this rule to define the scale: minimum value x 80%; maximum value x 120%. Then I make some adjustments (start at 100 instead of 97, for example).

    Cunningham’s charts have a very low resolution and really don’t make any sense because zero is only theoretical.

    In your example, an increase of 0.5 degrees over a period of 30 years looks irrelevant to the layman’s eye. So you have to add context (what are the consequences of current trends?). What happens at the global scale when temperature increases by 0.5 degrees? Will the pole start melting?

    The readers must see the patterns (it’s all about patterns) and compare them to other patterns and thresholds.

  3. Jorge Camoes says:

    Your readers may be interested in a post I wrote about how to avoid lying with charts.

  4. Gary says:

    But with temperatures, the ‘zero’ is just as arbitrary as any other value. It’s the freezing point of pure water, but so what? Has that got anything to do with the data you are presenting? Zero Fahrenheit is closer to the freezing point of sea/salt water, which might make some sense when speaking of global warming.

    • prodlife says:

      I think it’s a common practice to start the columns at ‘zero’, show a visual gap or tear in the columns, and then start again at whatever value makes sense. It kind of draws the eye to the fact that we are not starting from 0. Not sure if this makes more sense than any other arbitrary starting point.

    • Adi Stav says:

      So not only is zero arbitrary, but it also depends on your frame of reference. If you graph by Kelvins (where the zero actually means something), you’ll hardly be able to spot even a 10 degree difference. So who’s to tell what’s an important truth and what’s a molehill? We’re all just ants on a ball of mud, and we highlight what *we* think is important.

