Kosher Visualization
Posted: September 23, 2009
I’m working on my visualization presentation (OOW unconference, October 12, 4PM – don’t miss it!), and one of the topics I keep rethinking is how to present results of research in a visual way. Especially when the report or presentation is for non-technical management.
It is perfectly easy to take true data and arrange it on a chart in a way that “proves” whatever it is you want to show. But is it wrong? Is there one true way to display data, with everything else a lie and a distortion?
Let’s look at a handy example: http://www.cunningham.me.uk/wordpress/2007/07/11/how-to-lie-with-statistics-as-shown-by-the-bbc/
In the example (look at the graphs with the red lines), scientists measured a temperature increase of 0.5 degrees over a period of 30 years. The blogger thinks that the first chart lies – because they make a tiny change look scary by changing the scale of the chart. He then shows the “correct” chart where the scale is changed to the point that the temperature increase is barely noticeable.
But is it really that straightforward? Another point of view is that a temperature change of 0.5 degrees over 30 years is a huge deal and is indeed scary, and that the graph was scaled to make the correct scientific view more visible. By rescaling the graph you are actually obscuring an important truth and misleading the audience.
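To put a number on how much the axis choice matters, here is a minimal Python sketch. The 0.5 degree change comes from the example; the two axis ranges and the `visible_fraction` helper are my own assumptions for illustration, not values taken from the charts in the linked post.

```python
def visible_fraction(change, axis_min, axis_max):
    """Return the fraction of the y-axis height that a change of this size occupies."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    return change / span


change = 0.5  # degrees, the increase reported in the example

# Hypothetical axis ranges, chosen only to illustrate the effect.
tight = visible_fraction(change, 13.5, 14.5)  # y-axis spans 1 degree
wide = visible_fraction(change, 0.0, 25.0)    # y-axis spans 25 degrees

print(f"tight axis: change fills {tight:.0%} of the chart height")
print(f"wide axis:  change fills {wide:.0%} of the chart height")
```

The data is identical in both cases; only the share of the chart it occupies differs. Neither number is the "true" one, which is the whole problem.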
What is the truth? I’m just a simple DBA, I’ve no idea about global warming.
But when I do research about a performance issue and then I write a report about the results of my research, and I use charts to demonstrate the important points in my results – I find it legitimate to scale the graphs in a way that makes the important points as clear as possible. If my graphs don’t demonstrate my points in the clearest way possible then I’m doing a bad job.
However, to keep myself ethical, I follow a few rules about these modifications:
- You are 100% sure, to the best of your knowledge and research, that the point you are making is indeed correct. You are not allowed to hide data just because you did not do a very good job at collecting or analyzing it.
- You mention the modification in the report or presentation. You make the original data available to anyone who wants to verify your results.
- You have very good reasons for the modifications you did and you feel comfortable presenting them to anyone who questions your charts.
- You will be extra careful when rescaling data that is displayed as two-dimensional shapes, and make sure that the proportions between the rescaled areas indeed reflect the proportions of the data. In 2D the area grows with the square of the scale factor, so doubling both the width and the height of a shape quadruples its area.
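The last rule, about two-dimensional shapes, is the only one that reduces to arithmetic, so here is a minimal Python sketch of it. The `radius_for_value` helper and the sample numbers are hypothetical, made up for illustration; the idea is simply to scale a circle's radius by the square root of the data ratio so that the area, not the radius, tracks the data.

```python
import math


def radius_for_value(value, reference_value, reference_radius):
    """Return a radius such that circle area is proportional to value."""
    if value < 0 or reference_value <= 0:
        raise ValueError("value must be >= 0 and reference_value must be > 0")
    return reference_radius * math.sqrt(value / reference_value)


# Hypothetical data: a measurement that doubled, from 100 to 200.
reference_value, new_value = 100.0, 200.0
reference_radius = 10.0

# Naive approach: scale the radius by the data ratio.
naive_radius = reference_radius * (new_value / reference_value)
# Area-preserving approach: scale the radius by the square root of the ratio.
fair_radius = radius_for_value(new_value, reference_value, reference_radius)

# Area is proportional to radius squared, so compare squared radius ratios.
naive_area_ratio = (naive_radius / reference_radius) ** 2
fair_area_ratio = (fair_radius / reference_radius) ** 2

print(f"data ratio:       {new_value / reference_value:.2f}")
print(f"naive area ratio: {naive_area_ratio:.2f}")
print(f"fair area ratio:  {fair_area_ratio:.2f}")
```

With the naive scaling, a value that doubled is drawn with four times the area; with the square-root scaling the drawn area doubles along with the data.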
You’ll notice that my advice is somewhat subjective – that’s because I don’t really see an objective way to differentiate between “highlighting an important truth” and “making mountains out of molehills”. You did the research, you know if a 0.002ms increase in storage network round-trip time is a big deal or not, and you should decide how to display it. Obviously, if you manage to find a clear and unquestionable way to display your results, so much the better.