December 01, 2010

Data Visualization Fail # 1

The following chart, from Transparency International, shows the Corruption Perceptions Index for 2010.

The blue circles highlight regions where the color difference between neighboring countries is smallest. I don't think very minor color gradients help much in understanding the spread. At first glance, the map splits the world into "American" (along with Oz and northern EU) and "non-American" countries, because yellow and red are the only prominent colors. The regions circled in blue do vary in perception index, but the shades sit so close together on the gradient that the eye fails to catch the variation. The best example is probably South Korea (5.4) and Japan (7.8). These two countries fall in two different slabs, and there is roughly a 30% difference in value between them, yet on a cursory glance we fail to notice it. Similarly, countries within Africa, Europe and Asia cluster together; if a small country stands out from its group, this viz fails to show it.
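To make the South Korea and Japan example concrete, here is a minimal Python sketch of the arithmetic. The two scores come from the chart; the `slab` helper and the unit-width slabs (5-6, 7-8 and so on) are my assumptions based on the legend, not something taken from the chart's own code.

```python
import math

# Scores as read from the 2010 chart.
scores = {"South Korea": 5.4, "Japan": 7.8}

def slab(score, width=1.0):
    """Return the (low, high) legend slab for a score.

    Hypothetical helper: assumes the legend uses equal slabs of `width`.
    """
    low = math.floor(score / width) * width
    return (low, low + width)

# Absolute gap between the two countries.
gap = scores["Japan"] - scores["South Korea"]

# Gap relative to the higher score (Japan).
relative_gap = gap / scores["Japan"]

for country, score in scores.items():
    print(country, score, slab(score))
print(f"relative gap: {relative_gap:.0%}")
```

Under these assumptions the two countries land two slabs apart, with a gap of about 31% relative to Japan's score, which is exactly the kind of difference the colors should make obvious.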

Remember that color is never seen alone; it is always perceived relative to what surrounds it. An effective use of color groups related items and commands attention in proportion to importance. Color is the most neglected and also the most abused factor in any chart. Color choices also reveal the intention of a visualization.

This is a typical case of "chartjunk due to colors". One might argue that one can always zoom in to see the difference, but let me show you what happens when I zoom in on Eastern EU.

But the viz also scores marks on some positive aspects:
- An excellent legend (though I would have preferred the words "Very Clean" and "Highly Corrupt" to be placed closer to the values).
- It displays the country name and index value when you hover over a country. (Showing the rank here as well would have been the icing on the cake.)
- A table below the chart ranks the countries, and this table can be sorted and searched.
- One quibble: multiplying the index by 10 would have made it easier to read (10-20, 20-30 is easier to read than 1-2, 2-3).
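
As a rough illustration of the last point, here is a small Python sketch that relabels the legend on a 0-100 scale. The unit-width slabs from 0 to 10 are an assumption on my part, and `rescale` is a hypothetical helper, not part of the chart.

```python
def rescale(score, factor=10):
    """Hypothetical helper: move a 0-10 index value onto a 0-100 scale."""
    return round(score * factor)

# Assumed legend: ten unit-width slabs covering 0 to 10.
original_labels = [f"{low}-{low + 1}" for low in range(0, 10)]
rescaled_labels = [f"{low * 10}-{(low + 1) * 10}" for low in range(0, 10)]

print(original_labels)
print(rescaled_labels)
print(rescale(5.4), rescale(7.8))
```

With this change South Korea reads as 54 and Japan as 78, and the legend slabs become 50-60 and 70-80.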
