Data Representations
23 Jan 2012 - Dr Anton Gerdelan

Why Visualise?

Visualisation Options

How to Select Visualisations

Cleveland's hierarchy

How to Select Visualisations

types of data and graphs


chart types

safe options

  • line charts
  • bar chart
  • scatter plot
  • keep-it-simple
  • reduce clutter
  • annotate
  • split into two visualisations

trends and continuous quantitative data

torques at gears
source: Willys Jeep manual, ~1940

visualising categories - qualitative data

error measurements and uncertainty

error cartoon
  • scientific measurements are repeated n times
  • we must represent uncertainty visually
  • use standard error or standard deviation for each point or bar
  • remember: sample standard deviation s = sqrt( Σ (x_i − x̄)² / (n − 1) )
  • and standard error is std dev divided by square root of n: SE = s / sqrt(n)
  • ∴ bigger n → smaller standard error
  • if an error bar is large: collect more data
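
The two formulas above can be tried out with a minimal sketch like the one below. It uses only Python's standard library; the function name standard_error and the measurement values are made up for illustration and are not from the lecture.

```python
import math
import statistics

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    # statistics.stdev is the *sample* standard deviation (divides by n - 1).
    return statistics.stdev(samples) / math.sqrt(n)

# Hypothetical repeated measurements of the same quantity.
few = [9.8, 10.4, 10.1, 9.6]
many = few * 4  # same values repeated: four times as many measurements

for label, data in (("n=4", few), ("n=16", many)):
    print(label,
          "mean:", statistics.mean(data),
          "SD:", statistics.stdev(data),
          "SE:", standard_error(data))
```

The mean is unchanged between the two runs, while the standard error shrinks as n grows, which is why a large error bar is a hint to collect more data.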

standard error and standard deviation

  • we can plot error bars with SE or SD, but what is the difference?
  • normally bars are ±SD or ±SE but why might we double the size?
  • why is this a valuable visualisation tool?
  • we represent this visually on scatter, line, and bar charts with y-axis and sometimes x-axis error bars.
  • what text must go in the caption for the error bars to make sense to the reader?
normal distribution with std devs
fig: normal distribution with standard deviations

standard error distribution
fig: normal distribution with standard error


half way...

interpreting error bars

  • SE bars → roughly 68% confidence that the true mean lies inside the error bars
  • 2× SE bars are approximately the 95% confidence interval (2× SD bars instead span roughly 95% of the individual measurements)
  • with SE bars two points are significantly different if there is at least 1/2 bar gap between the error bars.
  • 95% confidence interval bars are longer. Less than 1/4 bar of overlap corresponds to P < 0.05
  • always explain the statistics used in the text or caption
    • n
    • whether standard deviation or standard error is used
    • discuss any unusual points or trends
sig diff using std error
any closer and not significantly different
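
The SE-bar rule of thumb above could be checked in code roughly as follows. This is a sketch under assumptions, not a substitute for a proper significance test: it reads "1/2 bar" as half of a whole SE bar (a whole bar being 2 × SE long, averaged over the two points), and the function names and data sets are hypothetical.

```python
import math
import statistics

def mean_and_se(samples):
    """Return (mean, standard error of the mean) for one data point."""
    n = len(samples)
    return statistics.mean(samples), statistics.stdev(samples) / math.sqrt(n)

def se_bar_gap(a, b):
    """Gap between the mean ± SE bars of a and b (negative when they overlap)."""
    (mean_a, se_a), (mean_b, se_b) = mean_and_se(a), mean_and_se(b)
    return abs(mean_a - mean_b) - (se_a + se_b)

def passes_half_bar_rule(a, b):
    """Rule of thumb: the gap is at least half of a whole SE bar.

    Assumption: a whole bar is 2 x SE long and the two bars are averaged.
    """
    _, se_a = mean_and_se(a)
    _, se_b = mean_and_se(b)
    half_bar = (se_a + se_b) / 2  # half of the average whole-bar length
    return se_bar_gap(a, b) >= half_bar

# Hypothetical data: n = 10 repeated measurements per condition.
control = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 10.0, 9.9]
treated = [x + 0.5 for x in control]  # clearly shifted
close = [x + 0.1 for x in control]    # barely shifted, bars overlap

print("treated vs control:", passes_half_bar_rule(control, treated))
print("close vs control:", passes_half_bar_rule(control, close))
```

Whatever rule is used, the caption should still state n and whether the bars show SD, SE, or a confidence interval.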

maps and "snail trails"

slam motion chart | reynolds' corridors | game lines
Fig (left): a map built by a robot using the SLAM algorithm. Locations and facings at intervals are shown in red (Oxford Mobile Robotics Research Group). Fig (centre): 3 simulation runs from Craig Reynolds, "Evolution of Corridor Following Behavior in a Noisy World", 1994. Fig (right): a game character's motion path.

heat maps

heat map traffic noise

heat maps

3d heat peaks | robot heat map | game heat map
Fig (left): path-finding with A* algorithm. Fig (centre): regions of undesirability around opponent soccer robot. Fig (right): 3D game activity
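
One simple way to build the data behind a heat map like these is to count how often logged positions fall in each cell of a grid. The sketch below assumes exactly that approach; the function name build_heat_map, the grid size, and the positions are hypothetical, and the figures above may have been produced differently.

```python
def build_heat_map(positions, cols, rows, cell_size):
    """Count how many recorded (x, y) positions fall in each grid cell."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in positions:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < cols and 0 <= row < rows:  # ignore positions off the map
            grid[row][col] += 1
    return grid

# Hypothetical logged positions of a game character on a 4 x 3 map.
trail = [(0.5, 0.5), (1.5, 0.5), (1.6, 0.4), (1.7, 1.2), (3.9, 2.9)]
grid = build_heat_map(trail, cols=4, rows=3, cell_size=1.0)

# Print the counts as rows of text; a plotting toolkit would map them to colours.
for row in grid:
    print(" ".join(str(count) for count in row))
```

The counts can then be mapped to a colour scale (2D heat map) or to heights (3D peaks).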

toolkit requirements

evaluate these charts

philips' bad chart

evaluate these charts

how to

evaluate these charts

figs: bad chart examples (explopie2, explopie5, explopie4, explopie3)

a good data representation lets you answer yes to

  1. can the viewer determine the difference, in units, between data points in your visualisation?
  2. have you provided enough information so that another scientist can visualise their own data set and compare results with you?


further reading