Designing Data Visualizations
to Successfully Tell a Story

Foundations of Data Visualization

Cédric Scherer // posit::conf // September 2023

Data Visualization

is any graphical representation
of information and data.

Data Visualization

helps to amplify cognition, gain insights, discover, explain, and make decisions.

Data Visualization

converts information into visual forms
as quantifiable features.

Data Visualization

is part art and part science.



Source: eazybi


Source: Ranganathan et al. 2014

Source: datameer.com


Source: “Perpetual Plastic” by Liina Klauss, Skye Morét & Moritz Stefaner

Source: “Patchwork Kingdoms” by Nadieh Bremer


“Carte figurative des pertes successives en hommes de l’Armée Française dans la campagne de Russie 1812–1813” by Charles Joseph Minard (1869)

“Carte figurative des pertes successives en hommes de l’Armée qu’Annibal conduisit d’Espagne en Italie en traversant les Gaules (selon Polybe)” (top) and “Carte figurative des pertes successives en hommes de l’Armée Française dans la campagne de Russie 1812–1813” (bottom) by Charles Joseph Minard (1869)

  • shows the force levels of the armies of Hannibal (218 BC) and Napoleon (1812-1813), respectively
  • some data visualization practitioners call it (one of) the best statistical drawings ever created


“Carte figurative et approximative des quantités de coton brut importées en Europe en 1858, en 1864 et en 1865” by Charles Joseph Minard (1866)

“Carte figurative et approximative des poids des bestiaux venus à Paris sur les chemins de fer en 1862” by Charles Joseph Minard (1864)

“Tableau figuratif du mouvement commercial du Canal du Centre en 1844” by Charles Joseph Minard (1845)

Exercise


If the year is a circle, where are March and December in your mind?


  • Imagine you had to create a polar representation of the months.
  • Draw a circle and indicate the position of March and December.
  • Use an arrow to illustrate the direction of time.
  • Compare the results with your neighbors.


Wheel diagram of 76,922 placements of the months of December and March on the circumference of an empty circle.
Graphics by Henrik Lied at NRKbeta. Laeng & Hofseth, Front Psychol. 2019


Proportion of respondents choosing opposite direction of time on the year’s wheel.
Graphics by Vidar Kvien, NRK. Laeng & Hofseth, Front Psychol. 2019


“Diagram of the causes and mortality in the army in the East” (a so-called coxcomb diagram) by Florence Nightingale (1858)


“Relative mortality of the army at home and of the English male population at corresponding ages” by Florence Nightingale (1858)

Visualize Your Data

… make both calculations and graphs.
Both sorts of output should be studied;
each will contribute to understanding.


F. J. Anscombe (1973)


Anscombe’s Quartet


Source: Matejka & Fitzmaurice (2017)
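Anscombe's point can be checked numerically. The following is a minimal sketch in Python (the slides themselves contain no code, so the language and the helper names `pearson_r` and `summarize` are assumptions made for illustration). It computes the usual summary statistics for the four datasets published in Anscombe (1973).

```python
from statistics import mean, variance

# The four datasets from Anscombe (1973); sets I-III share the same x values.
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient (hypothetical helper, written out by hand)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Collect the summary statistics that make the four sets look alike."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four sets share a mean of 9 for x, a mean of about 7.5 for y, and a correlation of about 0.82. Only a plot reveals that they follow entirely different patterns.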

Visualize Your Data!



“When Dmitry Kobak and Sergey Shpilkin […] analysed the results, they found that an unusually high number of turnout and vote-share results were multiples of five (eg, 50%, 55%, 60%), a tell-tale sign of manipulation.”
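The kind of check described in the quote can be sketched in a few lines of Python. This is a simplified illustration, not the method Kobak and Shpilkin actually used. The function name `share_at_multiples_of_five`, the `tolerance` parameter, and the input values are all assumptions made up for this example.

```python
def share_at_multiples_of_five(percentages, tolerance=0.05):
    """Return the fraction of values lying within `tolerance` of a multiple of 5."""
    hits = 0
    for p in percentages:
        nearest = 5 * round(p / 5)  # closest multiple of five
        if abs(p - nearest) <= tolerance:
            hits += 1
    return hits / len(percentages)


# Made-up illustrative values, NOT real election data.
toy_turnout = [50.0, 55.0, 61.3, 60.0, 47.8, 65.0, 52.4, 70.0]
share = share_at_multiples_of_five(toy_turnout)
print(f"Share at multiples of five: {share:.3f}")
```

If results were spread evenly, only about 2 × tolerance / 5 of them (2% with the default tolerance) would land this close to a multiple of five. A much larger share is the kind of spike that shows up immediately in a histogram.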

A good data visualization can mean the difference between success and failure.


  • Communicating complex findings and phenomena
  • Raising money for an organization, event or department
  • Presenting to a board or at a conference
  • Helping businesses and institutions to make informed decisions
  • Providing guidance for improvement
  • Getting your point across!





Good vs Bad

What Makes a Good Data Visualization?


  → Integrity (information)

  → Story (interestingness)

  → Goal (usefulness)

  → Visual Form (beauty)

Data Integrity

and Potential Pitfalls…

Data Pitfalls

  • epistemic errors — how we think about data
  • technical traps — how we process data
  • mathematical miscues — how we calculate data
  • statistical slipups — how we compare data
  • analytical aberrations — how we analyze data
  • graphical gaffes — how we visualize data
  • design dangers — how we dress up data

Integrity of Information


 — data quality:
  → guesstimation, precision, and failures
  → miscalculations and errors
  → incomplete data and missing values
  → summaries and aggregations


 — only a subset:
  → not crime but reported crime*
  → historical or present state


* or rats, UFO sightings, …

Our data is never a perfect
reflection of the real world.



The best use of data is to
teach us what isn’t true.




Don’t formulate a singular statement:

“The swan is white.”



Confront yourself with a falsifiable universal statement:

“All swans are white.”

Context

Typology of Information Graphics


Is the information conceptual or measurable?

 → Type of information: depict information schematically <> convert information into visual forms


Is the aim to explore or to explain the information?

 → Purpose of the graphic: facilitate discovery <> communicate information

Source: Koponen & Hildén, “Data Visualization Handbook” (2020), page 25

Visualizations can be designed and experienced in various ways, by people of various backgrounds, and in various circumstances.

That’s why reflecting on the purpose of a visualization is paramount before we design it—or before we critique it.


Alberto Cairo


“Vertices of Visualization” by Alberto Cairo, personal communication (modified version)



Audience (who)

  • To whom are you communicating?
  • What do they already know?
  • What is your position and relationship?

Content (what)

  • What do you want them to know or do?
  • How will you communicate with them?
  • What tone do you want your communication to set?




Scheme by Andy Kirk (modified)





Evidence (how)

  • What data is available to make your point?

Context: Prepare Yourself


  • What is the one key message they should take home?
  • What background information is essential? What’s irrelevant?
  • What are potential biases of (some of) the audience?
  • What factors could weaken your case? Can we address them proactively?

Exercise

Take a closer look at the following three visualizations.

  • Address the following questions:
    • What is the main message you learn from the graphic?
    • What is the purpose of the visualization?
    • Who is the audience?
  • Rate the graphics according to the four levels:
    information, story, goal, and visual form.
  • Collect three things you notice, whether positive or negative.
    • How could you fix the details you dislike?
  • Team up with your neighbor(s) and discuss your findings.

Source: “Yearly Fluctuations in Area of Arctic Covered by Ice” by Derek Watkins (New York Times)

Source: Dr. Robert Rohde (Tweet)