Research on designing effective charts and diagrams

Source: Excel charts meet William Playfair : Excelcharts.com, Jorge Camoes, Dec. 2011

I am often asked for advice on the usability, design, and accessibility aspects of charts, so here is some research I did into the subject. Edward Tufte is a leading thinker on the use of data graphics, so a lot of the information here is based on his work.

Note: I have used the Australian spelling of colour rather than color, organisation rather than organization, etc.

Visual information design

“Meaningful quantitative information always involves relationships”

Source: Designing effective tables and graphs | Perceptual Edge, Steven Few, 2012 | 433 KB |PDF

“The human visual system has tremendous capacity to recognize patterns – but only if they are
presented in certain ways. And it has a tendency to misinterpret or completely miss information if it is presented in a non-intuitive way”

Source: Effective communication through visual design: tables and charts | Strategy Institute | Rebecca Carr, Mary Harrington, 2011 | 688 KB | PDF

What is a graphic

“Data graphics visually display measured quantities by means of the combined use of points, lines, a coordinate system, numbers, symbols, words, shading, and colour.”

Source: The Visual display of quantitative information, 2nd ed. / Edward R. Tufte, 2001

Benefit of a diagram

A plan, a sketch, drawing, outline, not necessarily representational, designed to demonstrate or explain something or clarify the relationship existing between the parts of the whole.

Source: 100 Diagrams that changed the world | Brainpickings | Maria Popova, Nov 2015

Visually displayed information tips

Visually displayed information should do the following:

  • Enforce visual comparisons
  • Show causality
  • Show multiple variables
  • Integrate text, graphics, and data in one display
  • Ensure the content’s quality, relevance, and integrity
  • Show things adjacent in space, not stacked in time
  • Don’t dequantify quantifiable data

Source: p. 425, Visual information design – ‘grand principles’ from Edward Tufte cited in About Face: essentials of interaction design, 4th ed., Cooper et al., 2014

Graphical displays should:

  • “Show the data
  • Induce the viewer to think about the substance rather than about methodology, graphic design, the technology… or something else…
  • Make large data sets coherent
  • Encourage the eye to compare different pieces of data
  • Reveal the data at several levels of detail, from broad overview to fine structure
  • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration
  • Be closely integrated with the statistical and verbal descriptions of a data set [helps accessibility too!]”

Source: Tufte, 2001

Five principles in the theory of data graphics

…[These should] yield a series of design options through cycles of graphical revision and editing.

  • Above all else show the data
  • Maximise the data-ink ratio
  • Erase non-data-ink
  • Erase redundant data-ink
  • Revise and edit

The points on data-ink remain relevant in the digital context, particularly in the age of responsive design. Just read them as pixels rather than ink.

Source: p. 105, Tufte 2001
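
As a rough illustration of the data-ink ratio read as pixels, the sketch below treats a chart as a list of drawn elements, each with a pixel count and a flag saying whether it carries data. Tufte defines the ratio as data-ink divided by the total ink used to print the graphic. The element names and pixel counts here are invented for the example, and how you would actually measure them is left open.

```python
def data_ink_ratio(elements):
    """Return data pixels divided by total pixels for a list of chart elements.

    Each element is a (name, pixels, carries_data) tuple. The names and
    pixel counts are hypothetical; measure them however suits your tooling.
    """
    total = sum(pixels for _, pixels, _ in elements)
    if total == 0:
        raise ValueError("chart has no ink")
    data = sum(pixels for _, pixels, carries_data in elements if carries_data)
    return data / total


# Hypothetical bar chart; the pixel counts are invented for illustration.
before = [
    ("bars", 6000, True),
    ("value labels", 500, True),
    ("heavy grid lines", 2500, False),
    ("3D shading", 1000, False),
]
# The same chart after erasing the non-data-ink.
after = [element for element in before if element[2]]

ratio_before = data_ink_ratio(before)  # 6500 / 10000
ratio_after = data_ink_ratio(after)    # 6500 / 6500
print(f"before: {ratio_before:.2f}, after: {ratio_after:.2f}")
```

A ratio of 1.0 is not the goal in itself. The number is only a prompt to ask what each non-data element is doing on the chart.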

Tables and Graphs

“Tables usually outperform graphics in reporting small data sets of 20 numbers or less. The special
power of graphics comes in the display of large data sets” –p. 56, Tufte 2001

A table works best when:

  • It is used to look up individual values
  • The values must be expressed precisely

A graph works best when:

  • The message is contained in the shape of the data (patterns, trends, exceptions to the norm)
  • Entire sets of values must be compared
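
The guidance above can be sketched as a small rule of thumb. The function name, its parameters, and the way the two criteria are combined are assumptions made for illustration. Only the figure of 20 numbers comes from the Tufte quote, and it is a guideline rather than a hard limit.

```python
def suggest_display(n_values, needs_exact_lookup=False, threshold=20):
    """Suggest 'table' or 'graph' for a data set (a heuristic, not a rule).

    threshold=20 follows the Tufte quote above. needs_exact_lookup stands
    for "readers will look up individual, precise values".
    """
    if needs_exact_lookup or n_values <= threshold:
        return "table"
    return "graph"


print(suggest_display(12))                             # small data set
print(suggest_display(500))                            # large data set, shape matters
print(suggest_display(500, needs_exact_lookup=True))   # large, but used for lookup
```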

Types of graphics

Data maps

You know, just a map with data on it.

Time-series

“Time-series displays are at their best for big data sets with real variability. Why waste the power of data graphics on simple linear changes, which can usually be better summarized in one or two numbers?”

p. 30, Tufte 2001

“The problem with time-series is that the simple passage of time is not a good explanatory variable:
descriptive chronology is not causal explanation”

p. 37, Tufte 2001

“Time-series plots can be moved toward causal explanation by smuggling additional variables into the graphic design”

p. 38, Tufte 2001

Before-after time series

A very popular type: the same series is shown before and after some event or change.

Don’t use pie charts

Pie charts are a bad way to present most information.

“Given their low data density and failure to order numbers along a visual dimension, pie charts should never be used.” p. 178, Tufte 2001
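
One common alternative to a pie chart is a bar chart sorted by value, which does order the numbers along a single visual dimension. The sketch below is a minimal plain-Python version that draws text bars so it needs no plotting library. The region names and values are made up, and the function name is only illustrative.

```python
def sorted_bar_chart(shares, width=40):
    """Return text lines with one bar per category, largest value first.

    Bar length is proportional to the value, with the largest value
    drawn at `width` characters.
    """
    largest = max(shares.values())
    label_width = max(len(name) for name in shares)
    lines = []
    for name, value in sorted(shares.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(value / largest * width)
        lines.append(f"{name:<{label_width}} {bar} {value}")
    return lines


shares = {"North": 23, "South": 41, "East": 17, "West": 19}  # invented values
chart = sorted_bar_chart(shares)
print("\n".join(chart))
```

Sorting makes the ranking readable at a glance, and the value printed at the end of each bar keeps the exact number available without a legend.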

For even more on different types of graphics

Google developer guide to charts, Sankey chart entry

  • The Google Developers guide to interactive charts covers a broad range of chart types, including the Sankey diagram, which is used a lot for conversion funnels in e-commerce analytics.

New Charts and Visualisation Types: Horizontal Bar, Area and Mosaics | Medium | Silk Stories, May 2015

  • The Silk Stories data visualisation tool covers chart types such as the horizontal bar, area, and mosaic.

Making charts with CSS | CSS Tricks, Robin Rendle, August 2015

  • Some good tips here if you want to deliver charts in CSS.

The friendly data graphic

Friendly | Unfriendly
Words are spelled out, mysterious and elaborate encoding avoided | Abbreviations abound, requiring the viewer to sort through text to decode abbreviations
Words run from left to right, the usual direction for reading occidental languages. | Words run vertically, particularly along the Y-axis; words run in several different directions
Little messages help explain data | Graphic is cryptic, requires repeated references to scattered text
Elaborately encoded shadings, cross hatching, and colours are avoided; instead, labels are placed on the graphic itself; no legend is required | Obscure codings require going back and forth between legend and graphic
Graphic attracts viewer, provokes curiosity | Graphic is repellent, filled with chartjunk
Colours, if used, are chosen so that the colour-deficient and colour-blind (5-10 percent of viewers) can make sense of the graphic (blue can be distinguished from other colours by most colour-deficient people) | Design insensitive to color-deficient viewers; red and green used for essential contrasts
Type is clear, precise, modest, lettering may be done by hand | Type is clotted, overbearing
Type is upper-and-lower-case, with serifs | Type is all capitals, sans serif

Table from p. 183, Tufte 2001

Appendix A – Further detail, selected notes on Tufte 2001

Graphics should be directly proportional to numbers

“The representation of numbers, as physically measured on the surface of the graphic itself, should be
directly proportional to the numerical quantities represented.”

Use clear labelling

“Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity.”

p. 56, Tufte 2001

Calculation of the ‘lie factor’ of a graphic

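[You can calculate the lie factor of a graphic by dividing the size of the effect shown in the graphic by the size of the effect in the data. A value close to 1 means the graphic represents the numbers faithfully.]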
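
A minimal sketch of the lie factor calculation follows, taking it as the relative change shown in the graphic divided by the relative change in the data. The function names are my own and the numbers are illustrative: a value that rises from 18.0 to 27.5, drawn as a line that grows from 0.6 to 5.3 inches.

```python
def effect_size(first, last):
    """Relative change from the first value to the last (first must be non-zero)."""
    return (last - first) / first


def lie_factor(graphic_first, graphic_last, data_first, data_last):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_last) / effect_size(data_first, data_last)


# Illustrative numbers: the data grows by about 53 percent,
# while the drawn line grows by about 783 percent.
lf = lie_factor(graphic_first=0.6, graphic_last=5.3, data_first=18.0, data_last=27.5)
print(round(lf, 1))  # 14.8
```

A result well above 1 means the graphic overstates the change, and a result well below 1 means it understates it.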

“Show data variation, not design variation. Design variation corrupts [the] display”

p. 61, Tufte 2001

Money over time

“The only way to think clearly about money over time is to make comparisons using inflation adjusted
units of money”

p. 63, Tufte 2001
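
As a sketch of what inflation-adjusted units mean in practice, the example below converts nominal amounts into the money of a chosen base year using a price index. The spending figures and index values are invented, and the function name is hypothetical. Real work would use a published index such as a consumer price index.

```python
def to_real(nominal, index_for_year, index_for_base):
    """Convert a nominal amount into base-year money using a price index."""
    return nominal * index_for_base / index_for_year


# Invented figures: nominal spending and a made-up price index per year.
spending = {2000: 100.0, 2010: 130.0, 2020: 160.0}
price_index = {2000: 80.0, 2010: 100.0, 2020: 125.0}
base_year = 2020

real = {
    year: to_real(amount, price_index[year], price_index[base_year])
    for year, amount in spending.items()
}
for year in sorted(real):
    print(year, spending[year], real[year])
```

With these made-up numbers, nominal spending rises by 60 percent between 2000 and 2020, but in 2020 money it rises from 156.25 to 160.0, which is a very different picture.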

“Probably the most frequently printed graphic, other than the daily weather map and stock market trend line, is the display of government spending and debt over the years.”

p. 65, Tufte 2001

“The number of information carrying (variable) dimensions depicted should not exceed the number of
dimensions in the data.”

p. 71, Tufte 2001
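
A common way to break this rule is to show a one-dimensional quantity with a picture whose width and height both scale with the value, so the drawn area grows with the square of the number. The sketch below uses invented values to show how much that exaggerates a doubling. It also shows the square-root scaling sometimes used when area has to encode the value. The function names are illustrative only.

```python
import math


def area_when_scaling_both_sides(value, base_value, base_side=1.0):
    """Area of a square whose side is scaled in proportion to the value."""
    side = base_side * value / base_value
    return side * side


def side_for_proportional_area(value, base_value, base_side=1.0):
    """Side length that makes the square's area proportional to the value."""
    return base_side * math.sqrt(value / base_value)


base_value, new_value = 10.0, 20.0  # invented data: the quantity doubles

naive_area = area_when_scaling_both_sides(new_value, base_value)
fixed_side = side_for_proportional_area(new_value, base_value)
fixed_area = fixed_side ** 2

print(naive_area)            # area is 4 times the base area for a 2x change
print(round(fixed_area, 6))  # area is 2 times the base area
```

A bar of fixed width avoids the problem altogether, because only its length varies with the data.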

“Context is essential for graphical integrity. To be truthful and revealing, data graphics must bear on the question at the heart of quantitative thinking: ‘Compared to what?’”

p. 74, Tufte 2001

“The lies [in mainstream media graphics] are systemic and quite predictable, nearly always exaggerating the rate of recent change.”

“It is the special character of numbers that they have a magnitude as well as an order; numbers measure quantity”

p. 76, Tufte 2001

Relational graphics

“Such a design links two or more variables but is not a time-series or a map. Relational graphics are
essential to competent statistical analysis since they confront statements about cause and effect with
evidence, showing how one variable affects another.”

p. 82, Tufte 2001

“Relational graphics are well suited to a role as an ‘explanatory graphic’”

p. 84, Tufte 2001

Capabilities required to produce good data graphics

“The conditions under which many data graphics are produced – the lack of substantive and quantitative skills of the illustrators, dislike of quantitative evidence, and contempt for the intelligence of the audience – guarantee graphic mediocrity. These conditions engender graphics that (1) lie; (2) employ only the simplest designs, often unstandardised time-series based on a small handful of data points; and (3) miss the real news actually in the data.”

“Graphical competence demands three quite different skills: the substantive, statistical, and artistic.”

Usually the focus is on the artistic. [However] “…Substantive and quantitative expertise must also participate in the design of data graphics, at least if statistical integrity and graphical sophistication are to be achieved.”

p. 87, Tufte 2001

“Essentially statistical graphics are instruments to help people reason about quantitative information”

p. 91, Tufte 2001

“Every bit of ink on a graphic requires a reason… Erase non-data-ink, within reason”

p. 96, Tufte 2001

Chartjunk defined:

Chartjunk has three main types:

  • “unintended optical art
  • The dreaded grid
  • The self promoting graphical duck”

p. 107, Tufte 2001

“…Graphics are almost always going to improve as they go through editing, revision, and testing against
different design options.”

“…Graphics should be as intelligent and sophisticated as the accompanying text.”

p. 136, Tufte 2001