Engaging and Beautiful Data Visualizations with ggplot2

Fundamentals & Workflows

Cédric Scherer // posit::conf // September 2023

{ggplot2}

The ggplot2 hex logo.


{ggplot2} is a system for declaratively creating graphics,
based on “The Grammar of Graphics” (Wilkinson 2005).

You provide the data, tell {ggplot2} how to map variables to aesthetics,
what graphical primitives to use, and it takes care of the details.

Advantages of {ggplot2}

  • consistent underlying “grammar of graphics” (Wilkinson 2005)
  • very flexible, layered plot specification
  • theme system for polishing plot appearance
  • lots of additional functionality thanks to extensions
  • active and helpful community

The Components of a ggplot


Component          Function              Explanation
Data               ggplot(data)          The raw data that you want to visualize (initializing a plot).
Aesthetics         aes()                 The mapping between variables and visual properties.
Geometries         geom_*()              The geometric shape of a layer representing the data.
A collection of basic graphs showing the versatility of ggplot2. All of them use only data, aesthetics, and layers with the defaults of ggplot2.

ggplot2 examples featured on ggplot2.tidyverse.org

The Components of a ggplot


Component          Function              Explanation
Data               ggplot(data)          The raw data that you want to visualize (initializing a plot).
Aesthetics         aes()                 The mapping between variables and visual properties.
Geometries         geom_*()              The geometric shape of a layer representing the data.
Statistics         stat_*()              The statistical transformation of a layer applied to the data.
Scales             scale_*()             The representation of mapped aesthetic attributes.
Coordinate System  coord_*()             The transformation to map data coordinates into the plot plane.
Facets             facet_*()             The arrangement of the data into a set of small multiples.
Visual Themes      theme() | theme_*()   The overall visual defaults of non-data elements of the graphic.
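
As a compact illustration of the table above, the following sketch touches each component once. It is not part of the original walk-through: it assumes the `mpg` data set that ships with {ggplot2}, and the choice of variables, palette, and axis limits is arbitrary.

```r
library(ggplot2)

# one call per component, using the built-in mpg data (an assumption for this sketch)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +     # data + aesthetics
  geom_point(aes(color = drv), alpha = .6) +    # geometry
  stat_smooth(method = "lm", formula = y ~ x,   # statistic
              color = "black") +
  scale_color_brewer(palette = "Dark2") +       # scale
  coord_cartesian(ylim = c(10, 45)) +           # coordinate system
  facet_wrap(vars(year)) +                      # facets
  theme_minimal()                               # visual theme

p
```

Only the first three components (data, aesthetics, and one layer) are needed to get a plot; the remaining ones fall back to defaults when omitted.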
Allison Horst's monster illustration of exploratory plotting with ggplot2.

Illustration by Allison Horst

Allison Horst's monster illustration of building a data masterpiece with ggplot2, featuring a little Picasso monster :)

Illustration by Allison Horst

The {ggplot2} Showcase


A multi-plot panel of various data visualizations created by the BBC teams.

Collection of Graphics from the BBC R Cookbook


Distribution of coffee bean ratings by the Coffee Quality Institute for countries with 25 or more reviews (up to 2018). Distributions are shown as dot plots and multiple interval stripes.

“Bill Dimensions of Brush-Tailed Penguins”

Area graph of the number of Netflix original series over time, annotated with images and labels.

“Netflix Content Explosion” by Tanya Shapiro

The raincloud chart showing the distributions of normalized speech rates (dark pink) and information rates (dark lime green) across language families.

My reinterpreted The Economist graphic

The raincloud chart showing the distributions of normalized speech rates (dark pink) and information rates (dark lime green) across language families.

“Not My Cup of Coffee”

Two circular hierarchical bar plots showing the carbon footprint 2018 for food consumption and CO2 per continent and country.

“Food Carbon Footprint Index 2018”

A circular tree showing the programming languages used in CRAN packages, with nodes grouped by languages and package names, sized by number of lines.

“Popular Programming Languages in CRAN Packages” by Torsten Sprenger

Streamgraphs showing the appearance of the most common X-Men characters (Wolverine, Magneto, Nightcrawler, Storm & Gambit) during the so-called Claremont Run. Chris Claremont is a famous American comic book writer who was in charge of the Uncanny X-Men comic book series from 1975–1991. During that time he introduced complex literary themes and strong female characters into superhero comics, turning the X-Men into one of Marvel's most popular series.

“Appearance of X-Men Characters”

A facet of barcodes showing location quotients from artists in the US by type and race.

“Artists in the US” by Lee Olney


A facet of shots and goals of the Bundesliga football clubs in the season 2019/20.

My Contribution to the SWDchallenge “Small Multiples”

A gridded map of Europe showing horizontal stacked bars of energy production per country over time (each bar represents share among one year from 2016 to 2018).

“European Energy Generation” by Jack Davison

A grid map using moon charts for all 297 electoral districts, showing the share of the winning party in second votes during the German election in 2021.

Moon Charts as a Tile Grid Map showing the 2nd Vote Results from the German Election 2021

A spatial map of cheetah movement and their hotspot behaviour in Namibia.

Our Winning Contribution to the BES MoveMap Contest

A spatial map of income and inequality shown with a bivariate color palette; alpine regions have a hillshading effect.

Bivariate Choropleth x Hillshade Map by Timo Grossenbacher

A set of small multiples using pixelated encodings of certain elements in Bob Ross' paintings.

Pixel Art by Georgios Karamanis

Two artworks by Thomas Pedersen, completely generated in R with ggplot2 (and pure magic).

Generative Art by Thomas Lin Pedersen

A Walk-Through Example

The Data Set

Bike sharing counts in London, UK, powered by TfL Open Data

  • covers the years 2015 and 2016
  • incl. weather data acquired from freemeteo.com
  • prepared by Hristo Mavrodiev for Kaggle
  • further modification by myself

The Data Set

library(readr)
library(ggplot2)

bikes <-
  read_csv(
    here::here("data", "london-bikes.csv"),
    col_types = "Dcfffilllddddc"
  )

The Data Set

bikes
# A tibble: 1,454 × 14
   date       day_night year  month season count is_workday is_weekend is_holiday  temp temp_feel humidity wind_speed weather_type    
   <date>     <chr>     <fct> <fct> <fct>  <int> <lgl>      <lgl>      <lgl>      <dbl>     <dbl>    <dbl>      <dbl> <chr>           
 1 2015-01-04 day       2015  1     3       6830 FALSE      TRUE       FALSE       2.17     -0.75     95.2      10.4  broken clouds   
 2 2015-01-04 night     2015  1     3       2404 FALSE      TRUE       FALSE       2.79      2.04     93.4       4.58 clear           
 3 2015-01-05 day       2015  1     3      14763 TRUE       FALSE      FALSE       8.96      7.71     81.1       8.67 broken clouds   
 4 2015-01-05 night     2015  1     3       5609 TRUE       FALSE      FALSE       7.12      5.71     79.5       9.04 cloudy          
 5 2015-01-06 day       2015  1     3      14501 TRUE       FALSE      FALSE       9         6.46     80.2      19.2  broken clouds   
 6 2015-01-06 night     2015  1     3       6112 TRUE       FALSE      FALSE       6.71      4.21     77.6      12.8  clear           
 7 2015-01-07 day       2015  1     3      16358 TRUE       FALSE      FALSE       8.17      5.08     75.2      21.2  scattered clouds
 8 2015-01-07 night     2015  1     3       4706 TRUE       FALSE      FALSE       6.68      3.86     81.3      18.1  clear           
 9 2015-01-08 day       2015  1     3       9971 TRUE       FALSE      FALSE       9.46      7.12     79.4      18.8  scattered clouds
10 2015-01-08 night     2015  1     3       5630 TRUE       FALSE      FALSE      10.0       8.46     79.2      22.2  clear           
# ℹ 1,444 more rows

The Data Set

Variable      Description                                                Class
date          Date encoded as `YYYY-MM-DD`                               date
day_night     `day` (6:00am–5:59pm) or `night` (6:00pm–5:59am)           character
year          `2015` or `2016`                                           factor
month         `1` (January) to `12` (December)                           factor
season        `0` (spring), `1` (summer), `2` (autumn), or `3` (winter)  factor
count         Sum of reported bikes rented                               integer
is_workday    `TRUE` being Monday to Friday and no bank holiday          logical
is_weekend    `TRUE` being Saturday or Sunday                            logical
is_holiday    `TRUE` being a bank holiday in the UK                      logical
temp          Average air temperature (°C)                               double
temp_feel     Average feels like temperature (°C)                        double
humidity      Average air humidity (%)                                   double
wind_speed    Average wind speed (km/h)                                  double
weather_type  Most common weather type                                   character
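
The season codes listed above are not self-explanatory, so it can help to translate them into names before plotting. The following is a minimal base R sketch, not taken from the original slides: `season_codes` is a made-up vector standing in for `bikes$season`, and `season_names` is a hypothetical lookup built from the variable description.

```r
# hypothetical example values standing in for bikes$season
season_codes <- factor(c("3", "0", "1", "2", "3"))

# lookup following the variable description: 0 = spring, ..., 3 = winter
season_names <- c(`0` = "Spring", `1` = "Summer", `2` = "Autumn", `3` = "Winter")

# translate codes to names and fix the level order to the calendar order
season_labelled <- factor(
  unname(season_names[as.character(season_codes)]),
  levels = unname(season_names)
)

table(season_labelled)
```

Matching by name rather than by position keeps the translation independent of the order in which the factor levels happen to appear in the data.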

A Default ggplot

# scatter plot of bikes$count versus bikes$temp_feel
ggplot(data = bikes) +              # initial call + data
  aes(x = temp_feel, y = count) +   # aesthetics
  geom_point()                      # geometric layer

A Default ggplot

# scatter plot of bikes$count versus bikes$temp_feel
ggplot(bikes, aes(x = temp_feel, y = count)) +
  geom_point()

Combine Layers

ggplot(bikes, aes(x = temp_feel, y = count)) +
  geom_point() + 
  # add a GAM smoothing
  stat_smooth() # also: geom_smooth()

Mapping Aesthetics

ggplot(bikes, aes(x = temp_feel, y = count, color = day_night)) + 
  geom_point() + 
  stat_smooth()

Mapping Aesthetics

ggplot(bikes, aes(x = temp_feel, y = count)) + 
  # color mapping only applied to points
  geom_point(aes(color = day_night)) + 
  # invisible grouping to create two trend lines
  stat_smooth(aes(group = day_night))

Setting Properties

ggplot(bikes, aes(x = temp_feel, y = count)) + 
  geom_point(
    aes(color = day_night), 
    # setting larger points with 50% opacity
    alpha = .5, size = 1.5
  ) + 
  stat_smooth(
    aes(group = day_night), 
    # use linear fitting + draw black smoothing lines
    method = "lm", color = "black"
  )

Split into Facets

ggplot(bikes, aes(x = temp_feel, y = count)) + 
  geom_point(
    aes(color = day_night), 
    alpha = .5, size = 1.5
  ) + 
  stat_smooth(
    method = "lm", color = "black"
  ) +
  # small multiples
  facet_wrap(facets = vars(day_night)) # also: ~ day_night

Split into Facets

ggplot(bikes, aes(x = temp_feel, y = count)) + 
  geom_point(
    aes(color = season), 
    alpha = .5, size = 1.5
  ) + 
  stat_smooth(
    method = "lm", color = "black"
  ) +
  # small multiples
  facet_grid(
    rows = vars(day_night), cols = vars(year) # also: day_night ~ year
  )

Free Facets Axes

ggplot(bikes, aes(x = temp_feel, y = count)) + 
  geom_point(
    aes(color = season), 
    alpha = .5, size = 1.5
  ) + 
  stat_smooth(
    method = "lm", color = "black"
  ) +
  facet_grid(
    day_night ~ year, 
    # free y axis range
    scales = "free_y", 
    # scale heights proportionally
    space = "free_y"
  )

Store ggplot

g1 <- 
  ggplot(bikes, aes(x = temp_feel, y = count)) + 
  geom_point(
    aes(color = season), 
    alpha = .5, size = 1.5
  ) + 
  stat_smooth(
    method = "lm", color = "black"
  ) +
  facet_grid(
    day_night ~ year, 
    scales = "free_y", 
    space = "free_y"
  )

Change the Axis Scaling

g1 +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"), 
    # use 5°C spacing
    breaks = -1:6*5  # also: seq(-5, 30, by = 5)
  )

Change the Axis Scaling

g2 <- g1 +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"), 
    # use 5°C spacing
    breaks = -1:6*5  # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(), 
    # use consistent spacing across rows
    breaks = 0:5*10000
  )

Use a Custom Color Palette

g2 +
  # use a custom color palette for season colors
  scale_color_manual(
    values = c("#6681FE", "#1EC98D", "#F7B01B", "#A26E7C")
  )

Use a Custom Color Palette

# use a named vector for explicit matching
colors <- c(
  `0` = "#1EC98D",
  `1` = "#F7B01B",
  `2` = "#A26E7C",
  `3` = "#6681FE"
)

g2 +
  scale_color_manual(
    values = colors
  )

Adjust Labels and Titles

# use a named vector for explicit matching
colors <- c(
  `0` = "#1EC98D",
  `1` = "#F7B01B",
  `2` = "#A26E7C",
  `3` = "#6681FE"
)

g2 +
  scale_color_manual(
    values = colors,
    # overwrite legend keys
    labels = c("Winter", "Spring", "Summer", "Autumn")
  )

Adjust Labels and Titles

g3 <- g2 +
  scale_color_manual(
    values = colors,
    labels = c("Winter", "Spring", "Summer", "Autumn")
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  )

Apply a Complete Theme

g3 +
  # add theme with a custom font + larger element sizes
  theme_light(
    base_size = 15, base_family = "Spline Sans"
  )

Apply a Complete Theme

g4 <- g3 +
  theme_light(base_size = 15, base_family = "Spline Sans") +
  # theme adjustments
  theme(
    plot.title.position = "plot", # left-align title 
    plot.caption.position = "plot", # right-align caption
    legend.position = "top", # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)), # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"), # monospaced font for axes
    axis.title.x = element_text( # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)), # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)), # larger, bold facet labels
    panel.grid.major.x = element_blank(), # no vertical major lines
    panel.grid.minor = element_blank(), # no minor grid lines
    panel.spacing.x = unit(20, "pt"), # increase white space between panels
    panel.spacing.y = unit(10, "pt"), # increase white space between panels
    plot.margin = margin(15, 15, 15, 15) # adjust white space around plot
  )

Adjust Legend

g4 +
  # adjust symbol size in legend
  guides(
    color = guide_legend(override.aes = list(size = 4))
  )

Adjust Legend

g4 +
  scale_color_manual(
    values = colors,
    labels = c("Winter", "Spring", "Summer", "Autumn"),
    # adjust symbol size in legend
    guide = guide_legend(override.aes = list(size = 4))
  )

Full Code

# create named color vector
colors <- c(
  `0` = "#1EC98D",
  `1` = "#F7B01B",
  `2` = "#A26E7C",
  `3` = "#6681FE"
)

# scatter plot of bikes$count versus bikes$temp_feel
ggplot(bikes, aes(x = temp_feel, y = count)) + 
  # add points
  geom_point(
    # color mapping only applied to points
    aes(color = season), 
    # setting larger points with 50% opacity
    alpha = .5, size = 1.5
  ) + 
  # add a smoothing
  stat_smooth(  # also: geom_smooth()
    # use linear fitting + draw black smoothing lines
    method = "lm", color = "black"
  ) +
  # small multiples
  facet_grid(
    day_night ~ year,  # also: vars(day_night), vars(year)
    # free y axis range
    scales = "free_y", 
    # scale heights proportionally 
    space = "free_y"
  ) +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"), 
    # use 5°C spacing
    breaks = -1:6*5  # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(), 
    # use consistent spacing across rows
    breaks = 0:5*10000
  ) +
  # colors
  scale_color_manual(
    # use a custom color palette
    values = colors,
    # overwrite legend keys
    labels = c("Winter", "Spring", "Summer", "Autumn"),
    # adjust symbol size in legend
    guide = guide_legend(override.aes = list(size = 4))
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  ) +
  # add theme with a custom font + larger element sizes
  theme_light(
    base_size = 15, base_family = "Spline Sans"
  ) +
  # theme adjustments
  theme(
    plot.title.position = "plot", # left-align title 
    plot.caption.position = "plot", # right-align caption
    legend.position = "top", # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)), # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"), # monospaced font for axes
    axis.title.x = element_text( # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)), # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)), # larger, bold facet labels
    panel.grid.major.x = element_blank(), # no vertical major lines
    panel.grid.minor = element_blank(), # no minor grid lines
    panel.spacing.x = unit(20, "pt"), # increase white space between panels
    panel.spacing.y = unit(10, "pt"), # increase white space between panels
    plot.margin = margin(15, 15, 15, 15) # adjust white space around plot
  )

Saving Plots

Save the Graphic


ggsave(filename = "my_plot.png", plot = g)
ggsave("my_plot.png")


ggsave("my_plot.png", width = 6, height = 5, dpi = 600)

Plot Resolution
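
For raster formats, the pixel dimensions of the saved file follow from the size in inches multiplied by the resolution. The sketch below works through that arithmetic for a call such as ggsave("my_plot.png", width = 6, height = 5, dpi = 600); `px_dims()` is a hypothetical helper written for this illustration and is not part of {ggplot2}.

```r
# pixel dimensions of a raster file = size in inches * dpi
# (px_dims() is a hypothetical helper, not a ggplot2 function)
px_dims <- function(width, height, dpi = 300, units = c("in", "cm")) {
  units <- match.arg(units)
  to_inch <- if (units == "cm") 1 / 2.54 else 1
  round(c(width = width, height = height) * to_inch * dpi)
}

px_dims(6, 5, dpi = 600)                              # 3600 x 3000 pixels
px_dims(6 * 2.54, 5 * 2.54, dpi = 600, units = "cm")  # same size given in cm
```

Doubling the dpi at a fixed width and height doubles the pixel count in each direction, while text and point sizes keep their physical size on the page.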

Save the Graphic


ggsave(filename = "my_plot.png", plot = g)
ggsave("my_plot.png")


ggsave("my_plot.png", width = 6, height = 5, dpi = 600)
ggsave("my_plot.png", width = 6*2.54, height = 5*2.54, units = "cm", dpi = 600)


ggsave("my_plot.png", device = ragg::agg_png)
ggsave("my_plot.pdf", device = cairo_pdf)


A comparison of vector and raster graphics.

Modified from canva.com

The {ragg} Package

provides drop-in replacements for the default raster graphic devices

  • faster
  • direct access to all system fonts
  • advanced text rendering
    • including support for right-to-left text, emojis, and font fallback
  • high quality anti-aliasing
  • high quality rotated text
  • supports 16-bit output
  • system independent rendering

The {ragg} Package


A comparison of different graphic devices in R comparing the rendering of right-to-left text (and mixing left-to-right and right-to-left text).

Source: tidyverse.org/blog/2021/02/modern-text-features

The {ragg} Package


A comparison of different graphic devices in R comparing the rendering of font ligatures.

Source: tidyverse.org/blog/2021/02/modern-text-features

The {ragg} Package


A comparison of different graphic devices in R comparing the rendering of emojis.

Source: tidyverse.org/blog/2021/02/modern-text-features

The {ragg} Package


A comparison of different graphic devices in R comparing the rendering missing glyphs, partly making use of fallback fonts.

Source: tidyverse.org/blog/2021/02/modern-text-features

The {ragg} Package

  • use {ragg} when saving ggplots by passing the agg device function: ggsave(device = ragg::agg_png) (used by default if installed)
  • use {ragg} in the RStudio Plots pane by setting the backend to AGG via Global Options > General > Graphics > Backend
  • use {ragg} when knitting R Markdown files by setting dev = "ragg_png" in the code chunk options
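
Instead of repeating the chunk option in every chunk, the device can also be set once for the whole document. A minimal sketch, assuming {knitr} is installed and the call is placed in a setup chunk near the top of the file ({ragg} needs to be installed by the time the document is rendered):

```r
# set the AGG PNG device as the default for all following chunks
knitr::opts_chunk$set(dev = "ragg_png")

# check which device is currently registered
knitr::opts_chunk$get("dev")
```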

Save the Graphic


ggsave(filename = "my_plot.png", plot = g)
ggsave("my_plot.png")


ggsave("my_plot.png", width = 6, height = 5, dpi = 600)
ggsave("my_plot.png", width = 6*2.54, height = 5*2.54, units = "cm", dpi = 600)


ggsave("my_plot.png", device = ragg::agg_png)
ggsave("my_plot.pdf", device = cairo_pdf)
ggsave("my_plot.svg")

How to Work with Aspect Ratios

  • don’t rely on the RStudio viewer pane!
  • once you have an “it’s getting close” prototype, settle on a plot size

  • Approach 1: save the file and inspect it—go back to your IDE—repeat
    • tedious and time-consuming…

  • Approach 2: use a qmd or rmd with inline output and chunk settings
    • set fig-width / fig.width and fig-height / fig.height
      per chunk or globally
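
For the global variant of Approach 2, one possibility is to set the knitr chunk defaults once in a setup chunk. This is a sketch under the assumption that the document is rendered with {knitr}; the width of 10 inches and the aspect ratio of 0.5 are arbitrary example values.

```r
# global figure defaults for all following chunks
knitr::opts_chunk$set(
  fig.width = 10,  # figure width in inches
  fig.asp = 0.5    # height / width ratio, i.e. 5 inches high
)

# inspect the current defaults
knitr::opts_chunk$get(c("fig.width", "fig.asp"))
```

Individual chunks can still override these defaults with their own fig.width and fig.height settings.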

Setting Plot Sizes in Quarto and R Markdown

A screenshot of an exemplary Qmd file with two chunks with different settings of fig-width and fig-height as YAML-styled options using the hashpipe. Unfortunately, these are not respected when the chunk output is printed inline.

Setting Plot Sizes in Quarto and R Markdown

A screenshot of an exemplary Qmd file with two chunks with different settings of fig.width and fig.height set as inline chunk options.

How to Work with Aspect Ratios

  • don’t rely on the RStudio viewer pane!
  • once you have an “it’s getting close” prototype, settle on a plot size

  • Approach 1: save the file and inspect it—go back to your IDE—repeat
    • tedious and time-consuming…

  • Approach 2: use a qmd or rmd with inline output and chunk settings
    • set fig-width / fig.width and fig-height / fig.height
      per chunk or globally

  • Approach 3: use our {camcorder} package
    • saves output from all ggplot() calls and displays it in the viewer pane

Setting Plot Sizes via {camcorder}


A screenshot of an exemplary R script with a plot automatically saved and displayed in the correct aspect ratio thanks to the camcorder package.

Setting Plot Sizes via {camcorder}

camcorder::gg_record(
  dir = here::here("temp"),  # path for plot files
  device = "png",            # device to use
  width = 10,                # figure width
  height = 5,                # figure height
  dpi = 600                  # plot resolution
)

g <- ggplot(bikes, aes(x = temp, y = count, color = day_night)) +
  geom_point(alpha = .3, size = 2) +
  scale_color_manual(values = c(day = "#FFA200", night = "#757BC7")) +
  theme_minimal(base_size = 14, base_family = "Asap SemiCondensed") +
  theme(panel.grid.minor = element_blank())

g

Setting Plot Sizes via {camcorder}

camcorder::gg_record(
  dir = here::here("temp"),  # path for plot files
  device = "png",            # device to use
  width = 10,                # figure width
  height = 5,                # figure height
  dpi = 600                  # plot resolution
)

g <- ggplot(bikes, aes(x = temp, y = count, color = day_night)) +
  geom_point(alpha = .3, size = 2) +
  scale_color_manual(values = c(day = "#FFA200", night = "#757BC7")) +
  theme_minimal(base_size = 14, base_family = "Asap SemiCondensed") +
  theme(panel.grid.minor = element_blank())

g

camcorder::gg_resize_film(width = 20) # update figure width

g

Like a Pro: Set Theme Globally

theme_set(theme_minimal(base_size = 14, base_family = "Asap SemiCondensed"))
theme_update(panel.grid.minor = element_blank())

Programming with ggplot2

Conditional Components

smooth <- TRUE

ggplot(bikes, aes(x = temp, y = humidity)) +
  { if(smooth) geom_smooth(color = "red") } +
  geom_point(alpha = .5)

Conditional Components

smooth <- FALSE

ggplot(bikes, aes(x = temp, y = humidity)) +
  { if(smooth) geom_smooth(color = "red") } +
  geom_point(alpha = .5)

Wrapper Functions for Plots

draw_scatter <- function(smooth = TRUE) {
  ggplot(bikes, aes(x = temp, y = humidity)) +
    { if(smooth) geom_smooth(color = "red") } +
    geom_point(alpha = .5)
}

Wrapper Functions for Plots

draw_scatter()

Wrapper Functions for Plots

draw_scatter(smooth = FALSE)

Components as Functions

geom_scatterfit <- function(pointsize = 1, pointalpha = 1, 
                            method = "lm", linecolor = "red", ...) {
  list(
    geom_point(size = pointsize, alpha = pointalpha, ...),
    geom_smooth(method = method, color = linecolor, ...)
  )
}

Components as Functions

ggplot(bikes,
       aes(x = humidity, y = count)) +
  geom_scatterfit()

Components as Functions

ggplot(bikes,
       aes(x = humidity, y = count)) +
  geom_scatterfit(
    color = "#28A87D", 
    linewidth = 3
  )

Components as Functions

ggplot(diamonds, 
       aes(x = carat, y = price)) +
  geom_scatterfit(
    pointsize = .5, 
    pointalpha = .1,
    method = "gam",
    linecolor = "#EFAC00"
  )

Components as Functions

scales_log <- function(sides = "xy") {
  list(
    if(stringr::str_detect(sides, "x")) {
      scale_x_log10(
        breaks = c(10^(1:100)), labels = scales::label_log()
      )
    },
    if(stringr::str_detect(sides, "y")) {
      scale_y_log10(
        breaks = c(10^(1:100)), labels = scales::label_log()
      )
    }
  )
}

Components as Functions

ggplot(diamonds, 
       aes(x = carat, y = price)) +
  geom_scatterfit(
    pointsize = .5, 
    pointalpha = .1,
    method = "gam",
    linecolor = "#EFAC00"
  ) +
  scales_log(sides = "y")

Iterative Graphics

trends_monthly <- function(grp = "January") {
  bikes |> 
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |> 
    dplyr::filter(month %in% grp) |> 
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}

Iterative Graphics

trends_monthly("July")

Iterative Graphics

trends_monthly <- function(grp = "January") {
  bikes |> 
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |> 
    dplyr::filter(month %in% grp) |> 
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    # keep axis ranges consistent
    scale_x_continuous(limits = range(bikes$temp)) +
    scale_y_continuous(limits = range(bikes$count)) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}

Iterative Graphics

trends_monthly("July")

Iterative Graphics

plots <- purrr::map(month.name[1:12], trends_monthly) ## also: ~ trends_monthly(.x)

Iterative Graphics

plots <- purrr::map(month.name[1:12], trends_monthly) ## also: ~ trends_monthly(.x)
plots[[9]]

Iterative Graphics

plots <- purrr::map(month.name[1:12], trends_monthly) ## also: ~ trends_monthly(.x)
patchwork::wrap_plots(plots)

Iterative Graphics

plot_density <- function(data, var, grp = "") {
  ggplot(data, aes(x = !!sym(var))) +
    geom_density(aes(fill = !!sym(grp)), position = "identity",
                 color = "grey30", alpha = .3) +
    coord_cartesian(expand = FALSE, clip = "off") +
    scale_y_continuous(labels = scales::label_number()) +
    scale_fill_brewer(palette = "Dark2", name = NULL) +
    theme(legend.position = "top")
}

Iterative Graphics

plot_density(
  bikes, "count"
)

Iterative Graphics

plots <- purrr::map(
  c("count", "temp", "humidity", "wind_speed"), 
  ~ plot_density(data = bikes, var = .x, grp = "day_night")
)
patchwork::wrap_plots(plots, nrow = 1)

Iterative Graphics

plots <- purrr::map(
  names(dplyr::select(midwest, where(is.numeric))),
  ~plot_density(data = midwest, var = .x)
)
patchwork::wrap_plots(plots)

Combine Plots

Combine Plots with {patchwork}

library(patchwork)

p1 <- plot_density(data = bikes, var = "count", grp = "day_night")

p2 <- plot_density(data = bikes, var = "humidity", grp = "day_night")

p3 <- ggplot(bikes, aes(x = humidity, y = count)) + geom_scatterfit(pointalpha = .3)

Combine Plots with {patchwork}

(p1 + p2) / p3

Combine Plots with {patchwork}

(p1 + p2) / p3 + plot_layout(heights = c(1, 2))

Combine Plots with {patchwork}

(p1 + p2) / p3 + plot_layout(heights = c(1, 2), guides = "collect")

Combine Plots with {patchwork}

(p1 + p2) / p3 + plot_layout(heights = c(1, 2), guides = "collect") +
  plot_annotation(theme = theme(legend.justification = "top"))

Combine Plots with {patchwork}

(p1 + p2) / p3 + plot_layout(heights = c(1, 2), guides = "collect") +
  plot_annotation(tag_levels = "A", tag_suffix = ".", theme = theme(legend.justification = "top"))

Exciting Extension Packages

Layers

Layers (continued)

Utilities

Themes

Color Palettes

Interactive Charts

* not using ggplot2

Recap

  • a basic ggplot is built by specifying three components:
    data, aesthetics and a layer (usually a geom_* or stat_*)
  • aesthetic mappings define how variables map to visual properties
  • the default appearance of all other components can be modified via scale_*, coord_*, facet_* and theme_* / theme
  • use the devices cairo (pdf) and agg (png, jpg, tiff) when saving plots
  • find a suitable plot size by setting figure chunk options in qmd/rmd files or with the help of the {camcorder} package
  • define conditional components, custom layers and functions to generate plots more efficiently and to iterate over multiple inputs
  • combine multiple plot outputs with {patchwork}

Exercises

Exercise 1

  • Discuss / investigate with your neighbor:
    • What are the differences between geom_line() and geom_path()?
    • Why can you use geom_smooth() and stat_smooth() interchangeably?
    • What are the three ways to remove a legend from a ggplot?

Exercise 2

  • Explore the TfL bike share data visually:
    • Create a time series of counts per day and night.
    • Draw box and whisker plots of average temperatures per month.
    • Visualize bike counts per weather type and period as bar chart.
  • Combine the three plots with {patchwork}.
  • Export the final graphic in a format of your choice.