Fundamentals & Workflows
{ggplot2} is a system for declaratively creating graphics, based on “The Grammar of Graphics” (Wilkinson 2005). You provide the data, tell {ggplot2} how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.
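A minimal sketch of these three ingredients, assuming the built-in `mtcars` data as a stand-in (it is not the data set used later in this session):

```r
library(ggplot2)

# data: built-in mtcars (stand-in data set, chosen for illustration only)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + # aesthetics: map variables to x and y
  geom_point()                              # geometry: draw one point per row

p # printing the object renders the plot
```

Everything not specified here (scales, coordinate system, theme) falls back to the defaults.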
ggplot2 examples featured on ggplot2.tidyverse.org
Component | Function | Explanation |
---|---|---|
Data | `ggplot(data)` | The raw data that you want to visualize (initializing a plot). |
Aesthetics | `aes()` | The mapping between variables and visual properties. |
Geometries | `geom_*()` | The geometric shape of a layer representing the data. |
Statistics | `stat_*()` | The statistical transformation of a layer applied to the data. |
Scales | `scale_*()` | The representation of mapped aesthetic attributes. |
Coordinate System | `coord_*()` | The transformation to map data coordinates into the plot plane. |
Facets | `facet_*()` | The arrangement of the data into a set of small multiples. |
Visual Themes | `theme()` / `theme_*()` | The overall visual defaults of non-data elements of the graphic. |
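To see how the components stack, here is a hedged sketch that touches each row of the table once. The data (`mtcars`) and all specific choices (palette, limits, facet variable) are illustrative assumptions, not part of the original material:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +                   # data + aesthetics
  geom_point(aes(color = factor(cyl))) +                      # geometry
  stat_smooth(method = "lm", formula = y ~ x) +               # statistic: linear fit per panel
  scale_color_brewer(palette = "Dark2", name = "Cylinders") + # scale
  coord_cartesian(ylim = c(10, 35)) +                         # coordinate system
  facet_wrap(vars(am)) +                                      # facets: small multiples
  theme_minimal() +                                           # complete theme
  theme(legend.position = "top")                              # theme adjustment
```

Each component is added with `+`, so any of them can be removed or swapped without touching the rest.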
Illustration by Allison Horst
Collection of Graphics from the BBC R Cookbook
“Netflix Content Explosion” by Tanya Shapiro
My reinterpreted The Economist graphic
“Popular Programming Languages in CRAN Packages” by Torsten Sprenger
“Artists in the US” by Lee Olney
My Contribution to the SWDchallenge “Small Multiples”
“European Energy Generation” by Jack Davison
Moon Charts as a Tile Grid Map showing the 2nd Vote Results from the German Election 2021
Our Winning Contribution to the BES MoveMap Contest
Bivariate Choropleth x Hillshade Map by Timo Gossenbacher
Pixel Art by Georgios Karamanis
Generative Art by Thomas Lin Pedersen
Bike sharing counts in London, UK, powered by TfL Open Data
# A tibble: 1,454 × 14
date day_night year month season count is_workday is_weekend is_holiday temp temp_feel humidity wind_speed weather_type
<date> <chr> <fct> <fct> <fct> <int> <lgl> <lgl> <lgl> <dbl> <dbl> <dbl> <dbl> <chr>
1 2015-01-04 day 2015 1 3 6830 FALSE TRUE FALSE 2.17 -0.75 95.2 10.4 broken clouds
2 2015-01-04 night 2015 1 3 2404 FALSE TRUE FALSE 2.79 2.04 93.4 4.58 clear
3 2015-01-05 day 2015 1 3 14763 TRUE FALSE FALSE 8.96 7.71 81.1 8.67 broken clouds
4 2015-01-05 night 2015 1 3 5609 TRUE FALSE FALSE 7.12 5.71 79.5 9.04 cloudy
5 2015-01-06 day 2015 1 3 14501 TRUE FALSE FALSE 9 6.46 80.2 19.2 broken clouds
6 2015-01-06 night 2015 1 3 6112 TRUE FALSE FALSE 6.71 4.21 77.6 12.8 clear
7 2015-01-07 day 2015 1 3 16358 TRUE FALSE FALSE 8.17 5.08 75.2 21.2 scattered clouds
8 2015-01-07 night 2015 1 3 4706 TRUE FALSE FALSE 6.68 3.86 81.3 18.1 clear
9 2015-01-08 day 2015 1 3 9971 TRUE FALSE FALSE 9.46 7.12 79.4 18.8 scattered clouds
10 2015-01-08 night 2015 1 3 5630 TRUE FALSE FALSE 10.0 8.46 79.2 22.2 clear
# ℹ 1,444 more rows
Variable | Description | Class |
---|---|---|
date | Date encoded as `YYYY-MM-DD` | date |
day_night | `day` (6:00am–5:59pm) or `night` (6:00pm–5:59am) | character |
year | `2015` or `2016` | factor |
month | `1` (January) to `12` (December) | factor |
season | `0` (spring), `1` (summer), `2` (autumn), or `3` (winter) | factor |
count | Sum of reported bikes rented | integer |
is_workday | `TRUE` being Monday to Friday and no bank holiday | logical |
is_weekend | `TRUE` being Saturday or Sunday | logical |
is_holiday | `TRUE` being a bank holiday in the UK | logical |
temp | Average air temperature (°C) | double |
temp_feel | Average feels like temperature (°C) | double |
humidity | Average air humidity (%) | double |
wind_speed | Average wind speed (km/h) | double |
weather_type | Most common weather type | character |
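Because `season` is stored as codes, readable labels have to be attached before they can appear in a legend. A minimal base-R sketch on a made-up four-row data frame (not the real data); the winter-first level order is an assumption chosen to match a series that starts in January:

```r
# toy data holding the four season codes (illustrative values only)
toy <- data.frame(season = c("3", "0", "1", "2"))

# codes as documented: 0 = spring, 1 = summer, 2 = autumn, 3 = winter
toy$season_label <- factor(
  toy$season,
  levels = c("3", "0", "1", "2"),
  labels = c("Winter", "Spring", "Summer", "Autumn")
)

levels(toy$season_label)
```

Pairing `levels` and `labels` explicitly keeps the code-to-name mapping visible in one place.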
g2 <- g1 +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"),
    # use 5°C spacing
    breaks = -1:6*5 # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(),
    # use consistent spacing across rows
    breaks = 0:5*10000
  )
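The `breaks` and `labels` arguments can be checked outside of a plot. The sketch below only uses base R and `scales::label_comma()`; the names `breaks_x`, `label_celsius` and `label_count` are hypothetical and introduced for illustration:

```r
# `:` binds tighter than `*`, so -1:6*5 is evaluated as (-1:6) * 5
breaks_x <- -1:6 * 5
all.equal(breaks_x, seq(-5, 30, by = 5))

# a labelling function receives the break values and returns strings
label_celsius <- function(x) paste0(x, "°C")
label_celsius(breaks_x[1:3])

# scales::label_comma() returns such a labelling function as well
label_count <- scales::label_comma()
label_count(0:2 * 10000)
```

Testing label functions on a plain vector like this is a quick way to catch formatting issues before rendering.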
g3 <- g2 +
  scale_color_manual(
    values = colors,
    labels = c("Winter", "Spring", "Summer", "Autumn")
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  )
g4 <- g3 +
  theme_light(base_size = 15, base_family = "Spline Sans") +
  # theme adjustments
  theme(
    plot.title.position = "plot", # left-align title
    plot.caption.position = "plot", # right-align caption
    legend.position = "top", # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)), # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"), # monospaced font for axes
    axis.title.x = element_text( # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)), # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)), # larger, bold facet labels
    panel.grid.major.x = element_blank(), # no vertical major lines
    panel.grid.minor = element_blank(), # no minor grid lines
    panel.spacing.x = unit(20, "pt"), # increase white space between panels (horizontal)
    panel.spacing.y = unit(10, "pt"), # increase white space between panels (vertical)
    plot.margin = margin(15, 15, 15, 15) # adjust white space around plot (top, right, bottom, left)
  )
library(ggplot2)

# create named color vector (names are the season codes)
colors <- c(
  `0` = "#1EC98D",
  `1` = "#F7B01B",
  `2` = "#A26E7C",
  `3` = "#6681FE"
)

# scatter plot of bikes$count versus bikes$temp_feel
ggplot(bikes, aes(x = temp_feel, y = count)) +
  # add points
  geom_point(
    # color mapping only applied to points
    aes(color = season),
    # setting larger points with 50% opacity
    alpha = .5, size = 1.5
  ) +
  # add a smoothing line
  stat_smooth( # also: geom_smooth()
    # use linear fitting + draw black smoothing lines
    method = "lm", color = "black"
  ) +
  # small multiples
  facet_grid(
    day_night ~ year, # also: vars(day_night), vars(year)
    # free y axis range
    scales = "free_y",
    # scale heights proportionally
    space = "free_y"
  ) +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"),
    # use 5°C spacing
    breaks = -1:6*5 # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(),
    # use consistent spacing across rows
    breaks = 0:5*10000
  ) +
  # colors
  scale_color_manual(
    # use a custom color palette
    values = colors,
    # overwrite legend keys
    labels = c("Winter", "Spring", "Summer", "Autumn"),
    # adjust symbol size in legend
    guide = guide_legend(override.aes = list(size = 4))
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  ) +
  # add theme with a custom font + larger element sizes
  theme_light(
    base_size = 15, base_family = "Spline Sans"
  ) +
  # theme adjustments
  theme(
    plot.title.position = "plot", # left-align title
    plot.caption.position = "plot", # right-align caption
    legend.position = "top", # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)), # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"), # monospaced font for axes
    axis.title.x = element_text( # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)), # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)), # larger, bold facet labels
    panel.grid.major.x = element_blank(), # no vertical major lines
    panel.grid.minor = element_blank(), # no minor grid lines
    panel.spacing.x = unit(20, "pt"), # increase white space between panels (horizontal)
    panel.spacing.y = unit(10, "pt"), # increase white space between panels (vertical)
    plot.margin = margin(15, 15, 15, 15) # adjust white space around plot (top, right, bottom, left)
  )
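The color vector above is named, which means `scale_color_manual()` matches colors to values by name rather than by position. A small sketch of that behaviour on made-up data (the data frame `toy` and the palette `pal` are hypothetical):

```r
library(ggplot2)

toy <- data.frame(x = 1:2, y = 1:2, grp = factor(c("a", "b")))

# names refer to the values of `grp`, so the order inside the vector does not matter
pal <- c(b = "red", a = "blue")

p <- ggplot(toy, aes(x, y, color = grp)) +
  geom_point() +
  scale_color_manual(values = pal)

# colors actually assigned to the two points (row 1 is "a", row 2 is "b")
layer_data(p)$colour
```

With an unnamed vector the colors would instead be assigned in the order of the factor levels.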
Modified from canva.com
{ragg} provides drop-in replacements for the default raster graphic devices
ggsave(device = agg_png) (used by default if installed)
dev="ragg_png" in the code chunk options
fig-width / fig.width and fig-height / fig.height
{camcorder} records ggplot() calls and displays them in the viewer pane

camcorder::gg_record(
  dir = here::here("temp"), # path for plot files
  device = "png", # device to use
  width = 10, # figure width
  height = 5, # figure height
  dpi = 600 # plot resolution
)

g <- ggplot(bikes, aes(x = temp, y = count, color = day_night)) +
  geom_point(alpha = .3, size = 2) +
  scale_color_manual(values = c(day = "#FFA200", night = "#757BC7")) +
  theme_minimal(base_size = 14, base_family = "Asap SemiCondensed") +
  theme(panel.grid.minor = element_blank())

g
camcorder::gg_resize_film(width = 20) # update figure width
g
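When a plot is written to disk rather than previewed, the same width, height and resolution can be passed to `ggsave()`. A minimal sketch, assuming the built-in `mtcars` data as a stand-in and a temporary output path (both are illustrative choices); it falls back to the default PNG device if {ragg} is not installed:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# use the ragg PNG device if available, otherwise ggsave()'s default "png" device
png_device <- if (requireNamespace("ragg", quietly = TRUE)) ragg::agg_png else "png"

out <- file.path(tempdir(), "scatter.png") # hypothetical output path
ggsave(out, plot = p, device = png_device, width = 10, height = 5, dpi = 600)

file.exists(out)
```

Width and height are interpreted in inches by default, so this mirrors the 10 × 5 setting used for recording.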
trends_monthly <- function(grp = "January") {
  bikes |>
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |>
    dplyr::filter(month %in% grp) |>
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}
trends_monthly <- function(grp = "January") {
  bikes |>
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |>
    dplyr::filter(month %in% grp) |>
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    # keep axis ranges consistent
    scale_x_continuous(limits = range(bikes$temp)) +
    scale_y_continuous(limits = range(bikes$count)) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}
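A plotting function like this pays off when it is called for several groups at once. The sketch below shows the pattern on made-up data: `toy` is a synthetic stand-in for the real data and `trend_by_month()` is a hypothetical, simplified helper, so the output says nothing about actual bike counts:

```r
library(ggplot2)

# synthetic stand-in data (random values, for illustration only)
set.seed(1)
toy <- data.frame(
  month = rep(c("January", "February", "March"), each = 20),
  temp  = runif(60, min = 0, max = 15),
  count = rpois(60, lambda = 5000)
)

# hypothetical helper: one scatter plot per month with shared axis ranges
trend_by_month <- function(grp) {
  ggplot(subset(toy, month %in% grp), aes(x = temp, y = count)) +
    geom_point(alpha = .5) +
    # limits come from the full data, so all plots are comparable
    scale_x_continuous(limits = range(toy$temp)) +
    scale_y_continuous(limits = range(toy$count)) +
    labs(title = grp, x = "Temperature", y = "Bike shares")
}

# build one plot per month and keep them in a named list
months_to_plot <- c("January", "February", "March")
plots <- lapply(months_to_plot, trend_by_month)
names(plots) <- months_to_plot
```

Individual plots can then be printed with, for example, `plots[["March"]]`.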
plot_density <- function(data, var, grp = "") {
  ggplot(data, aes(x = !!sym(var))) +
    geom_density(aes(fill = !!sym(grp)), position = "identity",
                 color = "grey30", alpha = .3) +
    coord_cartesian(expand = FALSE, clip = "off") +
    scale_y_continuous(labels = scales::label_number()) +
    scale_fill_brewer(palette = "Dark2", name = NULL) +
    theme(legend.position = "top")
}
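`!!sym(var)` turns a column name passed as a string into a symbol. A commonly used alternative is the `.data` pronoun. The sketch below uses a hypothetical helper `plot_density_data()` and the built-in `iris` data as a stand-in; unlike the function above, it requires a grouping column:

```r
library(ggplot2)

# hypothetical variant of a density helper that takes column names as strings
plot_density_data <- function(data, var, grp) {
  ggplot(data, aes(x = .data[[var]])) +
    geom_density(aes(fill = .data[[grp]]), color = "grey30", alpha = .3) +
    labs(x = var, fill = NULL)
}

p <- plot_density_data(iris, var = "Sepal.Length", grp = "Species")
```

The labels are set explicitly because the default axis title would otherwise show the expression instead of the column name.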
{geofacet}: tile grid maps
{ggalluvial}: alluvial plots
{ggalt}: dumbbell, horizon, and lollipop charts, splines, …
{ggbeeswarm}: beeswarm plots and variants
{ggbraid}: ribbons for alternating groups
{ggbump}: bump charts
{ggdensity}: improved density plots
{ggdist}: uncertainty visualizations
{ggforce}: several interesting layers such as parallel sets, pie charts, splines, and voronoi tiles (and more)
{ggpattern}: pattern fills for layers
{ggpointdensity}: density gradients for scatter plots
{ggraph}: networks, graphs & trees
{ggridges}: ridgeline plots
{ggsankey}: sankey diagrams
{ggsignif}: significance levels
{ggstar}: more point shapes
{ggstream}: stream graphs
{ggupset}: upset graphs
{treemapify}: treemaps

{cowplot}: combine ggplots
{ggannotate}: point-n-click annotations
{ggblend}: blend, compose, adjust layers
{ggfittext}: scale text according to space
{ggfx}: shaders and filters for layers
{ggh4x}: facets, positions, and more
{ggtext}: text rendering for theme elements + text layers
{lemon}: axis lines (and a few layers)
{patchwork}: combine ggplots
{scales}: control scales

{ggdark}
{ggsci} (also color scales)
{ggtech} (also color scales)
{ggthemes} (also color scales)
{ggthemr}
{hrbrthemes} (also color scales)
{tvthemes} (also color scales)

{ggiraph}
{plotly}
{echarts4r}*
{highcharter}*
{charter}*
{streamgraph}*
{tmap}*
{leaflet}*
{globe4r}*
{grapher}*
* not using ggplot2
data, aesthetics and a layer (usually a geom_* or stat_*)
scale_*, coord_*, facet_* and theme_* / theme
{camcorder} package
{patchwork}
geom_line() and geom_path()?
geom_smooth() and stat_smooth() interchangeably?
Cédric Scherer // posit::conf(2023)