Fundamentals & Workflows
{ggplot2} is a system for declaratively creating graphics, based on “The Grammar of Graphics” (Wilkinson 2005). You provide the data, tell {ggplot2} how to map variables to aesthetics and which graphical primitives to use, and it takes care of the details.
Component | Function | Explanation |
---|---|---|
Data | `ggplot(data)` | The raw data that you want to visualize (initializing a plot). |
Aesthetics | `aes()` | The mapping between variables and visual properties. |
Geometries | `geom_*()` | The geometric shape of a layer representing the data. |
Statistics | `stat_*()` | The statistical transformation of a layer applied to the data. |
Scales | `scale_*()` | The representation of mapped aesthetic attributes. |
Coordinate System | `coord_*()` | The transformation to map data coordinates into the plot plane. |
Facets | `facet_*()` | The arrangement of the data into a set of small multiples. |
Visual Themes | `theme()` / `theme_*()` | The overall visual defaults of non-data elements of the graphic. |
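As a rough illustration of the table above, the following sketch uses one call per component. It assumes the `mpg` data set that ships with {ggplot2} as a stand-in for your own data; the chosen variables, palette, and limits are arbitrary and only serve the example.

```r
library(ggplot2)

# Sketch only: `mpg` stands in for your own data.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +                       # data + aesthetics
  geom_point(aes(color = drv), alpha = .5) +                      # geometry
  stat_smooth(method = "lm", formula = y ~ x, color = "black") +  # statistic
  scale_color_brewer(palette = "Dark2") +                         # scale
  coord_cartesian(ylim = c(10, 45)) +                             # coordinate system
  facet_wrap(vars(year)) +                                        # facets
  theme_light(base_size = 14) +                                   # complete theme
  theme(legend.position = "top")                                  # theme adjustment

p
```

Only the first two lines are strictly needed to get a plot; every other component falls back to a default when it is left out.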
Bike sharing counts in London, UK, powered by TfL Open Data
# A tibble: 1,454 × 14
date day_night year month season count is_workday is_weekend is_holiday temp temp_feel humidity wind_speed weather_type
<date> <chr> <fct> <fct> <fct> <int> <lgl> <lgl> <lgl> <dbl> <dbl> <dbl> <dbl> <chr>
1 2015-01-04 day 2015 1 3 6830 FALSE TRUE FALSE 2.17 -0.75 95.2 10.4 broken clouds
2 2015-01-04 night 2015 1 3 2404 FALSE TRUE FALSE 2.79 2.04 93.4 4.58 clear
3 2015-01-05 day 2015 1 3 14763 TRUE FALSE FALSE 8.96 7.71 81.1 8.67 broken clouds
4 2015-01-05 night 2015 1 3 5609 TRUE FALSE FALSE 7.12 5.71 79.5 9.04 cloudy
5 2015-01-06 day 2015 1 3 14501 TRUE FALSE FALSE 9 6.46 80.2 19.2 broken clouds
6 2015-01-06 night 2015 1 3 6112 TRUE FALSE FALSE 6.71 4.21 77.6 12.8 clear
7 2015-01-07 day 2015 1 3 16358 TRUE FALSE FALSE 8.17 5.08 75.2 21.2 scattered clouds
8 2015-01-07 night 2015 1 3 4706 TRUE FALSE FALSE 6.68 3.86 81.3 18.1 clear
9 2015-01-08 day 2015 1 3 9971 TRUE FALSE FALSE 9.46 7.12 79.4 18.8 scattered clouds
10 2015-01-08 night 2015 1 3 5630 TRUE FALSE FALSE 10.0 8.46 79.2 22.2 clear
# ℹ 1,444 more rows
Variable | Description | Class |
---|---|---|
date | Date encoded as `YYYY-MM-DD` | date |
day_night | `day` (6:00am–5:59pm) or `night` (6:00pm–5:59am) | character |
year | `2015` or `2016` | factor |
month | `1` (January) to `12` (December) | factor |
season | `0` (spring), `1` (summer), `2` (autumn), or `3` (winter) | factor |
count | Total number of reported bike rentals | integer |
is_workday | `TRUE` for Monday to Friday, excluding bank holidays | logical |
is_weekend | `TRUE` for Saturday or Sunday | logical |
is_holiday | `TRUE` for bank holidays in the UK | logical |
temp | Average air temperature (°C) | double |
temp_feel | Average feels-like temperature (°C) | double |
humidity | Average air humidity (%) | double |
wind_speed | Average wind speed (km/h) | double |
weather_type | Most common weather type | character |
g2 <- g1 +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"),
    # use 5°C spacing
    breaks = -1:6*5  # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(),
    # use consistent spacing across rows
    breaks = 0:5*10000
  )
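The compact `breaks` expressions rely on R's operator precedence. The short sketch below, using only base R, spells out what they evaluate to; the object names `temp_breaks` and `count_breaks` are made up for this illustration.

```r
# `:` binds tighter than `*`, so -1:6*5 is read as (-1:6) * 5
temp_breaks  <- -1:6*5
count_breaks <- 0:5*10000

temp_breaks
count_breaks

# the seq() spelling from the code comment yields the same values
all.equal(temp_breaks, seq(-5, 30, by = 5))

# the label function pastes the unit onto each break
paste0(temp_breaks, "°C")
```

Writing `seq(-5, 30, by = 5)` is longer but arguably easier to read for anyone unsure about the precedence rules.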
g3 <- g2 +
  scale_color_manual(
    values = colors,
    labels = c("Winter", "Spring", "Summer", "Autumn")
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  )
g4 <- g3 +
  theme_light(base_size = 15, base_family = "Spline Sans") +
  # theme adjustments
  theme(
    plot.title.position = "plot",  # left-align title
    plot.caption.position = "plot",  # right-align caption
    legend.position = "top",  # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)),  # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"),  # monospaced font for axes
    axis.title.x = element_text(  # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)),  # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)),  # larger, bold facet labels
    panel.grid.major.x = element_blank(),  # no vertical major lines
    panel.grid.minor = element_blank(),  # no minor grid lines
    panel.spacing.x = unit(20, "pt"),  # increase white space between panels
    panel.spacing.y = unit(10, "pt"),  # increase white space between panels
    plot.margin = margin(15, 15, 15, 15)  # adjust white space around plot (top, right, bottom, left)
  )
# create named color vector
colors <- c(
  `0` = "#1EC98D",
  `1` = "#F7B01B",
  `2` = "#A26E7C",
  `3` = "#6681FE"
)

# scatter plot of bikes$count versus bikes$temp_feel
ggplot(bikes, aes(x = temp_feel, y = count)) +
  # add points
  geom_point(
    # color mapping only applied to points
    aes(color = season),
    # setting larger points with 50% opacity
    alpha = .5, size = 1.5
  ) +
  # add a smoothing line
  stat_smooth(  # also: geom_smooth()
    # use linear fitting + draw black smoothing lines
    method = "lm", color = "black"
  ) +
  # small multiples
  facet_grid(
    day_night ~ year,  # also: vars(day_night), vars(year)
    # free y axis range
    scales = "free_y",
    # scale heights proportionally
    space = "free_y"
  ) +
  # x axis
  scale_x_continuous(
    # add °C symbol
    labels = function(x) paste0(x, "°C"),
    # use 5°C spacing
    breaks = -1:6*5  # also: seq(-5, 30, by = 5)
  ) +
  # y axis
  scale_y_continuous(
    # add a thousand separator
    labels = scales::label_comma(),
    # use consistent spacing across rows
    breaks = 0:5*10000
  ) +
  # colors
  scale_color_manual(
    # use a custom color palette
    values = colors,
    # overwrite legend keys
    labels = c("Winter", "Spring", "Summer", "Autumn"),
    # adjust symbol size in legend
    guide = guide_legend(override.aes = list(size = 4))
  ) +
  labs(
    # overwrite axis and legend titles
    x = "Average feels-like temperature", y = NULL, color = NULL,
    # add plot title and caption
    title = "Trends of Reported Bike Rents versus Feels-Like Temperature in London",
    caption = "Data: TfL (Transport for London), Jan 2015–Dec 2016"
  ) +
  # add theme with a custom font + larger element sizes
  theme_light(
    base_size = 15, base_family = "Spline Sans"
  ) +
  # theme adjustments
  theme(
    plot.title.position = "plot",  # left-align title
    plot.caption.position = "plot",  # right-align caption
    legend.position = "top",  # place legend above plot
    plot.title = element_text(face = "bold", size = rel(1.4)),  # larger, bold title
    axis.text = element_text(family = "Spline Sans Mono"),  # monospaced font for axes
    axis.title.x = element_text(  # left-aligned, grey x axis label
      hjust = 0, color = "grey20", margin = margin(t = 12)
    ),
    legend.text = element_text(size = rel(1)),  # larger legend labels
    strip.text = element_text(face = "bold", size = rel(1.15)),  # larger, bold facet labels
    panel.grid.major.x = element_blank(),  # no vertical major lines
    panel.grid.minor = element_blank(),  # no minor grid lines
    panel.spacing.x = unit(20, "pt"),  # increase white space between panels
    panel.spacing.y = unit(10, "pt"),  # increase white space between panels
    plot.margin = margin(15, 15, 15, 15)  # adjust white space around plot (top, right, bottom, left)
  )
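The comment `# also: geom_smooth()` in the code above hints that the two functions are interchangeable. The sketch below is one way to check this, assuming the `mpg` data set from {ggplot2} instead of the bikes data; it inspects the stat and geom objects stored in each layer.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy))

p_stat <- base + stat_smooth(method = "lm", formula = y ~ x, color = "black")
p_geom <- base + geom_smooth(method = "lm", formula = y ~ x, color = "black")

# both layers combine the same statistical transformation ...
class(p_stat$layers[[1]]$stat)[1]
class(p_geom$layers[[1]]$stat)[1]

# ... with the same geometry
class(p_stat$layers[[1]]$geom)[1]
class(p_geom$layers[[1]]$geom)[1]
```

Every layer pairs a stat with a geom, so for this pair the choice between the `stat_*()` and `geom_*()` spelling is mostly a matter of which one reads better in context.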
The {ragg} package provides drop-in replacements for the default raster graphic devices:
- in `ggsave()`: `ggsave(device = agg_png)` (used by default if installed)
- in rendered documents: `dev = "ragg_png"` in the code chunk options

In code chunks, figure sizes are set with `fig-width` / `fig.width` and `fig-height` / `fig.height`.

The {camcorder} package saves the output of `ggplot()` calls and displays it in the viewer pane:

camcorder::gg_record(
  dir = here::here("temp"),  # path for plot files
  device = "png",  # device to use
  width = 10,  # figure width
  height = 5,  # figure height
  dpi = 600  # plot resolution
)

g <- ggplot(bikes, aes(x = temp, y = count, color = day_night)) +
  geom_point(alpha = .3, size = 2) +
  scale_color_manual(values = c(day = "#FFA200", night = "#757BC7")) +
  theme_minimal(base_size = 14, base_family = "Asap SemiCondensed") +
  theme(panel.grid.minor = element_blank())

g
camcorder::gg_resize_film(width = 20) # update figure width
g
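Once a size has been settled on, the same dimensions can be handed to `ggsave()` to write the final file. This is a minimal sketch under a few assumptions: it uses the `mpg` data set rather than the bikes data, writes to a temporary file with a made-up name, leaves out the custom font, and treats the width and height as inches.

```r
library(ggplot2)

# Stand-in plot; `mpg` ships with ggplot2.
g <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = .3, size = 2) +
  theme_minimal(base_size = 14) +
  theme(panel.grid.minor = element_blank())

# Hypothetical output path in the session's temporary directory.
out <- file.path(tempdir(), "scatter.png")

# Same size and resolution as in the recording setup (units assumed to be inches).
ggsave(out, plot = g, width = 10, height = 5, units = "in", dpi = 600)

file.exists(out)
```

Fixing width, height, and resolution in code keeps text sizes and proportions identical no matter how large the viewer pane happens to be.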
trends_monthly <- function(grp = "January") {
  bikes |>
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |>
    dplyr::filter(month %in% grp) |>
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}
trends_monthly <- function(grp = "January") {
  bikes |>
    dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = FALSE)) |>
    dplyr::filter(month %in% grp) |>
    ggplot(aes(x = temp, y = count, color = day_night)) +
    geom_point(alpha = .2, show.legend = FALSE) +
    geom_smooth(se = FALSE) +
    # keep axis ranges consistent
    scale_x_continuous(limits = range(bikes$temp)) +
    scale_y_continuous(limits = range(bikes$count)) +
    scale_color_manual(values = c("#FFA200", "#757bc7")) +
    labs(title = grp, x = "Temperature", y = "Bike shares", color = NULL)
}
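A plotting function like this pays off when it is applied to several groups at once. The sketch below shows the pattern with a hypothetical analogue, `trends_class()`, built on the `mpg` data set so that it runs without the bikes data; combining the plots with {patchwork} is optional and only happens if that package is installed.

```r
library(ggplot2)

# Hypothetical analogue of trends_monthly(), using `mpg` (ships with ggplot2).
trends_class <- function(grp = "compact") {
  ggplot(mpg[mpg$class %in% grp, ], aes(x = displ, y = hwy, color = drv)) +
    geom_point(alpha = .5) +
    # limits come from the full data, so every plot shares the same axes
    scale_x_continuous(limits = range(mpg$displ)) +
    scale_y_continuous(limits = range(mpg$hwy)) +
    labs(title = grp, x = "Displacement", y = "Highway mpg", color = NULL)
}

classes <- c("compact", "midsize", "suv")

# one plot per group, stored in a named list
plots <- lapply(classes, trends_class)
names(plots) <- classes

plots$suv

# optional: arrange all plots in one figure if {patchwork} is installed
if (requireNamespace("patchwork", quietly = TRUE)) {
  patchwork::wrap_plots(plots, nrow = 1)
}
```

Because the axis limits are taken from the full data set rather than the subset, the individual plots stay directly comparable.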
plot_density <- function(data, var, grp = "") {
  ggplot(data, aes(x = !!sym(var))) +
    geom_density(aes(fill = !!sym(grp)), position = "identity",
                 color = "grey30", alpha = .3) +
    coord_cartesian(expand = FALSE, clip = "off") +
    scale_y_continuous(labels = scales::label_number()) +
    scale_fill_brewer(palette = "Dark2", name = NULL) +
    theme(legend.position = "top")
}
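Column names passed as strings can also be looked up with the `.data` pronoun instead of `!!sym()`. The following is a sketch of that alternative, not the author's function: the name `plot_density2()`, the `NULL` default for `grp`, and the use of the `mpg` data set are assumptions made for this illustration.

```r
library(ggplot2)

# Hypothetical variant of plot_density() that uses the .data pronoun.
plot_density2 <- function(data, var, grp = NULL) {
  p <- ggplot(data, aes(x = .data[[var]]))

  if (is.null(grp)) {
    # no grouping variable: a single density with a constant fill
    p <- p + geom_density(color = "grey30", fill = "grey60", alpha = .3)
  } else {
    # grouping variable given as a string: map it to fill
    p <- p + geom_density(aes(fill = .data[[grp]]), position = "identity",
                          color = "grey30", alpha = .3)
  }

  p +
    scale_fill_brewer(palette = "Dark2", name = NULL) +
    labs(x = var) +
    theme(legend.position = "top")
}

p_all <- plot_density2(mpg, "hwy")
p_drv <- plot_density2(mpg, "hwy", grp = "drv")

p_drv
```

Handling the ungrouped case explicitly makes the default behavior visible in the function body, at the cost of a few more lines.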
- {geofacet}: tile grid maps
- {ggalluvial}: alluvial plots
- {ggalt}: dumbbell, horizon, and lollipop charts, splines, …
- {ggbeeswarm}: beeswarm plots and variants
- {ggbraid}: ribbons for alternating groups
- {ggbump}: bump charts
- {ggdensity}: improved density plots
- {ggdist}: uncertainty visualizations
- {ggforce}: several interesting layers such as parallel sets, pie charts, geometries, splines, voronoi, … (and more)
- {ggpattern}: pattern fills for layers
- {ggpointdensity}: density gradients for scatter plots
- {ggraph}: networks, graphs & trees
- {ggridges}: ridgeline plots
- {ggsankey}: Sankey diagrams
- {ggsignif}: significance levels
- {ggstar}: more point shapes
- {ggstream}: stream graphs
- {ggupset}: UpSet plots
- {treemapify}: treemaps

- {cowplot}: combine ggplots
- {ggannotate}: point-n-click annotations
- {ggblend}: blend, compose, adjust layers
- {ggfittext}: scale text according to space
- {ggfx}: shaders and filters for layers
- {ggh4x}: facets, positions, and more
- {ggtext}: text rendering for theme elements + text layers
- {lemon}: axis lines (and a few layers)
- {patchwork}: combine ggplots
- {scales}: control scales

- {ggdark}
- {ggsci} (also color scales)
- {ggtech} (also color scales)
- {ggthemes} (also color scales)
- {ggthemr}
- {hrbrthemes} (also color scales)
- {tvthemes} (also color scales)

- {ggiraph}
- {plotly}
- {echarts4r} *
- {highcharter} *
- {charter} *
- {streamgraph} *
- {tmap} *
- {leaflet} *
- {globe4r} *
- {grapher} *

* not using ggplot2
- … `data`, `aes`thetics and a layer (usually a `geom_*` or `stat_*`)
- … `scale_*`, `coord_*`, `facet_*` and `theme_*` / `theme`
- … {camcorder} package
- … {patchwork}
- … `geom_line()` and `geom_path()`?
- … `geom_smooth()` and `stat_smooth()` interchangeably?

Cédric Scherer // posit::conf(2023)