install.packages("devtools")
install.packages("DT")
install.packages("fivethirtyeight")
install.packages("gapminder")
install.packages("ghclass")
install.packages("nycflights13")
install.packages("openintro")
install.packages("rvest")
install.packages("shiny")
install.packages("ggthemes")
install.packages("tidyverse")
install.packages("tidymodels")
install.packages("unvotes")
About
Overview
There has been significant innovation in introductory statistics and data science courses to equip students with the statistical, computing, and communication skills needed for modern data analysis. Success in data science and statistics is dependent on the development of both analytical and computational skills, and the demand for educators who are proficient at teaching both these skills is growing. The goal of this masterclass is to equip educators with concrete information on content, workflows, and infrastructure for painlessly introducing modern computation with R and RStudio within a data science curriculum. In a nutshell, the day you’ll spend in this workshop will save you endless hours of solo work designing and setting up your course.
Topics will cover teaching the tidyverse in 2023, highlighting updates to R for Data Science (2nd ed) and Data Science in a Box as well as present tooling options and workflows for reproducible authoring, computing infrastructure, version control, and collaboration.
The workshop will be comprised of four modules:
Teaching data science with the tidyverse and Quarto
Teaching data science with Git and GitHub
Organizing, publishing, and sharing of course materials
Computing infrastructure for teaching data science
Throughout each module we’ll shift between the student perspective and the instructor perspective. The activities and demos will be hands-on; attendees will also have the opportunity to exchange ideas and ask questions throughout the session.
In addition to gaining technical knowledge, participants will engage in discussion around the decisions that go into developing a data science curriculum and choosing workflows and infrastructure that best support the curriculum and allow for scalability. We will also discuss best practices for configuring and deploying classroom infrastructures to support these tools.
Is this course for me?
This masterclass is aimed primarily at participants teaching data science in an academic setting in semester-long courses, however much of the information and tooling we introduce is applicable for shorter teaching experiences like workshops and bootcamps as well. Basic knowledge of R is assumed and familiarity with the tidyverse and Git is preferred.
This course is for you if you:
you want to learn / discuss curriculum, pedagogy, and computing infrastructure design for teaching data science with R and RStudio using the tidyverse and Quarto
you are interested in setting up your class in Posit Cloud
you want to integrate version control with git into your teaching and learn about tools and best practices for running your course on GitHub
Prework
In this workshop you will wear two hats: the educator and the learner. At times we will be demoing workflows for instructors (whom I assume are familiar with R, RStudio, and the tidyverse and have taught with or are interested in teaching with Posit Cloud, Git, and GitHub) and at other times you will be working through the student view (on Posit Cloud, assuming you’re not using your local setup).
Watch
For the latter, you can mostly assume that you’re a student in an intro data science course and this is the first day of class (i.e. there’s no expectation of prep). However we’d also like to give you a taste of how the remainder of the semester goes, which often entails students watching background videos to prepare. To prepare for the first module, please watch the following video.
🖥️ Slides
Set up
For the former, you’ll might need to prepare a bit more. If you already are a Git and GitHub user, have a personal token saved, and can push to and pull from GitHub without having to type your password in each time, you’re good to go. If not, please complete the following steps before the workshop1. If you need help with any of the steps, please ask on GitHub Discussions.
Install
If you choose to use your local R and RStudio install for the workshop, you should also install the following packages:
Posit Cloud
You can join the Posit Cloud workspace for this workshop at pos.it/teach-ds-conf23-cloud.
Instructor
Dr. Mine Çetinkaya-Rundel (she/her) is Professor of the Practice at Duke University and Developer Educator at Posit. Mine’s work focuses on innovation in statistics and data science pedagogy, with an emphasis on computing, reproducible research, student-centered learning, and open-source education as well as pedagogical approaches for enhancing retention of women and under-represented minorities in STEM. Mine works on integrating computation into the undergraduate statistics curriculum, using reproducible research methodologies and analysis of real and complex datasets. Mine works on the OpenIntro project, whose mission is to make educational products that are free, transparent, and lower barriers to education. As part of this project she co-authored four open-source introductory statistics textbooks. She is also the creator and maintainer of datasciencebox.org, co-author of R for Data Science, and she teaches the popular Statistics with R MOOC on Coursera. Mine is a Fellow of the ASA and Elected Member of the ISI as well as the winner of the 2021 Robert V. Hogg Award for For Excellence in Teaching Introductory Statistics.
Footnotes
These steps will direct you to relevant chapters from Happy Git with R by Jenny Bryan et. al. It’s such a great and comprehensive resource that I have not felt the need to replicate the information elsewhere.↩︎